quest software tools for uncertainty propagation … software tools for uncertainty propagation and...

1
QUEST Software Tools for Uncertainty Propagation and Inference Michael Eldred 1 , David Higdon 2 , Ernesto Prudencio 3 , Bert Debusschere 1 1 Sandia National Laboratories, 2 Los Alamos National Laboratory, 3 The University of Texas at Austin Support for this work was provided through the Scientific Discovery through Advanced Computing (SciDAC) project funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. www.quest-scidac.org UQ in Chemical Systems DAKOTA (SNL) (dakota.sandia.gov) is a C++ application that provides a variety of non-intrusive algorithms for design optimization, model calibration, uncertainty quantification, global sensitivity analysis, parameter studies, and solution verification. It can be used as either a stand-alone application or as a set of library services, and supports multiple levels of parallelism for scalability on both capability and capacity HPC resources. • Contact: [email protected] DAKOTA Input File Strategy Method Model: Variables, Interface, Responses DAKOTA Output Files Raw data (all x- and f-values) Sensitivity info Statistics on f-values Optimality info CALORE thermal analysis ALEGRA shock physics SALINAS structural dynam Premo high speed flow (your code here) Code Input Code Output DAKOTA Parameters File {x1 = 123.4} {x2 = -33.3}, etc. Use APREPRO/DPREPRO to cut-and-paste x-values into code input file User-supplied automatic post-processing of code output data into f-values DAKOTA executes sim_code_script to launch a simulation job DAKOTA Results File 999.888 f1 777.666 f2, etc. DAKOTA Executable Sensitivity Analysis, Optimization, Uncertainty Quantification, Parameter Estimation GPMSA (LANL) (www.stat.lanl.gov/source/orgs/ccs/ccs6/gpmsa/gpmsa.html ) is a MATLAB toolbox that focuses on Bayesian inference using a Gaussian process response surface, trained from an ensemble of forward model runs, to minimize the number of forward model calls required in the inference. It allows for global sensitivity analysis, forward propagation of uncertainty, model calibration/parameter estimation, and predictions with uncertainty. Contact: David Higdon, [email protected] Dave Higdon, Jim Gattiker, Brian Williams, Maria Rightley, “Computer Model Calibration Using High Dimensional Output,” Journal of the American Statistical Association, Volume 103, Issue 482, 2008, pp. 570-583. Inputs x controllable t uncertain physics best, unknown value of t y(x) = (x) + (x) (x) = (x, ) + (x) field data reality observation error computer model discrepancy Basic steps in calibration analysis: 1. Assume initial probability dist’n for physics uncertainties . 2. Calibrate parameters to field data and simultaneously infer model discrepancies. Code predictions based on initial uncertainty in Code predictions based on calibrated Predictions of discrepancy Predictions of reality prior calibrated UQTk (SNL) (www.sandia.gov/UQToolkit): A library of C++ and Matlab functions for propagation of uncertainty through computational models Mainly relies on spectral Polynomial Chaos Expansions (PCEs) for representing random variables and stochastic processes Complementary to production tools, UQTk targets: Rapid prototyping Algorithmic research Outreach: Tutorials / Educational Version 1.0 released under the GNU LGPL Utilities for intrusive stochastic Galerkin New release in Fall 2012 Nonintrusive quadrature-based methods Bayesian inference Matlab version Source code, tutorials, examples available on website Contact: Bert Debusschere: [email protected] A typical application software stack. QUESO requires the input of a likelihood routine for statistical inverse problems and of a QoI routine for statistical forward problems. These application level routines provide the bridge among the statistical algorithms in QUESO, the model knowledge in the model library, and relevant model specific data available for calibration or validation. QUESO (UT) is an MPI/C++ library that provides statistical algorithms for Bayesian inference, model calibration, model validation, and decision making under uncertainty. It naturally maps, into C++ classes, the mathematical entities present in stochastic problems and solution methods, thus enabling the easy integration of new algorithms. Contact: Ernesto Prudencio, [email protected] E. E. Prudencio and K. W. Schulz. The parallel C++ statistical library QUESO: Quantification of Uncertainty for Estimation, Simulation and Optimization. In M. Alexander et al., editors, Euro-Par 2011 Workshops, Part I, volume 7155 of Lecture Notes in Computer Science, pages 398-407. Springer-Verlag, Berlin Heidelberg, 2012. Process of quantifying the effect of uncertainties typically includes: (Global) sensitivity analysis: identification of input set with greatest influence on output QoIs Uncertainty characterization: fit or infer from observable data; parametric/non-parametric/KDE Uncertainty propagation: input distributions output QoI distributions Decision making: model validation, prediction (interpolation, extrapolation), design QUEST software tools support a range of: UQ studies: sensitivity analysis, uncertainty propagation, statistical inference Environments: rapid prototyping in production computing in compiled interpreted languages languages on parallel platforms Intrusion: embedded linked black box An interoperable set of tools that can be tailored: GPMSA C++ version in QUESO; DAKOTA + QUESO + emulators Production deployment of stable capabilities in frameworks Close collaboration of SAPs with library developers for custom capabilities Iterator Model Strategy: control of multiple iterators and models Iterator Model Iterator Model Coordination: Nested Layered Cascaded Concurrent Adaptive/Interactive Parallelism: Asynch local Message passing Hybrid Nested scheduling Master-slave/dynamic Peer/static Parameters Model: Design continuous discrete Uncertain normal/logn uniform/logu triangular exp/beta/gamma EV I, II, III histogram interval State continuous discrete Application system fork direct grid Approximation global polynomial 1/2/3, NN, kriging, MARS, RBF multipoint – TANA3 local – Taylor series multifidelity ROM Functions objectives constraints least sq. terms generic Responses Interface Parameters LHS/MC Iterator Optimizer ParamStudy COLINY NPSOL DOT OPT++ LeastSq DACE GN Vector MultiD List DDACE CCD/BB UQ Reliability IntEst/Evid JEGA CONMIN NLSSOL NL2SOL QMC/CVT Gradients numerical analytic Hessians numerical analytic quasi NLPQL Center PCE/SC Strategy Uncertainty LeastSq Hybrid SurrBased OptUnderUnc Branch&Bound/PICO Optimization 2ndOrderProb UncOfOptima Pareto/MStart ModelCalUnderUnc Dropping a bowling ball from a tower The time it takes a bowling to fall from different heights is recorded for drops of 10, 20, …, 50 meters – a validation experiment dropping a ball from 60m is also conducted. The uncertainty in the measured drop times is normally distributed about the true time, with a standard deviation of .1 seconds. The QOI is the drop time for the bowling ball at a height of 100m. Since the tower is only 60m high, a computational model is used to help make this assessment. The conceptual model incorporates only acceleration due to gravity g, allowing the computational solution used here to be compared to an analytical solution for a verification assessment of bowling ball drop times between 10 and 100m. The physical constant g is assumed to be unknown, but between 8 and 12 m/s 2 (light lines). The five drop time measurements (black dots) constrain the uncertainty about g to the probability density given by the dark line. The drop times are produced by 11 computational model runs (light lines in (c) and (d)), each at different values of g. GPMSA combines these simulations with experimental data to constrain g (dark lines in (c)) and to produce predictions with uncertainty for drops from heights above 50m (dark lines (d)). Initial and constrained uncertainty for g (a) (d) (c) (b) Specifically, predictions with uncertainty for a validation experiment – a drop of the bowling ball can be produced with GPMSA, as well as for a drop of the bowling ball at 100m. (e) After calibrating the model, which only accounts for acceleration due to gravity g, we find that the model does not accurately predict drop times for the basketball and baseball. Because of this, a model discrepancy term is added to the GPMSA analysis. Simple case study: dropping balls from a tower Can get field data from tossing objects off of floors 1-6. Have computa onal model which predicts drop mes given ball radius, density, and flo r. Computa onal model has parameter for air fric on which depends on cross sec on, density and velocity. Have baseball, basketball, golf ball, tennis, light & heavy bowling balls. Want to predict so ball drop me from 10 th flo o r (100m). Also want to understand the value of various types of poten al experiments & simula ons for the so ball predic on at 100m. Slide 1 radius density physics design space golf baseball tennis so ball basketball light bowling bowling Figure (d) shows predictions using the discrepancy adjusted model described above. The added uncertainty is due to uncertainty in both g and α in this model. (a) (c) (b) Using a model discrepancy term to predict drop times (d) A discrepancy adjusted prediction is produced by adjusting the simulated drop times according to: drop time = simulated drop time + α × drop height where α depends on the radius and density of the ball (R ball ball ). The model produces an estimate for α that increases as ball density decreases (Figure (c)). This is accomplished by adding a linear discrepancy basis term into the GPMSA formulation. JAGUAR Adams, B.M., Bohnhoff, W.J., Dalbey, K.R., Eddy, J.P., Eldred, M.S., Gay, D.M., Haskell, K., Hough, P.D., and Swiler, L.P., "DAKOTA, A Multilevel Parallel Object-Oriented Framework for Design Optimization, Parameter Estimation, Uncertainty Quantification, and Sensitivity Analysis: Version 5.2 User's Manual," Sandia Technical Report SAND2010-2183, updated Nov. 2011.

Upload: trinhphuc

Post on 26-Jun-2018

231 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: QUEST Software Tools for Uncertainty Propagation … Software Tools for Uncertainty Propagation and Inference ... Bert Debusschere1 1Sandia National Laboratories, ... Brian Williams,

QUEST Software Tools for Uncertainty Propagation and Inference Michael Eldred1, David Higdon2, Ernesto Prudencio3, Bert Debusschere1

1Sandia National Laboratories, 2Los Alamos National Laboratory, 3The University of Texas at Austin Support for this work was provided through the Scientific Discovery through Advanced Computing (SciDAC) project funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

www.quest-scidac.org

UQ in Chemical Systems

DAKOTA (SNL) (dakota.sandia.gov) is a C++ application that provides a variety of non-intrusive algorithms for design optimization, model calibration, uncertainty quantification, global sensitivity analysis, parameter studies, and solution verification. It can be used as either a stand-alone application or as a set of library services, and supports multiple levels of parallelism for scalability on both capability and capacity HPC resources.

• Contact: [email protected] DAKOTA Input File

• Strategy

• Method

• Model: Variables,

Interface, Responses

DAKOTA Output Files

• Raw data (all x- and f-values)

• Sensitivity info

• Statistics on f-values

• Optimality info

CALORE thermal analysis ALEGRA shock physics SALINAS structural dynam Premo high speed flow (your code here)

Code

Input

Code

Output

DAKOTA Parameters File {x1 = 123.4}

{x2 = -33.3}, etc.

Use APREPRO/DPREPRO

to cut-and-paste x-values

into code input file

User-supplied automatic

post-processing of code

output data into f-values

DAKOTA executes sim_code_script

to launch a

simulation job

DAKOTA Results File 999.888 f1

777.666 f2, etc.

DAKOTA Executable

Sensitivity Analysis,

Optimization, Uncertainty

Quantification, Parameter

Estimation

GPMSA (LANL) (www.stat.lanl.gov/source/orgs/ccs/ccs6/gpmsa/gpmsa.html) is a MATLAB toolbox that focuses on Bayesian inference using a Gaussian process response surface, trained from an ensemble of forward model runs, to minimize the number of forward model calls required in the inference. It allows for global sensitivity analysis, forward propagation of uncertainty, model calibration/parameter estimation, and predictions with uncertainty.

• Contact: David Higdon, [email protected]

Dave Higdon, Jim Gattiker, Brian Williams, Maria Rightley, “Computer Model Calibration Using High Dimensional Output,” Journal of the American Statistical Association, Volume 103, Issue 482, 2008, pp. 570-583.

Inputs x controllable

t uncertain physics best, unknown value of t

y(x) = (x) + (x)

(x) = (x, ) + (x)

field data reality

observation error

computer model discrepancy

Basic steps in calibration analysis: 1. Assume initial probability dist’n for physics uncertainties . 2. Calibrate parameters to field data and simultaneously infer model discrepancies.

Code predictions based on initial uncertainty in

Code predictions based on calibrated

Predictions of discrepancy Predictions of reality

prior calibrated

UQTk (SNL) (www.sandia.gov/UQToolkit): • A library of C++ and Matlab functions for propagation of

uncertainty through computational models • Mainly relies on spectral Polynomial Chaos Expansions (PCEs)

for representing random variables and stochastic processes • Complementary to production tools, UQTk targets:

• Rapid prototyping • Algorithmic research • Outreach: Tutorials / Educational

• Version 1.0 released under the GNU LGPL • Utilities for intrusive stochastic Galerkin

• New release in Fall 2012 • Nonintrusive quadrature-based methods • Bayesian inference • Matlab version

• Source code, tutorials, examples available on website • Contact: Bert Debusschere: [email protected]

A typical application software stack. QUESO requires the input of

a likelihood routine for statistical inverse problems and of a QoI

routine for statistical forward problems. These application level

routines provide the bridge among the statistical algorithms in

QUESO, the model knowledge in the model library, and relevant

model specific data available for calibration or validation.

QUESO (UT) is an MPI/C++ library that provides statistical algorithms for Bayesian inference, model calibration, model validation, and decision making under uncertainty. It naturally maps, into C++ classes, the mathematical entities present in stochastic problems and solution methods, thus enabling the easy integration of new algorithms.

• Contact: Ernesto Prudencio, [email protected]

E. E. Prudencio and K. W. Schulz. The parallel C++ statistical library QUESO: Quantification of Uncertainty for Estimation, Simulation and Optimization. In M. Alexander et al., editors, Euro-Par 2011 Workshops, Part I, volume 7155 of Lecture Notes in Computer Science, pages 398-407. Springer-Verlag, Berlin Heidelberg, 2012.

Process of quantifying the effect of uncertainties typically includes: • (Global) sensitivity analysis: identification of input set with greatest influence on output QoIs • Uncertainty characterization: fit or infer from observable data; parametric/non-parametric/KDE • Uncertainty propagation: input distributions output QoI distributions • Decision making: model validation, prediction (interpolation, extrapolation), design

QUEST software tools support a range of: • UQ studies: sensitivity analysis, uncertainty propagation, statistical inference • Environments: rapid prototyping in production computing in compiled

interpreted languages languages on parallel platforms • Intrusion: embedded linked black box

An interoperable set of tools that can be tailored: • GPMSA C++ version in QUESO; DAKOTA + QUESO + emulators • Production deployment of stable capabilities in frameworks • Close collaboration of SAPs with library developers for custom capabilities

Iterator

Model

Strategy: control of multiple iterators and models

Iterator

Model

Iterator

Model

Coordination: Nested Layered Cascaded Concurrent Adaptive/Interactive

Parallelism: Asynch local Message passing Hybrid Nested scheduling Master-slave/dynamic Peer/static

Parameters Model:

Design continuous discrete Uncertain normal/logn uniform/logu triangular exp/beta/gamma EV I, II, III histogram interval State continuous discrete

Application system fork direct grid Approximation global polynomial 1/2/3, NN, kriging, MARS, RBF multipoint – TANA3 local – Taylor series multifidelity ROM

Functions objectives constraints least sq. terms generic

Responses Interface Parameters

LHS/MC

Iterator

Optimizer ParamStudy

COLINY NPSOL DOT OPT++

LeastSq DACE GN

Vector MultiD

List

DDACE CCD/BB

UQ

Reliability

IntEst/Evid

JEGA CONMIN

NLSSOL NL2SOL QMC/CVT

Gradients numerical analytic

Hessians numerical analytic quasi NLPQL

Center PCE/SC

Strategy

Uncertainty LeastSq

Hybrid

SurrBased OptUnderUnc

Branch&Bound/PICO

Optimization

2ndOrderProb

UncOfOptima Pareto/MStart

ModelCalUnderUnc

Dropping a bowling ball from a tower The time it takes a bowling to fall from different heights is recorded for drops of 10, 20, …, 50 meters – a validation experiment dropping a ball from 60m is also conducted. The uncertainty in the measured drop times is normally distributed about the true time, with a standard deviation of .1 seconds. The QOI is the drop time for the bowling ball at a height of 100m. Since the tower is only 60m high, a computational model is used to help make this assessment.

The conceptual model incorporates only acceleration due to gravity g, allowing the computational solution used here to be compared to an analytical solution for a verification assessment of bowling ball drop times between 10 and 100m.

The physical constant g is assumed to be unknown, but between 8 and 12 m/s2 (light lines). The five drop time measurements (black dots) constrain the uncertainty about g to the probability density given by the dark line. The drop times are produced by 11 computational model runs (light lines in (c) and (d)), each at different values of g. GPMSA combines these simulations with experimental data to constrain g (dark lines in (c)) and to produce predictions with uncertainty for drops from heights above 50m (dark lines (d)).

Initial and constrained uncertainty for g

(a)

(d) (c)

(b)

Specifically, predictions with uncertainty for a validation experiment – a drop of the bowling ball can be produced with GPMSA, as well as for a drop of the bowling ball at 100m.

(e)

After calibrating the model, which only accounts for acceleration due to gravity g, we find that the model does not accurately predict drop times for the basketball and baseball. Because of this, a model discrepancy term is added to the GPMSA analysis.

Simplecasestudy:droppingballsfromatower

• Cangetfielddatafromtossingobjectsoffoffloors1-6.

• Havecomputa onalmodelwhichpredictsdrop mesgivenballradius,density,andflo

o

r .

• Computa onalmodelhasparameterforairfric onwhichdependsoncrosssec on,densityandvelocity.

• Havebaseball,basketball,golfball,tennis,light&heavybowlingballs.

• Wanttopredictso balldrop

mefrom10thfloo r (100m).

• Alsowanttounderstandthe

valueofvarioustypesofpoten alexperiments&simula onsfortheso ballpredic onat100m.

Slide1

radius

density

physicsdesignspace

golf

baseball

tennis

so ball

basketball

lightbowling

bowling

Figure (d) shows predictions using the discrepancy adjusted model described above. The added uncertainty is due to uncertainty in both g and α in this model.

(a)

(c)

(b)

Using a model discrepancy term to predict drop times

(d)

A discrepancy adjusted prediction is produced by adjusting the simulated drop times according to:

drop time = simulated drop time + α × drop height where α depends on the radius and density of the ball (Rball,ρball). The model produces an estimate for α that increases as ball density decreases (Figure (c)). This is accomplished by adding a linear discrepancy basis term into the GPMSA formulation.

JAGUAR

Adams, B.M., Bohnhoff, W.J., Dalbey, K.R., Eddy, J.P., Eldred, M.S., Gay, D.M., Haskell, K., Hough, P.D., and Swiler, L.P., "DAKOTA, A Multilevel Parallel Object-Oriented Framework for Design Optimization, Parameter Estimation, Uncertainty Quantification, and Sensitivity Analysis: Version 5.2 User's Manual," Sandia Technical Report SAND2010-2183, updated Nov. 2011.