Introduction to Computer Experiments, Part 2

Thomas Santner

Department of Statistics, The Ohio State University

Columbus, Ohio

September 11, 2013



Outline

Introduction

Conclusions



Overview

This talk will describe

• An example of a computer simulator that can be used as an experimental tool.

• The rapidly computable emulator (sometimes called a metamodel) that is ordinarily used to study the output of a complex computer simulator.

• A method for assessing (one type of) emulator prediction uncertainty.

• How emulators form the basis for (1) the construction of criteria-based experimental designs, (2) calibration methodology, and (3) extensions of the model that allow prediction when the inputs are both quantitative and qualitative.



Recall

◮ Physical Experiments: the gold standard for establishing cause-and-effect relationships

◮ (Deterministic) Computer Experiments: use a computer simulator, rather than a physical experiment, to relate inputs and outputs. In use for at least 15-20 years; most methodological developments have come in the past 10 years

◮ Stochastic Simulation Experiments: a complex physical system each of whose parts behaves in a stochastic manner but whose ensemble behavior is not understood analytically. Heavily used in Industrial Engineering and Operations Research, e.g., to compare job shop setups

◮ Combinations of the above, particularly Computer Experiments + Physical Experiments



Rationale for Conducting Computer Experiments

◮ Sometimes it is not feasible to perform a physical experiment

1. Too expensive to study directly (too many input variables, physical process is technically too difficult, ...)

2. Ethical considerations

◮ If the physical process relating the inputs to the response(s)

a. Can be described by a mathematical model relating the output, y(x), to the inputs x,

b. Numerical methods exist for solving the mathematical model,

c. The numerical methods can be implemented with computer code (in reasonable time!)

Then one can run the computer code to produce a “response” y(x) at any input x, i.e., one can conduct a computer experiment



Caveats–Computer Experiments

◮ Mathematical models: often coupled systems of PDEs

◮ Numerical methods: FE, CFD algorithms

◮ Running times: seconds to months



Example of a Simulator

• Consider dropping an object from a height h (≡ the input) and measuring the time until it hits the ground (≡ y(h)).




• (High School) Newton Physics Model relating h and y(h). In more familiar physics notation, let $s(t)$ denote the height of the object at time $t$. Then our model states $s(0) = h$ (object dropped from height $h$ at time zero) and $\left.\frac{ds(t)}{dt}\right|_{t=0} = 0$ (object is dropped, not thrown). Our simulator value at $h$ is $y^c(h) = \tau$, where $\tau$ is the solution of $s(\tau) = 0$ and $s(t)$ satisfies (the equation of motion)

$$\frac{d^2 s(t)}{dt^2} = -g.$$




• Newton Physics Model with Drag for relating h and y(h): $y^c(h) = \tau$, where $\tau$ is the solution of $s(\tau) = 0$ and $s(t)$ satisfies

$$\frac{d^2 s(t)}{dt^2} = -g + d \times \frac{ds(t)}{dt}$$

subject to $s(0) = h$ and $\left.\frac{ds(t)}{dt}\right|_{t=0} = 0$. (Note: $s(\tau, d) = 0$ if we don’t know the coefficient of drag, a familiar situation.)
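The two drop-time models above can be sketched as a tiny simulator. This is a minimal illustration, not the code used in the talk: the function names, the value g = 9.81 m/s^2, and the step size are assumptions. The equation of motion is coded with the sign convention written on the slide, so (because ds/dt is negative while the object falls) a drag that resists the fall corresponds to d < 0 in this sketch; that reading is an assumption as well.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value and units)


def drop_time_closed_form(h, g=G):
    """No-drag model: s(t) = h - g*t**2/2, so s(tau) = 0 gives tau = sqrt(2h/g)."""
    return math.sqrt(2.0 * h / g)


def drop_time_numeric(h, d=0.0, g=G, dt=1e-4):
    """Step s'' = -g + d*s' forward from s(0) = h, s'(0) = 0 until s crosses zero."""
    if h < 0:
        raise ValueError("height h must be non-negative")
    s, v, t = float(h), 0.0, 0.0
    while True:
        a = -g + d * v                        # acceleration from the equation of motion
        v_next = v + a * dt                   # explicit Euler update of the velocity
        s_next = s + 0.5 * (v + v_next) * dt  # trapezoidal update of the height
        if s_next <= 0.0:
            # Interpolate linearly inside the last step to locate s(tau) = 0.
            return t + dt * s / (s - s_next)
        s, v, t = s_next, v_next, t + dt


tau_exact = drop_time_closed_form(10.0)          # y^c(h) for the no-drag model
tau_numeric = drop_time_numeric(10.0)            # same model, solved numerically
tau_with_drag = drop_time_numeric(10.0, d=-0.5)  # hypothetical drag coefficient
```

Even in this toy case the simulator is a black box in the sense used here: the user supplies h (and possibly d) and receives a single deterministic number.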



In Sum

The output of a computer simulator relating x and y(x) can be viewed as a black-box process

x −→ Simulator −→ y(x)

(grey-box codes . . . )



Substantive Applications of Computer Simulators

• Policy Planning: the Wonderland model describes global economic and environmental scenarios. Wonderland has 41 inputs detailing population growth, economic activity in developed and undeveloped areas, etc., and an output that is a weighted measure of human development that takes into account

◮ Net output per capita (output minus environmental control costs)

◮ Death rates

◮ Annual flow of pollutants

◮ “Carrying capacity”




• Industrial Applications

1. Design of VLSI circuits

2. Design of engines and other automobile components (Fang, Li, and Sudjianto, 2005)

3. Determination of optimum operating conditions for a compression molding process

4. Design of jet engines, helicopter rotor blades

• Environmental Science: NIST codes for the temporal evolution of contained and wild fires




• Cosmology: determination of cosmological and computer model parameters.

1. Habib, Heitmann, Higdon, Nakhleh, and Williams (2006) Cosmic Calibration: Constraints from the Matter Power Spectrum and the Cosmic Microwave Background, LANL Technical Report, LA-UR-07-0056

2. Heitmann, Higdon, Nakhleh, and Habib (2006) Cosmic Calibration, LANL Technical Report, LA-UR-06-2320

• Biomechanics



Inputs to a Computer Experiment

• Types of Inputs: $x = (x_d, x_e, x_c, x_t)$

◮ $x_d$ ≡ engineering design (manufacturing, treatment, control) variables

◮ $x_e$ ≡ noise (field, environmental) input variables

◮ $x_c$ ≡ calibration (model) variables: if observational or experimental data are available in addition to code outputs, then the $x_c$ inputs are those code inputs whose values in the observed data are unknown (e.g., friction, rates of metabolism, rate of expansion of the galaxy)

◮ $x_t$ ≡ tuning parameters, which are present only in the computer code; they are used to make the bias in the computer output as small as possible.



Caveats–Inputs

• Usually only some of the $x_d$, $x_e$, $x_c$, $x_t$ types are present in applications

• Target Field Conditions: $X_e \sim \pi_e(\cdot)$ may be given or can be solicited

• Prior Information Regarding calibration parameters: $X_c \sim \pi_c(\cdot)$ may be known



An Example from Biomechanics

In his Cornell PhD thesis, Kevin Ong conducted an uncertainty analysis of the effects of Engineering Cup design, Surgical, and Patient variables on the Stability of Uncemented Acetabular Components



An Example from Biomechanics (cont)



Example-Inputs



Output of a Computer Experiment

The output of a Computer Experiment has the following features

• y(x) is deterministic

• y(x) may be biased for the physical relationship that it is supposed to describe (incomplete physics, numerical issues)

• In practice

◮ Real-valued y(x)

◮ Multivariate (y1(x), . . . , yk(x))

◮ Functional (t, y(t, x))



Biomechanics Example (cont)

• Three outputs related to the amount of material that will eventually accumulate behind the acetabular cup

1. Total contact surface area

2. Rim contact surface area

3. Change in the bone-implant gap volume.



Aspects in Designing Computer Experiments

• Our interest is in settings where

1. Few computer runs are possible because the codes are complex, e.g., fine-grid FEA codes

2. High-dimensional input x

• Traditional DoE principles are irrelevant: there are no nuisance or unrecognized factors.

• Sometimes output from an associated physical experiment is also available, but in many cases

1. Physical experiments are available only for components of the ensemble process, e.g., code that emulates an auto crash test.

2. Only experiments that approximate reality are available, e.g., the Instron-Stanmore knee simulator

3. Only observational data are available, e.g., in Cosmology, only SDSS data



Biomechanics Example (cont)



Conceptualizing Experimental Output

• Contrasting output from a physical experiment (or observational study) with output from a computer experiment

• Output obtained from a physical experiment is a noisy measurement of the true input-output relationship, i.e.,

$$y^P(x) = \mu^T(x) + \epsilon(x)$$

1. $x \longrightarrow \mu^T(x)$ ≡ true input-output relationship

2. $\{\epsilon(x)\}_x$ ≡ measurement error (often modeled as i.i.d. $N(0, \sigma^2_\epsilon)$, a.k.a. white noise)



Conceptualizing Computer Experiments

• Output from a computer experiment is a possibly biased description of the true input-output relationship (inadequate physics, biology, ...)

$$\delta(x) \equiv y^c(x) - \mu^T(x) \quad \text{or} \quad y^c(x) = \mu^T(x) + \delta(x)$$

where

1. $\delta(x)$ ≡ computer model bias

2. $\mu^T(x)$ is the true input-output relationship
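The contrast between a noisy physical measurement and a deterministic but biased code output can be sketched numerically. Everything below is illustrative: the "true" relationship is a hypothetical stand-in (the no-drag drop time inflated by 5%), and the noise level and function names are assumptions, not quantities from the talk.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed)


def mu_true(h):
    """Hypothetical true relationship mu^T(h): no-drag drop time inflated by 5%."""
    return 1.05 * np.sqrt(2.0 * np.asarray(h, dtype=float) / G)


def y_computer(h):
    """Deterministic simulator output y^c(h) from the no-drag physics."""
    return np.sqrt(2.0 * np.asarray(h, dtype=float) / G)


def y_physical(h, sigma_eps, rng):
    """Noisy measurement y^P(h) = mu^T(h) + eps with eps ~ i.i.d. N(0, sigma_eps^2)."""
    h = np.asarray(h, dtype=float)
    return mu_true(h) + rng.normal(0.0, sigma_eps, size=h.shape)


rng = np.random.default_rng(0)
heights = np.linspace(1.0, 20.0, 5)

delta = y_computer(heights) - mu_true(heights)  # computer model bias delta(h)
code_run_1 = y_computer(heights)
code_run_2 = y_computer(heights)                # rerunning the code changes nothing
phys_run_1 = y_physical(heights, 0.02, rng)
phys_run_2 = y_physical(heights, 0.02, rng)     # replicate measurements differ
```

Replicating a code run adds no information, which is why replication, a staple of physical experiments, plays no role in designing a deterministic computer experiment.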



A Classification of Problems

• Interpolation/Prediction: Given output of the computer code at a set of training inputs,

$$(x^t_1, y^c(x^t_1)), \ldots, (x^t_m, y^c(x^t_m)),$$

predict $y^c(\cdot)$ at a new input $x_0$. An extended version of this objective is to predict $\mu^T(x)$ based on training data from the computer simulator and an associated physical experiment.

• Assess Prediction Accuracy: Using data from both a physical experiment and a (calibrated) computer experiment, give uncertainty bounds for the predicted value of $y^c(x)$ or $\mu^T(x)$ of an associated physical system.




• Experimental design: Determine a set of inputs at which to carry out the sequence of code runs. (A “good” design of a physical or computer experiment depends on the scientific objective of the research.)

◮ Exploratory Designs (geometric “space-filling”)

◮ Designs that yield good overall prediction

◮ Designs to find optimal inputs (find $x_d^{opt} \equiv \arg\min y(x)$)
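One common way to build an exploratory, space-filling design is a Latin hypercube chosen to maximize the minimum distance between points. The slides do not name a specific construction, so the sketch below is an assumed example: a crude random search over Latin hypercubes on the unit cube, with hypothetical function names and an arbitrary number of candidates.

```python
import numpy as np


def latin_hypercube(n, k, rng):
    """n points in [0, 1]^k with exactly one point in each of n equal bins per input."""
    design = np.empty((n, k))
    for j in range(k):
        bins = rng.permutation(n)                   # which bin each run occupies
        design[:, j] = (bins + rng.random(n)) / n   # jitter uniformly inside the bin
    return design


def min_interpoint_distance(design):
    """Smallest Euclidean distance between any two design points."""
    diff = design[:, None, :] - design[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    upper = np.triu_indices(len(design), k=1)       # each pair counted once
    return dist[upper].min()


def maximin_lhd(n, k, n_candidates=200, seed=0):
    """Keep the candidate Latin hypercube with the largest minimum distance."""
    rng = np.random.default_rng(seed)
    best, best_dist = None, -1.0
    for _ in range(n_candidates):
        candidate = latin_hypercube(n, k, rng)
        d_min = min_interpoint_distance(candidate)
        if d_min > best_dist:
            best, best_dist = candidate, d_min
    return best, best_dist


design, d_min = maximin_lhd(n=10, k=2)  # 10 code runs over 2 inputs scaled to [0, 1]
```

Each row of `design` would be rescaled to the physical input ranges before running the code. Designs aimed at good overall prediction or at optimization use other criteria and are not covered by this sketch.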




• Uncertainty/Output Analysis: Determine the distribution of the random variable $y^c(x_d, X_e)$, i.e., determine the variability in the performance measure $y^c(\cdot)$ for design $x_d$ when applied to the population defined by the distribution of $X_e$, e.g., patient-specific variables (patient weight or bone material properties) or surgeon-specific variables (measuring surgical skill)

• Calibration: Given outputs from a computer simulator $y^c(x_d, x_c)$, where the information about $x_c$ is described by the (prior) distribution $\pi_c(x_c)$, and also from a physical experiment $y^P(x_d)$, refine the prior to a posterior distribution for $x_c$.
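Both problems above can be sketched with a toy code. The simulator, the distribution taken for the environmental input, the uniform prior, the noise level, and the synthetic "physical" data are all assumptions made for illustration; in particular the calibration step assumes the code has no bias, which real applications rarely allow.

```python
import numpy as np


def toy_simulator(x_d, z):
    """Hypothetical stand-in for y^c: scales the design input x_d by a second input z."""
    return z * x_d


rng = np.random.default_rng(0)

# Uncertainty analysis: push draws of X_e through the code at a fixed design x_d.
x_d_fixed = 2.0
x_e_draws = rng.normal(loc=1.0, scale=0.2, size=100_000)  # assumed pi_e = N(1, 0.2^2)
y_draws = toy_simulator(x_d_fixed, x_e_draws)
y_mean, y_sd = y_draws.mean(), y_draws.std()
y_q05, y_q95 = np.quantile(y_draws, [0.05, 0.95])

# Calibration: posterior for x_c on a grid, given noisy physical observations.
x_c_true, sigma_eps = 0.7, 0.1          # used only to generate synthetic data
x_d_obs = np.linspace(1.0, 5.0, 20)
y_phys = toy_simulator(x_d_obs, x_c_true) + rng.normal(0.0, sigma_eps, x_d_obs.size)

x_c_grid = np.linspace(0.0, 2.0, 401)
log_prior = np.zeros_like(x_c_grid)     # assumed uniform prior pi_c on [0, 2]
resid = y_phys[None, :] - toy_simulator(x_d_obs[None, :], x_c_grid[:, None])
log_lik = -0.5 * np.sum(resid**2, axis=1) / sigma_eps**2  # Gaussian measurement error
log_post = log_prior + log_lik
post = np.exp(log_post - log_post.max())  # subtract the max to avoid underflow
post /= post.sum()                        # normalize over the grid
x_c_post_mean = np.sum(x_c_grid * post)
```

With an expensive code, the direct calls to `toy_simulator` would typically be replaced by calls to a fast emulator, since both steps need many evaluations.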




• Sensitivity Analysis: Determine the important (unimportant) input variables, i.e., determine those $x_i$ of $x = (x_1, \ldots, x_d)$ to which $y^c(x)$ (or $\mu^T(x)$) is most (least) sensitive.

Philosophy: Inputs that have relatively little effect on the output can be set to some nominal value; additional investigation can be restricted to determining how the output depends on the active inputs

• Set Tuning Parameters for the computer code (FEA mesh density, discretization of continuous functional inputs, solution tolerances, ...)
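One way to make the sensitivity analysis above concrete is a variance-based main-effect index: the fraction of the output variance explained by each input alone. The slides do not specify a method, so this is an assumed choice, and the test function and grid size are hypothetical. The sketch evaluates the function on a full grid, which is only affordable for a cheap function or a fast emulator.

```python
import numpy as np


def toy_simulator(x1, x2, x3):
    """Hypothetical code output: x1 is very active, x2 mildly active, x3 nearly inert."""
    return 4.0 * x1 + x2**2 + 0.01 * x3


def main_effect_indices(f, k, m=20):
    """Estimate Var(E[y | x_i]) / Var(y) for independent uniform inputs on [0, 1]^k."""
    grid_1d = (np.arange(m) + 0.5) / m                   # midpoints of m equal bins
    mesh = np.meshgrid(*([grid_1d] * k), indexing="ij")  # full m^k factorial grid
    y = f(*mesh)
    total_var = y.var()
    indices = []
    for i in range(k):
        other_axes = tuple(j for j in range(k) if j != i)
        main_effect = y.mean(axis=other_axes)            # average out the other inputs
        indices.append(main_effect.var() / total_var)
    return np.array(indices)


S = main_effect_indices(toy_simulator, k=3)
```

For this additive test function the indices sum to one and the index for x1 is analytically 0.9375 under continuous uniform inputs, so x3 could be fixed at a nominal value with little loss, in line with the philosophy stated above.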



Biomechanics Example (cont)-Sensitivity Analysis






Summary and Discussion

• Many of the Problems 1-7 have “natural” solutions obtained by approximating y(x) by a fast (i.e., linear in the training data) predictor, a metamodel

• Statisticians use a Bayesian approach to produce fast predictors for “smooth” computer codes, based on a stationary Gaussian stochastic process model (or more complex non-stationary models, e.g., TGP, CGM, ...). The posterior distribution of the process gives both a predictor and an error estimate due to model uncertainty.

• In addition to predictions of the code at “new” locations, stationary Gaussian stochastic processes can be used to produce experimental designs for criteria-based objectives, and to perform calibration and tuning
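A stationary Gaussian process predictor of the kind described above can be sketched in a few lines. This is a simplified plug-in version, not the full Bayesian treatment: the Gaussian correlation function, the fixed values of `theta` and `sigma2`, the small nugget, and the constant mean are all assumptions, and the variance formula ignores the uncertainty in the estimated mean.

```python
import numpy as np


def gauss_corr(a, b, theta):
    """Gaussian correlation exp(-theta * (a_i - b_j)^2) between 1-D input arrays."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)


def gp_predict(x_train, y_train, x_new, theta=10.0, sigma2=1.0, nugget=1e-10):
    """Predictive mean and variance of a constant-mean GP with fixed theta and sigma2."""
    n = len(x_train)
    R = gauss_corr(x_train, x_train, theta) + nugget * np.eye(n)  # nugget for stability
    r = gauss_corr(x_new, x_train, theta)                         # shape (n_new, n)
    ones = np.ones(n)
    beta_hat = (ones @ np.linalg.solve(R, y_train)) / (ones @ np.linalg.solve(R, ones))
    weights = np.linalg.solve(R, y_train - beta_hat)
    mean = beta_hat + r @ weights               # linear in the training outputs
    Rinv_rT = np.linalg.solve(R, r.T)
    var = sigma2 * (1.0 - np.sum(r * Rinv_rT.T, axis=1))
    return mean, np.clip(var, 0.0, None)        # clip tiny negative rounding errors


def code(x):
    """Cheap stand-in for an expensive code: no-drag drop time from 5 + 15x meters."""
    return np.sqrt(2.0 * (5.0 + 15.0 * x) / 9.81)


x_train = np.linspace(0.0, 1.0, 7)              # 7 training runs, input scaled to [0, 1]
y_train = code(x_train)
x_new = np.array([0.08, 0.42, 0.9])
mean_new, var_new = gp_predict(x_train, y_train, x_new)
mean_at_train, var_at_train = gp_predict(x_train, y_train, x_train)
```

The predictor reproduces the training outputs, as is appropriate for a deterministic code, and the predictive variance grows away from the training inputs. In practice `theta` and `sigma2` would be estimated from the data or given priors rather than fixed.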



Questions?? Discussion

