

Theory and Methodology

A methodology for fitting and validating metamodels in simulation

Jack P.C. Kleijnen a,*, Robert G. Sargent b

a Department of Information Systems (BIK)/Center for Economic Research, Tilburg University (KUB), 5000 LE Tilburg, The Netherlands
b Simulation Research Group, Department of Electrical Engineering and Computer Science, 439 Link Hall, Syracuse University, Syracuse, NY 13244, USA

Received 1 December 1997; accepted 1 October 1998

Abstract

This paper proposes a methodology that replaces the usual ad hoc approach to metamodeling. This methodology considers validation of a metamodel with respect to both the underlying simulation model and the problem entity. It distinguishes between fitting and validating a metamodel, and covers four types of goal: (i) understanding, (ii) prediction, (iii) optimization, and (iv) verification and validation. The methodology consists of a metamodeling process with 10 steps. This process includes classic design of experiments (DOE) and measuring fit through standard measures such as R-square and cross-validation statistics. The paper extends this DOE to stagewise DOE, and discusses several validation criteria, measures, and estimators. The methodology covers metamodels in general (including neural networks); it also gives a specific procedure for developing linear regression (including polynomial) metamodels for random simulation. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Simulation; Approximation; Response surface; Modeling; Regression

1. Introduction

Although metamodels have been quite frequently studied in the simulation literature and applied in simulation practice, this has been done in an ad hoc way. Therefore in this paper we offer a methodology for developing metamodels. Our methodology has the following novel characteristics: (i) We strictly distinguish between fitting and validating a metamodel. (ii) We emphasize the role of the problem entity besides the metamodel and the simulation model. (iii) We assume that any model should be developed for a specific goal; for metamodels we distinguish four types of goal. Guided by these three characteristics, we derive a metamodeling process with 10 steps.


We also give a detailed procedure for validating metamodels that use linear regression analysis in random simulation; we feel that this procedure is suited to all four goals of metamodeling.

Our methodology is meant for both researchers and practitioners. We assume that those readers have a basic knowledge of simulation and mathematical statistics, especially regression analysis. The characteristics of our methodology are detailed as follows.

We distinguish the relationships among metamodel, simulation model, and problem entity that are displayed in Fig. 1, where w.r.t. stands for 'with respect to'. A problem entity is some system (real or proposed), idea, situation, policy, or phenomenon that is being modeled. A simulation model is a causal model of some problem entity; this model may be deterministic or stochastic. A metamodel is an approximation of the input/output (I/O) transformation that is implied by the simulation model; the resulting black-box model is also known as a response surface. There are different types of metamodels; for example, polynomial regression models (which are a type of linear regression), splines (which partition the domain of applicability into subdomains and fit simple regression models to each of the subdomains), and neural networks (a type of non-linear regression); see Barton (1993, 1994), Friedman (1996), Huber et al. (1996), Kleijnen (1999), Pierreval (1996), Yu and Popplewell (1994).

Validation is the 'substantiation that a model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model'; see Sargent (1996) and Schlesinger et al. (1979). Fig. 1 shows that validation of a metamodel relates to both the simulation model and the problem entity. To validate a metamodel, we must know the accuracy required of the metamodel; this depends on the goal of the metamodel. We do not discuss validation of the underlying simulation model, except when discussing the use of a metamodel to aid in this validation (for validation of simulation models see Beck et al., 1997; Kleijnen, 1995b; Sargent, 1996; and http://manta.cs.vt.edu/biblio/). So we assume that the simulation model under consideration is valid, unless stated otherwise.

Fitting applies mathematical and statistical techniques to a set of simulation I/O data, to (i) estimate the metamodel's parameter values, and (ii) evaluate the estimated parameter values with respect to the data set, using quantitative criteria. Prior to fitting a metamodel, we must specify the type and form (or class) of the metamodel. Whereas validation requires (domain) knowledge about the problem entity and the specified accuracy required of the metamodel, fitting concerns only the determination of a 'good' set of parameter values from the data set. So fitting is not concerned with whether the metamodel adequately describes the problem entity and the simulation model, given the goal of the metamodel.

For metamodels we identify four general goals: (i) understanding the problem entity, (ii) predicting values of the output or response variable, (iii) performing optimization, and (iv) aiding the verification and validation (V & V) of a simulation model. These goals need attention when specifying the type and form of metamodel to be fitted and when determining the validity of the metamodel.

The study of a problem entity usually distinguishes several outputs. Hence, the corresponding simulation also has multiple response variables. Current practice, however, uses metamodels with a single output: a separate metamodel is developed for each output.

Fig. 1. Metamodel, simulation model, and problem entity.


Indeed we develop a methodology for a single output; this methodology can be applied to each of the single-response metamodels.

We use different symbols for the I/O data of problem entity, simulation, and metamodel: see Fig. 2. We shall return to the individual symbols.

We organize the remainder of this paper as follows. Section 2 covers our methodology for developing metamodels, consisting of a process with 10 steps. Section 3 details some steps. Section 4 focuses on linear-regression metamodels in random simulation. Section 5 contains research issues. Section 6 provides a summary.

2. The metamodeling process

We suggest the following 10 steps for developing a metamodel.

1. Determine the goal of the metamodel: Let us reconsider the four goals mentioned above. (a) Problem entity understanding: A goal may be understanding the internal behavior of a problem entity (e.g., determining the 'bottlenecks' in a system), understanding the behavior of the output, performing sensitivity analysis (i.e., determining how sensitive the output is to changes to the inputs), and conducting what-if studies. (b) Prediction: The metamodel may replace the simulation, to obtain a set of values of the output for a specific set of values of the inputs. The metamodel is then used routinely instead of the simulation, since the metamodel is usually quicker and easier to use than the simulation. (c) Optimization: The metamodel may be used to determine the set of values of the problem entity inputs that optimizes a specific objective function. This function may depend on one or more of the problem entity's outputs. (d) Aid in V & V of the simulation: see Section 1.

2. Identify the inputs and their characteristics: We distinguish between input variables (briefly: inputs) and parameters (see Zeigler, 1976). Inputs are directly observable (for example, number of servers), whereas parameters require statistical inference (for example, service rate λ). (The simulation computer program may treat these inputs and parameters as 'inputs'.) For each input we should determine whether the input is deterministic or random; also we must select a measurement scale (these scales, namely nominal, ordinal, interval, ratio, and absolute, are discussed in Kleijnen (1987, pp. 138-141)). The metamodel has 'independent variables' x, which are functions of the simulation inputs and parameters; for example, x_1 = λ²; also see Fig. 2. The independent variables of the metamodel should be identified. The simulation inputs and parameters are called factors in the design of experiments (DOE). Even a random simulation has deterministic inputs and parameters; the only random element is the seed of its pseudorandom number generator s_0, but even this seed may be fixed.

3. Specify the domain of applicability (experimental region): We should determine the combinations of values of the factors for which the metamodel is to be valid. (To obtain a valid simulation, its factor values also need to be restricted to a certain domain; see the 'experimental frame' concept in Zeigler, 1976.)

4. Identify the output variable and its characteristics: The output should be identified, and like the inputs this output must be classified as random or deterministic, and a measurement scale must be selected.

Fig. 2. Notation for I/O data of problem entity, simulation, and metamodel.


5. Specify the accuracy required of the metamodel: We specify the range of accuracy required of the output of the metamodel, for its domain of applicability, with respect to both the simulation and the problem entity. This range depends on the goal of the metamodel and also, perhaps, on the goal of the simulation study. The accuracy required of the metamodel when applied to the simulation may differ from the accuracy required when it is applied to the problem entity. The ranges of accuracy may be specified via the validity measures discussed in the next step.

6. Specify the metamodel's validity measures and their required values: We determine the metamodel's validity measures (e.g., absolute error and absolute relative error), along with their required values.

7. Specify the metamodel, and review this specification: First we select a type of metamodel; examples are polynomial regression models and neural networks (nets). Next, we select a form of the metamodel from the type selected; examples are first- or second-degree polynomials, and neural nets with different numbers of levels and nodes. The users of the metamodel should review the type and form of metamodel specified, to ensure that the particular metamodel is satisfactory for its intended goals.

8. Specify a design including tactical issues, and review the DOE: We determine the type of design to be used (e.g., 2^(k-p)) and the specific (say) n design points for which data are to be collected from the simulation. The resulting I/O data are partitioned into two parts: one for fitting and one for validating the metamodel. Determining the design is frequently called a strategic issue in DOE. We must also decide on tactical issues when the simulation is random: the number of observations (runs) at each design point must be determined. Further, we may use variance reduction techniques (e.g., common pseudorandom numbers). There may be interaction between strategic and tactical issues; for example, common and antithetic numbers may be selected following the well-known Schruben-Margolin strategy; see Donohue (1995). In this step we must finally review all DOE aspects, to determine whether they are satisfactory for the goal, type, and form of metamodel specified and for the validation of the metamodel.

9. Fit the metamodel: We run the simulation to obtain those I/O data specified in Step 8 for fitting the metamodel. (We do not yet obtain the data for determining validity; see the next step.) From these data we estimate the parameter values of the metamodel, using (say) least squares. We evaluate these estimates, using mathematical and statistical criteria. For example, if the metamodel is a linear-regression model, then a statistical analysis can determine if the metamodel is overfitted (i.e., some parameters can be set to zero). If the assessment is not satisfactory, then we repeat Steps 7-9.

10. Determine the validity of the fitted metamodel: We run the simulation to obtain the data specified in Step 8 for validating the metamodel. We then determine if the metamodel satisfies the validity measures with respect to the simulation (specified in Step 6), first for the validity data set, and then for the fitting data set. If the metamodel satisfies the validity measures for both data sets, then we consider the metamodel valid relative to the simulation; else we repeat Steps 7-10. Lastly, we determine the validity of the metamodel versus the problem entity; the amount of testing and evaluation depends on the goal of the metamodel. If validity cannot be established with respect to the problem entity, then we repeat Steps 7-10.

3. Discussion of some steps

In this section we detail some of the steps in the metamodeling process presented in Section 2. We do not devote equal space to each step: we are brief on information widely known in the simulation literature. Specifically, Section 3.1 discusses Step 1 (determine the goal of the metamodel), Section 3.2 returns to Step 6 (specify the metamodel's validity measures and their required values), Section 3.3 details Step 8 (specify a design including tactical issues, and review the DOE), Section 3.4 treats Step 9 (fit the metamodel), and Section 3.5 covers Step 10 (determine the validity of the fitted metamodel).

3.1. Step 1: Determine the goal of the metamodel

3.1.1. Goal 1 (understanding)

We distinguish the following three levels of understanding.


(a) Direction of output: does the output increase or decrease for an increase in a specific input? This goal requires at least ordinal measurement scales for input and output. For this goal a metamodel can often be simple; in fact, it may be desirable to develop several simple models instead of one complex model. The signs of the individual parameters of the metamodel should support prior knowledge about the problem entity. For example, in a queuing problem the mean waiting time should increase with the traffic rate; so if a first-order polynomial is used and x_1 denotes the traffic rate, then its estimated coefficient β̂_1 should be non-negative.

(b) Screening: give a 'short list' of the most important factors. For this goal, a rather crude metamodel may suffice (of course there is always the danger that such a crude metamodel is misleading). For example, a first-order polynomial, possibly augmented with cross-products (two-factor interactions), is used by a technique called sequential bifurcation; see Bettonvil and Kleijnen (1997). An alternative technique is discussed in Saltelli et al. (1995).

(c) Main effects, interactions, quadratic effects, and other high-order effects: what are the effects of a specific factor, possibly in combination with other factors? The type of metamodel may still be simple: first- or second-order polynomials, possibly combined with transformations such as 1/x and log(x). For example, in a case study of Flexible Manufacturing Systems (FMSs) Kleijnen and Standridge (1988) use simulation and metamodeling: the metamodel suggests that of the four machine types, only types one and two are bottlenecks, and these two do interact. Other examples are the single-server queues with different priority rules, and the queuing networks, in Cheng and Kleijnen (1999): a second-degree polynomial can explain the 'exploding' behavior of such a simulated system as the traffic rate x approaches unity; even better is the explanation by a first-order polynomial multiplied by the factor 1/(1 - x).

Regression analysis aims at more understanding than classic Analysis of Variance (ANOVA) and its extensions, namely Multiple Comparison Procedures (MCPs) and Multiple Ranking Procedures (MRPs). Indeed, in ANOVA we test the null-hypothesis that a given number of system configurations (populations) have the same mean (no factor effect at all). Regression analysis, however, aims at estimating the magnitudes of the effects and at inter- and extrapolation. MCPs aim at detecting clusters of system configurations with the same mean per cluster. MRPs try to find the best system configuration among a limited set of configurations. See Ehrman et al. (1996), Fu (1994), and Goldsman and Nelson (1998).

Low-order polynomials are simpler to understand than splines and neural nets. It is easy to comprehend main effects, two-factor interactions, and quadratic effects; they can be easily displayed graphically.

The type of metamodel selected should be appropriate for the type of understanding desired. For example, if it is desired to understand what is occurring inside some system, then a metamodel such as a polynomial over the entire domain of applicability is probably more appropriate than (say) a spline metamodel (several simple metamodels fitted to subdomains of applicability). A spline model may be the best choice if the response surface is expected to be highly complex and the understanding desired is which inputs have the most effect on the output.

3.1.2. Goal 2 (prediction)

Prediction requires either the absolute or the relative magnitude of the output. For example, the goals of the simulation study in Kleijnen (1995a) are: (a) determine whether it is worthwhile to send a ship equipped with sonar into a certain area, for a mine search mission: an absolute scale is necessary; (b) determine the relative effects of certain technical sonar parameters: an interval or ratio scale suffices. Other examples are the queuing systems in Cheng and Kleijnen (1999); these examples require up to a fourth- and sixth-degree polynomial, respectively.

In general, if the metamodel is used for prediction, then the type of metamodel must be selected with extreme care: (a) The simulation's I/O transformation to be approximated by the metamodel often has complex behavior over the domain of applicability; see Sargent (1991). (b) The accuracy required for a predictive metamodel is usually very high.


If, for example, a polynomial is used directly on the simulation data, then it probably will have to be of high order (with numerous interaction terms). However, if a good transformation of the data can be found (see Cheng and Kleijnen, 1999), then a simpler polynomial regression model can be used. Several other types of metamodels may be appropriate; for example, neural nets and splines (see Section 1).

To predict an output for a new set of input values, the simulation itself can be used if the simulation run time is not prohibitive. In real-time control, however, a metamodel may be preferred; examples are found in production scheduling and financial markets.

3.1.3. Goal 3 (optimization)

Optimization is a well-known topic in Operations Research. Closely related is goal seeking: given a target value for the output, find the corresponding input values. There are many mathematical techniques; for example, Response Surface Methodology (RSM), sequential simplex search, genetic algorithms, simulated annealing, and tabu search; see Kleijnen (1999). Metamodels for optimization that account for both the mean and the variance of the output are the focus of Taguchi's methods; see Kleijnen and Gaury (1998) and Sanchez et al. (1996).

Consider one specific technique, namely RSM (books and review articles are referenced in Khuri, 1996b, p. 377). RSM relies on low-order polynomial regression metamodels. Local marginal effects are estimated to find the direction of improvement. In the local-exploration phase, RSM uses a sequence of first-order polynomial metamodels, combined with steepest-ascent search. In the final optimization phase, RSM uses a single second-order polynomial metamodel. Clearly, optimization is more demanding than 'understanding' and less demanding than prediction.

3.1.4. Goal 4 (V & V of simulation)

We present two ways that metamodels can aid goal 4. The first way determines whether the metamodel has effects with the correct signs: does the output in the metamodel respond to changes in the inputs in the direction that agrees with prior, qualitative knowledge about the simulated problem entity? A simple academic example is a queuing simulation: waiting time increases when the traffic rate increases. Suppose that for low values of the traffic rate the metamodel is a first-order polynomial. Then the estimated parameter relating these two variables should be positive; else there is an error somewhere. Two case studies are the ecological simulation in Kleijnen et al. (1992) and the military simulation in Kleijnen (1995a).

The second way to aid this goal becomes relevant when the problem entity is easily observable. Then comparisons can be made between the simulation's and the problem entity's outputs for numerous experimental conditions over the simulation's domain of applicability. Since the collection of data on the problem entity is usually costly, it is important to do this data gathering efficiently. So it must be decided how many experimental conditions to observe, and where to locate them. This depends on the complexity of the simulation's I/O transformation over the domain of applicability. For example, whether the metamodel is linear or quadratic guides the number and location of data points needed for comparison; that is, DOE can be applied. The resulting design is a plan for comparing the problem entity to the simulation for validation of the simulation. See Chen (1985).

In general, a simulation is supposed to meet the V & V requirements only for a certain application domain. Hence, V & V of simulations requires a test plan; we propose to base that plan on DOE. But DOE is substantially aided by the use of an explicit metamodel! However, that metamodel should be validated with respect to the simulation.

3.2. Step 6: Specify the metamodel's validity measures and their required values

Fig. 1 shows that we wish to determine the validity of the metamodel versus the problem entity. We may use the approaches also used for validating a simulation. High validity requires that problem entity outputs be obtained for numerous experimental conditions. Unfortunately, this is usually impossible.


Therefore, the validity of a metamodel is determined by making (i) many comparisons between the outputs of the metamodel and the simulation, and (ii) a few comparisons between the metamodel and the problem entity, using whatever approaches are appropriate. If the problem entity is unobservable, then it is impossible to determine whether the metamodel satisfies any specific numerical accuracy with respect to the problem entity. (A special case is the simulation of a computer system: the simulation usually runs slower than the real system, if the latter exists.)

Fig. 1 further shows that the differences between the metamodel's and the problem entity's responses result from a combination of two approximations: (i) metamodel versus simulation, and (ii) simulation versus problem entity. These two approximations either add to each other, or partially cancel each other. It is not uncommon to have different ranges of accuracy required of the simulation and of the metamodel.

Whenever the simulation has continuous factors, it is impossible to compare metamodel and simulation over the metamodel's complete domain. Instead, DOE determines the data points actually used in these comparisons. A metamodel may be valid for one subdomain, and not for another. For example, in queuing simulations a first-order polynomial regression may be valid for low traffic rates, whereas higher traffic rates require more sophisticated metamodels; see Cheng and Kleijnen (1999).

A metamodel is modified until it is valid for all experimental conditions tested, using the specified validity measures and their required values. When statistical tests are used to determine validity, the probabilities of type I and type II errors can sometimes be calculated; see Balci and Sargent (1981). A type II error occurs if the variance of an estimated factor effect is high (high output variance or small total sample size). A type I error occurs if that variance is small (large sample sizes do occur in simulation).

To decide whether to accept a specific model, we need a criterion, which depends on the model's goal. Some case studies use the Absolute Relative Error or ARE (say) r_1(w, y) = |(w - y)/w|, or simply r_1 or r (simulation output w and metamodel output y; see Fig. 2).

However, this measure is deficient if w may be zero; for example, w may be the Net Present Value (NPV) in a financial analysis such as the case study in Van Groenendaal and Kleijnen (1997). Therefore we may prefer the Absolute Error or AE (say) r_2 = |w - y| (the numerator of r_1). Kleijnen (1995a) gives as an example: 'if the target is not detected within 5 m, the weapon is ineffective'.

Another popular measure is the Mean Squared Prediction Error (MSPE); see Diebold and Mariano (1995). This reference also considers more general 'economic loss' functions, for example, asymmetric functions.

Besides the selection of a criterion r, we should quantify a threshold (say) r_max, for example, r_max = 0.1.

Since a metamodel is a derivative of the underlying simulation, both may use the same criterion. So ideally, the metamodel's output y should be compared with the problem entity's output z (Fig. 2), which gives r_1(z, y), analogous to r_1(z, w).

In the remainder of this section we focus on comparing the metamodel and the simulation. The value of the criterion r(w, y) varies with the 'system configuration', determined by DOE's vector of k factor values d = (d_1, ..., d_k)':

$$r_1(w(d), y(d)) = \left|\frac{w(d) - y(d)}{w(d)}\right|. \qquad (1)$$

These values may be characterized by their mean (say) c_1 or by their maximum (say) c_2:

$$c_1 = \int_{d} r(w(d), y(d))\,\mathrm{d}d, \qquad c_2 = \max_{d} r(w(d), y(d)). \qquad (2)$$

The choice between mean and maximum again depends on the problem. For example, in the insurance business, an individual policy holder may generate a loss, but on average the insurance company makes a profit. So, if a model is used many times and small errors may compensate large errors, then the mean c_1 is adequate. Examples are the neural-network metamodels for economic and financial problems in Verkooyen (1996). If, however, a single large error is catastrophic, then the maximum c_2 is a better characteristic. Examples are nuclear simulations and econometric time-series models. (Different criteria for model selection are examined in the statistical monograph by Linhart and Zucchini, 1986.)


How can the quantities c_1 and c_2 be estimated? Different statistics may be used to estimate the mean (c_1); the most popular is the sample average (say) r̄. This average has attractive properties (maximum likelihood; minimum variance) provided certain assumptions hold (e.g., normally, identically, and independently or NID distributed r). A more robust statistic is the median of the sample of size n, say r_(n/2) (see Kleijnen, 1987). To estimate the maximum (c_2), we may take the sample maximum r_(n). Diebold and Mariano (1995) discuss several statistical tests for comparing predictive accuracy.
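These estimators can be computed directly from paired outputs at the n design points. The following Python sketch is an illustration only (the function name validity_criteria and the use of NumPy are our own, not the paper's); it computes the ARE and AE per design point and the sample-based estimators of c_1 (average and median) and c_2 (maximum), assuming the simulation outputs are nonzero whenever the ARE is used.

```python
import numpy as np

def validity_criteria(w, y):
    """Estimate the validity criteria of Eq. (2) from paired outputs.

    w : simulation outputs at the n design points (assumed nonzero for ARE)
    y : metamodel predictions at the same design points
    Returns sample estimates of c1 (mean) and c2 (maximum) for the
    Absolute Relative Error r1 and the Absolute Error r2.
    """
    w, y = np.asarray(w, float), np.asarray(y, float)
    r1 = np.abs((w - y) / w)           # ARE per design point, Eq. (1)
    r2 = np.abs(w - y)                 # AE per design point
    return {
        "c1_hat_mean_ARE": r1.mean(),        # sample average estimates c1
        "c1_hat_median_ARE": np.median(r1),  # more robust estimator of c1
        "c2_hat_max_ARE": r1.max(),          # sample maximum estimates c2
        "c1_hat_mean_AE": r2.mean(),
        "c2_hat_max_AE": r2.max(),
    }

# Example: accept the metamodel if the estimated c2 stays below rmax = 0.1:
# crit = validity_criteria(w_sim, y_meta); ok = crit["c2_hat_max_ARE"] <= 0.1
```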

Two applications of the accuracy measure ARE and its maximum, but without an explicit a priori threshold r_max, are the coal-transportation system-dynamics study in Kleijnen (1995c) and the FMS simulation in Kleijnen and Standridge (1988). Both examples are deterministic simulations. (The mean AE is used in a case study on forecasting the dollar/Dutch guilder exchange rate in Diebold and Mariano (1995); that study, however, concerns an econometric time-series model.)

The literature on random simulations focuses on the precision of a single characteristic (e.g., mean steady-state waiting time). This precision is measured by the confidence interval half-width, expressed either in absolute units (say, seconds) or as a percentage (relative precision). Such simulations have intrinsic noise: while the input vector d_i is constant, the simulation output w_i still shows variation. This variation may be measured by the variance var(w|d_i) with i = 1, ..., n. Consequently, it is no longer obvious what the simulation output w is when measuring the accuracy; see Eq. (1). Actually this output has a statistical distribution determined by the simulation. This distribution may be characterized through different quantities, such as its mean E(w) and its 90% quantile w_{0.90}. These quantities may be estimated from simulation replications (alternatives are renewal analysis, spectral analysis, and standardized time series; see Alexopoulos and Seila, 1998). Then in Eq. (1) w becomes the estimated mean (say) w̄ or the estimated quantile ŵ_{0.90;m}, with m denoting the number of replications for the factor combination. We may formulate a null-hypothesis H_0 and an alternative hypothesis H_1 (analogous to r_2 and r_max):

$$H_0:\ |E(w) - E(y)| \le d, \qquad H_1:\ |E(w) - E(y)| > d, \qquad (3)$$

where the threshold d depends on the problem. Usually such a hypothesis is tested through a Student statistic, t_m. This statistic resembles the ARE, but now the standardization is done by the standard error, var(y - w̄)^{1/2}, not by the simulation output w. Actually, several hypotheses need to be formulated, one for each of the n factor combinations. Their simultaneous testing may use Bonferroni's inequality. Also see Diebold and Mariano (1995).
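Such a per-combination test might be sketched as follows in Python with SciPy. This is an illustration only, not the authors' exact procedure: it treats each metamodel prediction as a fixed number (ignoring its estimation variance) and applies the Bonferroni correction α/n mentioned in the text; the function name threshold_tests is hypothetical.

```python
import numpy as np
from scipy import stats

def threshold_tests(w_reps, y_hat, d=0.0, alpha=0.05):
    """Test H0: |E(w_i) - E(y_i)| <= d for each of the n factor combinations,
    cf. Eq. (3), with Bonferroni control of the overall type I error.

    w_reps : list of arrays; w_reps[i] holds the m_i replicated simulation
             outputs at combination i
    y_hat  : metamodel predictions, treated here as fixed numbers (a
             simplification: their estimation variance is ignored)
    """
    n = len(w_reps)
    reject = []
    for w_i, y_i in zip(w_reps, y_hat):
        w_i = np.asarray(w_i, float)
        m = w_i.size
        se = w_i.std(ddof=1) / np.sqrt(m)              # standard error of w-bar
        t_stat = (abs(w_i.mean() - y_i) - d) / se      # exceedance over threshold d
        t_crit = stats.t.ppf(1 - alpha / n, df=m - 1)  # Bonferroni: alpha / n
        reject.append(t_stat > t_crit)
    return reject  # True flags a lack of validity at combination i
```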

Instead of the classic statistical test statistics (such as the t statistic), we may apply bootstrapping, which allows the estimation of the distribution of any statistic, for any distribution; see Efron and Tibshirani (1993). Recently, bootstrapping has been applied to the validation of both simulations and metamodels, by Kleijnen et al. (1998) and Kleijnen et al. (1999), respectively.

3.3. Step 8: Specify a design including tactical issues, and review the DOE

To save space we assume that the reader has some familiarity with classic DOE, which assumes a low-order polynomial regression metamodel for a single output with NID outputs with common variance (say) σ². For other metamodels there are no standard solutions; instead a computerized search for 'optimal' designs, given the metamodel, is performed. For example, Sacks et al. (1989) assume that the response function is smooth, and that a covariance-stationary process is a valid model for the fitting errors (the closer two factor combinations are, the more correlated the fitting errors are). They use splines as metamodels. Their designs do not show regular geometric patterns. Also see Welch et al. (1992).

We introduce some more symbols: main effects β_j (j = 1, ..., k); two-factor interactions β_{j;j'} (j < j'; j' = 2, ..., k); quadratic effects β_{j;j}; total number of regression parameters q; bold letters for matrices, including vectors. For certain k values a design is saturated; that is, n = q.


DOE implies an n × k design matrix D = (d_{i;j}). For example, a first-order polynomial metamodel is y = X_1 β_1, where X_1 is the n × (k + 1) matrix of independent variables with row vectors x_i = (1, d_{i;1}, ..., d_{i;k}); X_1 is fixed by D, since X_1 = (1, D) with 1 = (1, ..., 1)'.

Classic DOE uses Ordinary Least Squares (OLS) fitting (a mathematical criterion). So the OLS estimator β̂ is computed by fitting the specified linear-regression metamodel to a set of I/O data that consist of the n × q matrix of independent regression variables X and the n-dimensional vector of simulation outputs w = (w_i); Fig. 2 shows these data for a single factor combination. This gives β̂ = (X'X)^{-1} X'w, which assumes that the inverse exists; DOE must ensure that X (fixed by D) is indeed not collinear.
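A minimal sketch of this OLS fit, assuming NumPy and a first-order polynomial in standardized factors (the helper name ols_metamodel is ours, not from the paper):

```python
import numpy as np

def ols_metamodel(D, w):
    """Fit a first-order polynomial metamodel by OLS, as in classic DOE.

    D : n x k design matrix of standardized factor values
    w : n-vector of simulation outputs (one output per design point)
    Returns the estimate b = (X'X)^{-1} X'w with X = (1, D).
    """
    D = np.asarray(D, float)
    X = np.column_stack([np.ones(len(D)), D])   # X1 = (1, D)
    XtX = X.T @ X
    # DOE should make X orthogonal (and certainly non-collinear);
    # otherwise the inverse below does not exist.
    b = np.linalg.solve(XtX, X.T @ np.asarray(w, float))
    return b, X
```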

    Knowledge of the problem entity should inspire

    the selection of the metamodel type, which in turn

    guides the selection of the design.

Generalizations of OLS are Weighted LS (WLS) and Generalized LS (GLS). WLS uses the standard deviations of the outputs, σ_i, as weights. GLS is recommended when common pseudorandom numbers are used, so that the simulation outputs are correlated. For details on WLS and GLS we refer to Kleijnen (1999).

The goal of DOE may be to minimize the variances of the estimated factor effects β̂_j. It can be proven that an orthogonal design matrix D is optimal, given the classic assumptions. Classic design classes are resolution III or R-3, R-4, R-5, and Central Composite (CC) designs. We briefly discuss these designs because we shall apply them later; more examples and references are given in Kleijnen (1999).

By definition, R-3 gives unique estimates of the k + 1 parameters in a first-order polynomial, using only n = k + 4 - (k modulo 4) combinations; for example, k = 7 requires n = 8 = 2^(7-4).

R-4 gives unique estimates of all first-order effects, plus unique estimates of certain linear combinations of two-factor interactions. For example, when k = 8 then a 2^(8-4) design may estimate β_{1;2} + β_{4;8} + β_{3;7} + β_{5;6} (see Kleijnen, 1999, pp. 304-305). R-4 is simply constructed from R-3 through the foldover principle: to the original R-3 design matrix D, we add the mirror image -D, which yields an R-4 design with n equal to 2k rounded upward to the next power of two. For example, n = 16 when k = 5, 6, 7 or 8.
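As an illustration of the foldover principle, the following Python sketch builds a saturated 2^(7-4) R-3 design for k = 7 (using the standard generators 4 = 12, 5 = 13, 6 = 23, 7 = 123, an assumption on our part, since the paper does not list generators) and appends its mirror image to obtain an R-4 design with n = 16:

```python
import numpy as np
from itertools import product

def r3_design_k7():
    """Saturated resolution-III 2^(7-4) design: n = 8 runs for k = 7 factors.
    Columns 1-3 form a full 2^3 factorial; generators 4=12, 5=13, 6=23, 7=123
    define the remaining columns (a standard construction)."""
    base = np.array(list(product([-1, 1], repeat=3)), float)   # columns 1,2,3
    c4 = base[:, 0] * base[:, 1]
    c5 = base[:, 0] * base[:, 2]
    c6 = base[:, 1] * base[:, 2]
    c7 = base[:, 0] * base[:, 1] * base[:, 2]
    return np.column_stack([base, c4, c5, c6, c7])

def foldover(D):
    """Foldover principle: append the mirror image -D to obtain an R-4 design."""
    return np.vstack([D, -D])

D_r3 = r3_design_k7()   # 8 x 7 matrix of -1/+1 factor levels
D_r4 = foldover(D_r3)   # 16 x 7: main effects unbiased by two-factor interactions
```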

R-5 gives estimators of all main effects and all two-factor interactions that are not biased by each other. Because R-5 requires many factor combinations, these designs are used only for small values of k.

CC assumes a second-degree polynomial. A CC design is easily constructed once an R-5 design is available: simply add 2k points, changing each factor one at a time, once increasing and once decreasing each factor in turn; also add the 'center' point. A disadvantage is its high number of combinations, n ≫ q = 1 + k + k(k - 1)/2 + k, and its non-orthogonality.
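A sketch of this CC construction in Python; for illustration the two-level part defaults to the full 2^k factorial, whereas in practice an R-5 fraction would be used (the function name and the axial distance of one standardized unit are our own choices):

```python
import numpy as np
from itertools import product

def central_composite(k, factorial_part=None, axial=1.0):
    """Construct a central composite design in standardized factors.

    factorial_part : the two-level (R-5) part; defaults to the full 2^k
                     factorial for illustration
    axial          : distance of the 2k axial points from the center
    """
    if factorial_part is None:
        factorial_part = np.array(list(product([-1, 1], repeat=k)), float)
    axial_points = np.zeros((2 * k, k))
    for j in range(k):                        # change one factor at a time,
        axial_points[2 * j, j] = +axial       # once increasing
        axial_points[2 * j + 1, j] = -axial   # and once decreasing
    center = np.zeros((1, k))                 # the 'center' point
    return np.vstack([factorial_part, axial_points, center])

D_cc = central_composite(k=3)   # 8 + 6 + 1 = 15 combinations
```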

These four design classes imply a k-dimensional hypercube for the experimental domain, expressed in standardized factors. We recommend the following standardization (see Bettonvil and Kleijnen, 1997). Let the original factor j, denoted by v_j, range between lower value l_j and upper value u_j; that is, the simulation is not valid outside that range. Measure variation by the half-range h_j = (u_j - l_j)/2 and location by the mean c_j = (u_j + l_j)/2. Then the design matrix D = (d_{i;j}) uses d_{i;j} = (v_{i;j} - c_j)/h_j.

If a qualitative factor (nominal scale) has more than two levels or 'values', then several binary variables should be used for coding this factor in regression analysis. The sonar case study in Kleijnen (1995a) erroneously coded three kinds of bottom (rock, sand, mud) as the values 1, 2, and 3. Most ANOVA software helps avoid such errors.
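The standardization and the binary coding of a qualitative factor might look as follows in Python; the symbols h_j and c_j are our own notation for the half-range and center introduced above, and the function names are illustrative:

```python
import numpy as np

def standardize(V, lower, upper):
    """Standardize original factor values to the [-1, +1] hypercube.

    V            : n x k matrix of original factor values v_ij
    lower, upper : length-k vectors with the validity range (l_j, u_j)
    Uses the half-range h_j = (u_j - l_j)/2 and center c_j = (u_j + l_j)/2,
    so d_ij = (v_ij - c_j) / h_j.
    """
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    h = (upper - lower) / 2.0
    c = (upper + lower) / 2.0
    return (np.asarray(V, float) - c) / h

def dummy_code(levels, categories):
    """Code a qualitative factor with binary (0/1) variables rather than the
    erroneous values 1, 2, 3; the first category serves as the baseline."""
    return np.column_stack([(np.asarray(levels) == cat).astype(float)
                            for cat in categories[1:]])

# Example: three kinds of bottom, coded with two binary variables.
# X_bottom = dummy_code(["rock", "mud", "sand", "rock"], ["rock", "sand", "mud"])
```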

Next we focus on stagewise DOE, because in simulation the computer generates runs one after the other. Moreover, classic DOE assumes a specific polynomial regression metamodel, whereas we suppose that it is uncertain what the correct degree of the polynomial is. Therefore we recommend R-4 over R-3. The foldover principle, however, implies that in the first stage we run an R-3 design to estimate the first-order effects. Some R-3 designs are saturated, so it is impossible to apply cross-validation. However, certain values of k give a non-saturated design; for example, k = 4, 5 or 6 requires n = 8.


After we have run the R-3 design, we run the mirror design. Unlike classic DOE, we may execute the combinations of the mirror design one after another, and apply cross-validation. Once we have run all combinations of the mirror design, we compute sums of certain two-factor interactions. These tell which interactions explain the lack of fit of the first-order metamodel. For example, suppose factor 1 has an unimportant first-order effect and this factor is not involved in an important sum of interactions β_{1;2}, ..., β_{1;k}. Then we may remove this factor, to avoid overfitting. As in any modeling effort, there are no foolproof recipes: factor 1 may still have a quadratic effect.

It may take too much time to run the mirror design. Then we may run only the center point.

If we find out that a second-order polynomial metamodel is needed, then we expand the R-3 or R-4 design to a CC design. Again, stagewise experimentation is possible: add 2k axial points plus the center point. The R-4 design may not be a proper subset of the CC design; then some combinations may be used for checking the fit of the second-order polynomial metamodel.

Once we have run a CC design and fitted a second-order polynomial, we check this model's fit: third-order effects may be important. As new check-points we take combinations not yet simulated: part of the CC design is a fractional two-level factorial (e.g., 2^(k-p)), assuming k > 4 (see Kleijnen, 1999, p. 309); hence a fraction of the full 2^k factorial is not yet simulated. Which check-points to select may be determined randomly, if no other information is available.

The CC's check-points require extrapolation, whereas the center point implies interpolation. In general, a metamodel is more reliable when used for interpolation; it may be dangerous when used to extrapolate the simulated behavior far outside the domain simulated in the DOE.

3.4. Step 9: Fit the metamodel

After the parameters of a metamodel have been estimated, the fit of the metamodel should be evaluated by using one or more measures. The lack-of-fit measures discussed in this section may be applied to any type of metamodel (linear regression, neural net, etc.), fitted to either random or deterministic simulation output (Section 4 will focus on random simulations and linear-regression metamodels). Random simulation usually has several observations per combination, say m_i. So the total number of simulation observations (say) N equals Σ_{i=1}^{n} m_i. Let ŷ_i denote the metamodel's output for the estimated parameter vector β̂; let w_{i;r} denote simulation observation r for combination i; w̄ is the average of all N simulation outputs.

The classic measure is the coefficient of determination

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\sum_{r=1}^{m_i} (\hat{y}_i - w_{i,r})^2}{\sum_{i=1}^{n}\sum_{r=1}^{m_i} (w_{i,r} - \bar{w})^2}. \qquad (4)$$

R² equals one (perfect fit) if all N metamodel outputs equal their corresponding simulation outputs, ŷ_i = w_{i;r}, which holds if N = q (saturated design with m_i = 1). Because R² always increases as more regression variables are added (higher q), the adjusted R² is introduced:

$$R^2_{\mathrm{adjusted}} = 1 - (1 - R^2)\frac{N - 1}{N - q}. \qquad (5)$$

Unfortunately, there is no simple lower threshold for this criterion; bootstrapping may help.
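A small Python sketch of Eqs. (4) and (5), assuming replicated simulation outputs are available per design point (the function names are illustrative, not the paper's):

```python
import numpy as np

def r_square(y_hat, w_reps):
    """Coefficient of determination, Eq. (4).

    y_hat  : metamodel outputs for the n design points
    w_reps : list of arrays; w_reps[i] holds the m_i simulation outputs
             observed at design point i
    """
    all_w = np.concatenate([np.asarray(w, float) for w in w_reps])
    w_bar = all_w.mean()                         # average of all N outputs
    sse, sst = 0.0, 0.0
    for y_i, w_i in zip(y_hat, w_reps):
        w_i = np.asarray(w_i, float)
        sse += np.sum((y_i - w_i) ** 2)          # metamodel vs. simulation
        sst += np.sum((w_i - w_bar) ** 2)        # simulation vs. grand mean
    return 1.0 - sse / sst

def adjusted_r_square(r2, N, q):
    """Adjusted R^2 = 1 - (1 - R^2)(N - 1)/(N - q), Eq. (5)."""
    return 1.0 - (1.0 - r2) * (N - 1) / (N - q)
```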

The linear correlation coefficient ρ was originally developed to characterize a bivariate normal variable (say) (w, y): all observations lie on a straight line with intercept a_0 and slope a_1 given by

$$a_0 = E(y) - a_1 E(w), \qquad a_1 = \rho(w, y)\sqrt{\operatorname{var}(y)/\operatorname{var}(w)}. \qquad (6)$$

Hence, if ρ(w, y) = 1, and y and w have equal means and equal variances, then this line has slope one and intercept zero. In metamodeling, however, this measure must be interpreted with care: in Fig. 2 the variables w and y are not bivariate normal. Nevertheless, y can be regressed on w. Ideally, simulation and metamodel outputs are equal (see R²), which means a zero intercept and unit slope. A case study on gas transmission in Indonesia indeed uses such a plot; see Van Groenendaal and Kleijnen (1997).


Cross-validation ('fitting to subsets' would be a better term) uses the fitted metamodel to predict the outcomes for new combinations of the simulation, and to compare these predictions with the corresponding simulation responses. 'Leave-one-out' cross-validation eliminates one combination (instead of adding new combinations); re-estimates the metamodel from the remaining n - 1 combinations, assuming n - 1 observations suffice (non-saturated design); recomputes the forecast ŷ and compares this forecast with the simulation output w; and repeats this elimination for a different combination. In practice, the relative prediction errors ŷ/w are often 'eye-balled', given the goals of the model. Examples are the deterministic simulations for an FMS in Kleijnen and Standridge (1988) and for a coal transport system in Kleijnen (1995c). (We shall discuss cross-validation for random simulation in Section 4.)
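Leave-one-out cross-validation for a linear-regression metamodel can be sketched as follows (Python/NumPy; the brute-force refitting shown here is replaced by a hat-matrix shortcut in Section 4, and the function name is our own):

```python
import numpy as np

def leave_one_out(X, w):
    """Leave-one-out cross-validation for a linear-regression metamodel.

    X : n x q matrix of regression variables (n > q, non-saturated design)
    w : n-vector of simulation outputs (one per design point)
    Returns the cross-validation predictions y_(-i) and the relative
    prediction errors y_(-i)/w_i that are often 'eye-balled' in practice.
    """
    X, w = np.asarray(X, float), np.asarray(w, float)
    n = len(w)
    y_loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # delete combination i
        b_i, *_ = np.linalg.lstsq(X[keep], w[keep], rcond=None)
        y_loo[i] = X[i] @ b_i                        # re-forecast point i
    return y_loo, y_loo / w
```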

Only if the metamodel as a whole fits well (measured as explained above) should analysts examine individual parameters. In cross-validation they may observe how these estimated parameters change as simulated combinations are deleted; if the specified metamodel is a good approximation, then these estimates remain stable. Case studies are the FMS and the coal-transport studies mentioned above. In these polynomial metamodels the parameters can be interpreted as factor effects, so the metamodel can be used for explanation, whereas the ratios ŷ/w concern prediction. Other metamodels (e.g., neural nets) are harder to interpret.

Eliminating noisy parameters may decrease the variance of the remaining parameter estimators β̂ and of the corresponding predictions ŷ. For example, for an M/M/1 simulation Cheng and Kleijnen (1999) fit a first-order and a second-order polynomial, respectively; the former has a variance seven times smaller than the overfitted metamodel! Elimination of unimportant factors may also increase the adjusted R². Kleijnen and Standridge (1988) apply cross-validation to a first-order polynomial with k = 4, which gives unstable estimates for factors 1 and 3 and high values for the relative prediction errors. Therefore they replace the first-order polynomial by a second-order polynomial in only the k = 2 stable factors; they set the non-stable factor effects to zero. Now they get an adequate metamodel: stable effects β̂_2, β̂_4, β̂_{2;4} and low prediction errors. In general, the mission of science is to come up with simple explanations: parsimony or Occam's razor.

3.5. Step 10: Determine the validity of the fitted metamodel

The effort devoted to validating a metamodel depends on the goal of this metamodel. If the goal is 'understanding', then a 'reasonable' effort should be spent. However, for 'prediction' extensive validation should be performed. In case of 'optimization', a series of metamodels is usually developed; validity testing may be limited to the last few metamodels, which are tested extensively. Finally, if the goal is 'aiding in V & V of a simulation', then validation is usually conducted only with respect to the simulation (not the problem entity).

The data for validating a metamodel versus the simulation must first be generated. In Step 8 we developed a DOE for data generation in two parts: one for fitting, and one for validating. Prior to generating the data for validation, the DOE needs to be reviewed. It may need modification because the original design was 'expanded' in fitting and checking the fit. The DOE should allow testing, during validation, for a higher-order model than the fitted metamodel. After the metamodel has passed the validity tests with the new data, the data for fitting may also be used for additional validity testing. This is especially worthwhile if the measures used in validation and fitting are different.

Next we conduct validation of the metamodel versus the problem entity. The extent of this validation again depends on the goal of the metamodel. The same methods used for validating simulations (see Section 1) can be used for validating metamodels versus the problem entity. The criteria c_1 and c_2 in Eq. (2), and the statistical formulas given above, can now be used for testing validity, but the symbol w (simulation output) is now replaced by z (problem entity output). Suppose that the problem entity is observable and data have been collected on it and used for validating the simulation.


Then it seems reasonable to use these same data when validating the metamodel versus the problem entity. The resulting statistical confidence level seems to deserve more research.

4. A procedure for linear-regression metamodeling in random simulation

In this section we focus on random simulations and linear-regression metamodels, highlighting Steps 8-10. If this metamodel is estimated through OLS, the variance of its output is var(ŷ) = x' cov(β̂) x; for GLS we refer to Kleijnen (1999). Denoting cross-validation with elimination of I/O combination i by the subscript -i gives

$$\hat{\beta}_{-i} = (X_{-i}' X_{-i})^{-1} X_{-i}' w_{-i}, \quad i = 1, \ldots, n. \qquad (7)$$

Actually, these n recomputations can be avoided, using the original β̂ and the so-called hat matrix X(X'X)^{-1}X' (see Kleijnen and Van Groenendaal, 1992, pp. 156-157). Using β̂_{-i} gives the predictor ŷ_{-i}. The Studentized cross-validation prediction error is

$$t^{(i)} = \frac{w_i - \hat{y}_{-i}}{\{\operatorname{var}(w_i) + \operatorname{var}(\hat{y}_{-i})\}^{1/2}}, \qquad (8)$$

where the degrees of freedom of this Student statistic are unknown; Kleijnen (1992) uses m - 1. To control the type I error rate, we apply Bonferroni's inequality: replace α by α/n. Diagnostic statistics in modern regression software related to our cross-validation measures are PRESS, DFFITS, DFBETAS, and Cook's D.
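A hedged Python sketch of this shortcut: it uses the hat matrix to obtain the deleted prediction errors w_i - ŷ_{-i} from a single fit and studentizes them in a simplified way. The approximation of var(ŷ_{-i}) below, and the function name, are our own and are not claimed to be the paper's exact formulas.

```python
import numpy as np

def studentized_cv_errors(X, w_bar, var_w_bar):
    """Cross-validation without n refits, using the hat matrix
    H = X (X'X)^{-1} X' (cf. Eqs. (7)-(8), in simplified form).

    X         : n x q matrix of regression variables
    w_bar     : n-vector of average simulation outputs per combination
    var_w_bar : n-vector with the estimated variances of those averages
                (e.g. s_i^2 / m_i from the replications)
    The identity w_i - y_(-i) = e_i / (1 - h_ii) gives the deleted prediction
    errors from a single fit; var(y_(-i)) is approximated below by
    h_ii/(1 - h_ii) times var(w_bar_i), which assumes roughly common variance.
    """
    X = np.asarray(X, float)
    w_bar = np.asarray(w_bar, float)
    var_w_bar = np.asarray(var_w_bar, float)
    H = X @ np.linalg.solve(X.T @ X, X.T)          # hat matrix
    h = np.diag(H)
    e = w_bar - H @ w_bar                          # ordinary residuals
    e_del = e / (1.0 - h)                          # w_i - y_(-i)
    var_y_del = var_w_bar * h / (1.0 - h)          # rough variance of y_(-i)
    t = e_del / np.sqrt(var_w_bar + var_y_del)
    # Bonferroni: compare each |t| with the alpha/n critical value; the
    # degrees of freedom are debatable (Kleijnen, 1992 uses m - 1).
    return t, h
```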

An alternative is the F lack-of-fit test, which compares two estimators of the common variance of the simulation outputs (σ²). The first estimator is based on replication:

$$s_i^2 = \sum_{j=1}^{m_i} (w_{i,j} - \bar{w}_i)^2 / (m_i - 1), \qquad (9)$$

where w̄_i is the average of the m_i independent simulation replications at combination i. The weighted average of these n estimators is Σ_{i=1}^{n} (m_i - 1) s_i² / (N - n). The second variance estimator is Σ_{i=1}^{n} m_i e_i² / (n - q), with the estimated residuals e_i = w̄_i - ŷ_i. The latter estimator is unbiased only if the regression model is specified correctly; otherwise it overestimates. The two estimators are compared through F_{n-q, N-n}. Rao (1959) extends this test from OLS to GLS. Kleijnen (1992) shows that Rao's test is better than cross-validation if the simulation responses are symmetrically distributed, for example, normally or uniformly distributed; lognormally distributed responses are better analyzed through cross-validation (or bootstrapping of Rao's statistic; see Kleijnen et al. (1999)).
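A sketch of this F lack-of-fit test in Python/SciPy, under the reconstruction of the two variance estimators given above (an assumption on our part, since the original formulas are only partly legible in this transcript):

```python
import numpy as np
from scipy import stats

def lack_of_fit_F(w_reps, y_hat, q):
    """Classic F lack-of-fit test comparing two estimators of the common
    variance of the simulation outputs (cf. Eq. (9) and the text).

    w_reps : list of arrays; w_reps[i] holds the m_i replications at point i
    y_hat  : metamodel predictions for the n design points
    q      : number of regression parameters in the metamodel
    """
    n = len(w_reps)
    m = np.array([len(w) for w in w_reps])
    N = m.sum()
    w_bar = np.array([np.mean(w) for w in w_reps])
    s2 = np.array([np.var(w, ddof=1) for w in w_reps])        # Eq. (9)
    pure_error = np.sum((m - 1) * s2) / (N - n)                # replication-based
    lack_of_fit = np.sum(m * (w_bar - np.asarray(y_hat)) ** 2) / (n - q)
    F = lack_of_fit / pure_error
    p_value = stats.f.sf(F, n - q, N - n)                      # F_{n-q, N-n}
    return F, p_value
```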

Most simulations have multiple outputs (e.g., customer's queuing time and server's idle time; mean and 90% quantile). In practice, multiple outputs are handled through the application of the techniques of this paper per output type. Actually, for linear-regression metamodels Khuri (1996b, p. 385) proves that the best linear unbiased estimator (BLUE) of the factor effects for the various responses remains the same as the OLS estimators obtained from fitting per individual output, assuming a single experimental design is used for the data generation of all outputs. Ideally, the design should account for the presence of multiple outputs; see Khuri (1996a, b) and Kleijnen (1987). Simultaneous inference may be based on Bonferroni's inequality. Optimization in the presence of multiple responses is discussed in Kleijnen (1999).

If a metamodel does not pass the fitting and validation tests, then instead of adding interactions and simulating the combinations of the augmented design, we may try transformations or disaggregate an independent variable into its components (e.g., disaggregate traffic rate into arrival rate and service rate).

In Fig. 3 we give a flowchart for this metamodeling process, focusing on Steps 8-10 restricted to linear-regression metamodels. We add the following comments.

Blocks 13, 16, and 19 state 'Go back to 1'. Obviously, in such an iteration a step may degenerate (e.g., Step 3, select type of design, may use exactly the same design as the one used in the preceding iteration). A practical example is the FMS case study in Kleijnen and Standridge (1988): a first-order polynomial in four factors is rejected, and replaced by a first-order polynomial in only two of the original four factors, augmented with the interaction between these two remaining factors; the same design with its I/O simulation data is used.


Fig. 3. A procedure for linear-regression metamodeling in random simulation.


Goals may affect certain blocks; for example, optimization through RSM implies that block 2 will always select a first-order polynomial in the first iterations. Yet we feel that the figure is suited to all four goals of metamodeling: this figure is based on years of experience with linear regression for all four goals of metamodeling.

After validating the metamodel versus the simulation, this metamodel should be validated against the problem entity; to save space, these steps are not displayed in the figure.

5. Research issues

A few academic studies use mathematical norms other than L2 (least squares); see Kleijnen (1987). Well-known alternative norms are L1 (sum of absolute fitting errors) and L∞ (Tchebyshev, maximum error):

$$L_1:\ \min_{\beta} \sum_{i=1}^{n} |w_i - y_i(\beta)|, \qquad L_\infty:\ \min_{\beta} \max_{i} |w_i - y_i(\beta)|. \qquad (10)$$

Current practice is as follows. Simulationists use a convenient mathematical norm to fit a specific metamodel to the simulation I/O data. Next, they often use a different norm when they examine the resulting fit and validity of the metamodel! Examples are provided by the case studies mentioned above: Kleijnen (1995a) and Kleijnen and Standridge (1988) use L2 (OLS) to compute the estimated parameter vector β̂, but L∞ to decide on the adequacy of the resulting metamodel. Verkooyen (1996), however, uses the same norm, namely L2, in both fitting and validating his neural net. No research has yet been done on how to compute the metamodel's parameters such that c_1 (mean inaccuracy in validation; see Eq. (2)) or c_2 (maximum) is minimized.
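For illustration, both alternative norms of Eq. (10) can be minimized numerically; the following Python sketch uses a generic Nelder-Mead search from SciPy starting at the OLS solution (dedicated linear-programming formulations exist for both norms and would be preferred in practice; the function name is ours):

```python
import numpy as np
from scipy.optimize import minimize

def fit_with_norm(X, w, norm="L1"):
    """Fit a linear metamodel y = X b under the L1 or L-infinity norm of
    Eq. (10), starting from the OLS (L2) solution.

    This is a generic numerical sketch; it does not reproduce any fitting
    procedure from the paper.
    """
    X, w = np.asarray(X, float), np.asarray(w, float)
    b0, *_ = np.linalg.lstsq(X, w, rcond=None)        # L2 starting point
    if norm == "L1":
        loss = lambda b: np.sum(np.abs(w - X @ b))    # sum of absolute errors
    else:                                             # "Linf": Tchebyshev
        loss = lambda b: np.max(np.abs(w - X @ b))    # maximum absolute error
    res = minimize(loss, b0, method="Nelder-Mead")
    return res.x
```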

    In random simulations, cross-validation and

    the lack-of-t F-statistic are usually applied to test

    zero lack of t. However, we may wish to test for a

    non-zero threshold (d > 0 in Eq. (3)).

    Accuracy of metamodel versus problem entity

    (not simulation) has not been investigated in the

    literature.

    The four dierent goals of metamodeling may

    require dierent metamodel types and dierent

    accuracies. Fig. 3 neglects this problem; more re-

    search is needed.

    Multi-variate regression analysis and multi-

    variate DOE for metamodeling certainly deserve

    more research.

A procedure such as that outlined in Fig. 3 (linear

    regression in random simulation) should also be

    developed for other metamodel types such as

    neural nets and splines.

    6. Summary

    Because metamodels are often used in an ad

hoc way, we derived a methodology for developing metamodels with the following characteristics: (i) We strictly distinguished between fitting and vali-

    dating. (ii) We emphasized the role of the problem

    entity. (iii) We distinguished four types of goal.

    This yielded a metamodeling process with 10 steps.

    We also gave a detailed procedure for validating

    linear-regression metamodels in random simula-

    tion, suitable to all four goals.

    We covered several metamodel types, including

    linear regression and neural nets, but we focused

on polynomial metamodels. The four types of goals are: (i) understanding, (ii) prediction, (iii)

    optimization, and (iv) V & V of the underlying

    simulation.

    Our methodology with 10 steps includes classic

DOE and standard measures of fit, such as the R-square coefficient and various cross-validation

    measures. The methodology, however, also in-

    cludes stagewise DOE and several validation cri-

    teria, measures, and estimators.

    Acknowledgements

CentER provided financial support enabling

    R.G. Sargent to visit Tilburg University during

    one month for research on the topic of this paper.

    References

Alexopoulos, C., Seila, A.F., 1998. Output data analysis. In: Banks, J. (Ed.), Handbook of Simulation. Wiley, New York.


Balci, O., Sargent, R.G., 1981. A methodology for cost-risk analysis in the statistical validation of simulation models. Communications of the ACM 24 (4), 190–197.
Barton, R.R., 1993. New tools for simulation metamodels. IMSE Working Paper 93-110, Department of Industrial and Management Systems Engineering, Penn State University, University Park, PA 16802.
Barton, R.R., 1994. Metamodeling: A state of the art review. In: Tew, J.D., Manivannan, S., Sadowski, D.A., Seila, A.F. (Eds.), Proceedings of the 1994 Winter Simulation Conference, pp. 237–244.
Beck, M.B., Ravetz, J.R., Mulkey, L.A., Barnwell, T.O., 1997. On the problem of model validation for predictive exposure assessments. Stochastic Hydrology and Hydraulics 11, 229–254.
Bettonvil, B., Kleijnen, J.P.C., 1997. Searching for important factors in simulation models with many factors: Sequential bifurcation. European Journal of Operational Research 96 (1), 180–194.

    Chen, B., 1985. A statistical validation procedure for discrete

    event simulation over experimental regions. Ph.D. Disser-

    tation, Department of Industrial Engineering and Opera-

    tions Research, Syracuse University, Syracuse, NY.

    Cheng, R.C.H., Kleijnen, J.P.C., 1999. Improved designs of

    queuing simulation experiments with highly heteroscedastic

    responses. Operations Research (forthcoming).

Diebold, F.X., Mariano, R.S., 1995. Comparing predictive accuracy. Journal of Business and Economic Statistics 13 (3), 253–263.
Donohue, J.M., 1995. The use of variance reduction techniques in the estimation of simulation metamodels. In: Alexopoulos, C., Kang, K., Lilegdon, W.R., Goldsman, D. (Eds.), Proceedings of the 1995 Winter Simulation Conference, pp. 195–199.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman and Hall, New York.
Ehrman, C.M., Hamburg, M., Krieger, A.M., 1996. A method for selecting a subset of alternatives for future decision making. European Journal of Operational Research 96, 407–416.

    Friedman, L.W., 1996. The Simulation Metamodel. Kluwer,

    Dordrecht, The Netherlands.

Fu, M.C., 1994. Optimization via simulation: A review. Annals of Operations Research 53, 199–247.
Goldsman, D., Nelson, B.L., 1998. Comparing systems via simulation. In: Banks, J. (Ed.), Handbook of Simulation. Wiley, New York.
Huber, K.P., Berthold, M.R., Szczerbicka, H., 1996. Analysis of simulation models with fuzzy graph based metamodeling. Performance Evaluation 27/28, 473–490.
Khuri, A.I., 1996a. Analysis of multiresponse experiments: A review. In: Ghosh, S. (Ed.), Statistical Design and Analysis of Industrial Experiments. Marcel Dekker, New York, pp. 231–246.
Khuri, A.I., 1996b. Multiresponse surface methodology. In: Ghosh, S., Rao, C.R. (Eds.), Handbook of Statistics, vol. 13. Elsevier, Amsterdam, pp. 377–406.

    Kleijnen, J.P.C., 1987. Statistical tools for simulation practi-

    tioners. Marcel Dekker, New York.

Kleijnen, J.P.C., 1992. Regression metamodels for simulation with common random numbers: comparison of validation tests and confidence intervals. Management Science 38 (8), 1164–1185.
Kleijnen, J.P.C., 1995a. Case study: Statistical validation of simulation models. European Journal of Operational Research 87 (1), 21–34.
Kleijnen, J.P.C., 1995b. Verification and validation of simulation models. European Journal of Operational Research 82 (1), 145–162.
Kleijnen, J.P.C., 1995c. Sensitivity analysis and optimization of system dynamics models: Regression analysis and statistical design of experiments. System Dynamics Review 11 (4), 275–288.
Kleijnen, J.P.C., 1999. Experimental design for sensitivity analysis, optimization, and validation of simulation models. In: Banks, J. (Ed.), Handbook of Simulation. Wiley, New York. (Preprint: CentER Discussion Paper, no. 9752.)
Kleijnen, J.P.C., Gaury, E., 1998. Risk analysis of robust system design. In: Medeiros, D.J., Watson, E.F., Carson, J.S., Manivannan, M.S. (Eds.), Proceedings of the 1998 Winter Simulation Conference, pp. 1533–1540.
Kleijnen, J.P.C., Standridge, C., 1988. Experimental design and regression analysis: An FMS case study. European Journal of Operational Research 33 (3), 257–261.

    Kleijnen, J.P.C., van Groenendaal, W., 1992. Simulation: A

    statistical perspective. Wiley, Chichester.

    Kleijnen, J.P.C., Cheng, R.C.H., Bettonvil, B., 1998. Validation

    of trace-driven simulation models: bootstrapped tests.

    Working Paper.

Kleijnen, J.P.C., Feelders, A.J., Cheng, R.C.H., 1998. Bootstrapping and validation of metamodels in simulation. In: Medeiros, D.J., Watson, E.F., Carson, J.S., Manivannan, M.S. (Eds.), Proceedings of the 1998 Winter Simulation Conference, pp. 701–706.
Kleijnen, J.P.C., Van Ham, G., Rotmans, J., 1992. Techniques for sensitivity analysis of simulation models: a case study of the CO2 greenhouse effect. Simulation 58 (6), 410–417.

    Linhart, H., Zucchini, W., 1986. Model selection. Wiley, New

    York.

Pierreval, H., 1996. A metamodel approach based on neural networks. International Journal in Computer Simulation 6 (3), 365–378.
Rao, C.R., 1959. Some problems involving linear hypothesis in multivariate analysis. Biometrika 46, 49–58.
Sacks, J., Welch, W.J., Mitchell, T.J., Wynn, H.P., 1989. Design and analysis of computer experiments (includes comments and rejoinder). Statistical Science 4 (4), 409–435.
Saltelli, A., Andres, T.H., Homma, T., 1995. Sensitivity analysis of model output; performance of the iterated fractional factorial design method. Computational Statistics and Data Analysis 20, 387–407.
Sanchez, S.M., Sanchez, P.J., Ramberg, J.S., Moeeni, F., 1996. Effective engineering design through simulation. International Transactions in Operational Research 3 (2), 169–185.


Sargent, R.G., 1991. Research issues in metamodeling. In: Nelson, B.L., Kelton, W.D., Clark, G.M. (Eds.), Proceedings of the 1991 Winter Simulation Conference, pp. 888–893.
Sargent, R.G., 1996. Verifying and validating simulation models. In: Charnes, J.M., Morrice, D.M., Brunner, D.T., Swain, J.J. (Eds.), Proceedings of the 1996 Winter Simulation Conference, pp. 55–64.
Schlesinger, S., et al., 1979. Terminology for model credibility. Simulation 32 (3), 103–104.
Van Groenendaal, W.J.H., Kleijnen, J.P.C., 1997. On the assessment of economical risk: Factorial design versus Monte Carlo methods. Journal of Reliability Engineering and Systems Safety 57 (1), 103–105.
Verkooyen, W.J.H., 1996. Neural networks in economic modelling. CentER, Tilburg University, Tilburg.
Welch, W.J., Buck, R.J., Sacks, J., Wynn, H.P., et al., 1992. Screening, predicting, and computer experiments. Technometrics 34 (1), 15–25.
Yu, B., Popplewell, K., 1994. Metamodel in manufacturing: A review. International Journal of Production Research 32 (4), 787–796.

    Zeigler, B., 1976. Theory of Modelling and Simulation. Wiley/

    Interscience, New York.
