
Simulation Optimization Using the Cross-Entropy Method with Optimal Computing Budget Allocation

DONGHAI HE

George Mason University

LOO HAY LEE

National University of Singapore

CHUN-HUNG CHEN

George Mason University

MICHAEL C. FU

University of Maryland, College Park

and

SEGEV WASSERKRUG

IBM Haifa Research Lab

We propose to improve the efficiency of simulation optimization by integrating the notion of optimal computing budget allocation into the Cross-Entropy (CE) method, which is a global optimization search approach that iteratively updates a parameterized distribution from which candidate solutions are generated. This article focuses on continuous optimization problems. In the stochastic simulation setting where replications are expensive but noise in the objective function estimate could mislead the search process, the allocation of simulation replications can make a significant difference in the performance of such global optimization search algorithms. A new allocation scheme

This work has been supported in part by NSF under Grants IIS-0325074, DMI-0540312, and DMI-0323220, by NASA Ames Research Center under Grants NAG-2-1643 and NNA05CV26G, by AFOSR under Grant FA95500410210, and by the Department of Energy under Award DE-SC0002223.
Authors' addresses: D. He, Department of Systems Engineering and Operations Research, George Mason University, 4400 University Drive, Fairfax, VA 22030; L. H. Lee, Department of Industrial and Systems Engineering, National University of Singapore, 21 Lower Kent Ridge Road, Singapore, 119077, Singapore; C.-H. Chen (corresponding author), Department of Systems Engineering and Operations Research, George Mason University, 4400 University Drive, Fairfax, VA 22030; email: [email protected]; M. C. Fu, Decision, Operations, and Information Technologies Department, Robert H. Smith School of Business, University of Maryland, College Park, MD 20742; S. Wasserkrug, IBM Haifa Research Lab, Haifa.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2010 ACM 1049-3301/2010/01-ART4 $10.00
DOI 10.1145/1667072.1667076 http://doi.acm.org/10.1145/1667072.1667076

ACM Transactions on Modeling and Computer Simulation, Vol. 20, No. 1, Article 4, Publication date: January 2010.


is developed based on the notion of optimal computing budget allocation. The proposed approach improves the updating of the sampling distribution by carrying out this computing budget allocation in an efficient manner, minimizing the expected mean-squared error of the CE weight function. Numerical experiments indicate that the computational efficiency of the CE method can be substantially improved if the ideas of computing budget allocation are applied.

Categories and Subject Descriptors: I.6.6 [Simulation and Modeling]: Simulation Output Analysis

General Terms: Algorithms

Additional Key Words and Phrases: Simulation optimization, computing budget allocation, cross-entropy method, estimation of distribution algorithms

ACM Reference Format: He, D., Lee, L. H., Chen, C.-H., Fu, M. C., and Wasserkrug, S. 2010. Simulation optimization using the cross-entropy method with optimal computing budget allocation. ACM Trans. Model. Comput. Simul. 20, 1, Article 4 (January 2010), 22 pages. DOI = 10.1145/1667072.1667076 http://doi.acm.org/10.1145/1667072.1667076

1. INTRODUCTION

We consider simulation optimization for the setting where the search space is a continuous set, and where the objective function is estimated with noise, for example, using simulation to estimate the expected value of an output performance measure of a stochastic system. Traditional approaches to this problem include stochastic approximation, sample average approximation, and response surface methodology/metamodeling (see Rubinstein and Shapiro [1993], Fu [1994, 2002, 2006], Andradottir [1998], Spall [2003], Swisher et al. [2003], Barton and Meckesheimer [2006], and references therein). More recent approaches include adaptation of random search [Andradottir 2006], nested partitions [Shi and Olafsson 2000], and the COMPASS approach of Hong and Nelson [2006]; see also Nelson et al. [2001].

A primary reason that simulation optimization is difficult is the stochastic nature of evaluating the objective function. There is a basic trade-off between devoting computational effort to searching the space for new candidate solutions (exploration) versus getting more accurate estimates of the objective function at currently promising solutions (exploitation). In other words, how much of a simulation budget should be allocated to additional replications at already visited points, and how much to replications at newly generated points, is a major consideration in terms of computational efficiency. Our work aims to contribute to answering this question. Note that this differs from another important issue, making statistical statements about candidate solutions visited during the search, which is addressed in Boesel et al. [2003].

As noted in Fu [2002], commercial software has implemented population-based algorithms from deterministic optimization based on ideas from evolutionary approaches such as genetic algorithms and from metaheuristics such as tabu search (see Olafsson [2006]). A more recent development along the lines of population-based methods is the class of global optimization algorithms called Estimation of Distribution Algorithms (EDAs) [Larranaga and Lozano 2001], which work with a probability distribution over the solution space. A population


of candidate solutions is generated from the probability distribution, the solutions are evaluated, and then a subset of the solutions is used to update the probability distribution in such a way that the distribution concentrates more weight on promising regions. A particularly promising approach along these lines is the Cross-Entropy (CE) method introduced by Rubinstein [1999], initially for finding the optimal importance sampling measure in rare-event simulation, and then applied to optimization problems. When any of the preceding approaches is applied to the setting where the evaluation of the objective function (performance measure) is noisy, such as in stochastic simulation, there is the question of how to best allocate simulation replications among candidate solutions.

Theoretical convergence proofs for distribution-based algorithms include the following. For deterministic global optimization, Zhang and Muhlenbein [2004] show the global convergence of EDAs with infinite populations when the distribution updated from the selected candidate solutions is exactly the sample distribution from which the candidate solutions are generated. Rubinstein and Kroese [2004] and Margolin [2004] prove the asymptotic convergence of variants of CE for some special cases, also in the deterministic optimization setting. For simulation optimization problems, Hu et al. [2006] show the global convergence of CE-like methods assuming that the number of simulation replications and the number of candidate solutions at each iteration increase at a particular rate. They also demonstrate that if these parameters are kept fixed, convergence to global or local optima cannot be guaranteed.

In this article, we focus on the efficiency issue in simulation optimization using the CE method. Traditionally, all of the candidate solutions generated at a given iteration are simulated equally, although there has been some recent work aimed at efficiency improvements. The most relevant work is Chepuri and Homem-de-Mello [2005], which provides a simple heuristic sampling scheme to determine the number of simulation replications in each iteration of the CE method. For network reliability problems, Kroese and Hui [2006] provide a synchronous construction ranking scheme, which is similar to the technique of common random numbers and so can reduce the variance of the estimator.

On the other hand, in the Ranking and Selection (R&S) literature, there are several approaches to efficiently allocate simulation replications among competing candidate "designs." In our setting, "designs" correspond to candidate solutions of the optimization problem. In the R&S setting, all candidates must be simulated, so the number of choices has to be relatively small. Most of the effort has concentrated on finding the best candidate solution, or some top candidates, among a fixed set of alternatives (refer to Chen et al. [1997, 2000, 2008], Chick and Inoue [2001a, 2001b], He et al. [2007], Kim and Nelson [2006], Branke et al. [2007], and Fu et al. [2007]). Among these approaches, the Optimal Computing Budget Allocation (OCBA) introduced in Chen et al. [2000, 2008] is most relevant to this article. OCBA aims to maximize the probability of correctly selecting the best candidate (or the best subset).

To improve efficiency for simulation optimization problems in which not all the possible solutions can be simulated, some R&S procedures have been combined with population-based algorithms. For example, Shi et al. [1999], Shi and


Chen [2000], and Chew et al. [2008] combine OCBA or multi-objective OCBA with the nested partitions method, in which a ranking-and-selection problem is formulated in every iteration. In addition, Chen et al. [2008] offer an efficient procedure to select a best subset, which is useful for some population-based algorithms such as the CE method or the population-based incremental learning method (PBIL; see Rudlof and Koppen [1996]).

In this article, we investigate a different computing budget allocation approach. With CE as the sampling method, instead of focusing on the probability of correctly selecting the best candidate or subset, as in typical ranking and selection procedures, we consider an objective designed for the performance of the CE method and seek an efficient computing budget allocation under which this objective is optimized. This notion has the potential to be applied to several other EDAs, although further research is needed. Specifically, we derive an asymptotically optimal allocation that minimizes the expected mean-squared error of the CE weight function, based on the ideas of OCBA applied separately to each individual iteration of the CE method. This new allocation scheme is called the Cross-Entropy with Optimal Computing Budget Allocation (CEOCBA) method. We explore such an approach for both the standard and an extended CE method. The reason for including both is not to compare the performance of the two, but to provide two examples of how the efficiency of simulation-based CE methods can be enhanced via smarter computing budget allocation. In certain cases, the asymptotic allocation can be put in close correspondence to the asymptotic allocation derived in Chen et al. [2008] under a different setting of selecting the top-m subset.

In our approach, CEOCBA is used to allocate simulation replications just prior to the parameter-updating step at each iteration of the CE method. Numerical testing indicates that the resulting integrated procedure can lead to significant computational efficiency gains for both CE methods compared with cases without smarter computing budget allocation. Although the proposed approach is developed for the CE method in this article, the ideas of CEOCBA have the potential to be extended to other population-based evolutionary algorithms and distribution-based algorithms such as MRAS [Hu et al. 2006, 2007]. In addition, the COMPASS algorithm [Hong and Nelson 2006], as well as the nested partitions method [Shi and Olafsson 2000], when applied to the stochastic optimization setting, requires a simulation-allocation rule, and the approach in this article provides one way of specifying it. On the other hand, while the development is based on the notion of OCBA, other efficient computing budget allocation schemes such as the VIP procedure [Chick and Inoue 2001a, 2001b] or the myopic sequential sampling procedures [Chick et al. 2007, 2008] could also be extended to the budget allocation problems formulated in this article, and similar efficiency gains can be expected. Again, we reiterate that our main objective is not to find the best simulation optimization algorithm, but rather to demonstrate that the computational efficiency of population/distribution-based algorithms can be enhanced via smarter control of simulation budget allocation.

The rest of the article is organized as follows. In Section 2, we introduce the simulation optimization problem setting, and briefly summarize the CE


approach to finding a globally optimal solution. Section 3 formulates the CEOCBA problem corresponding to each iteration of the CE method, and in Section 4 an optimal allocation is found using the Karush-Kuhn-Tucker (KKT) conditions and asymptotic analysis, from which a heuristic allocation procedure is proposed for incorporation into the standard and an extended version of the CE method. Numerical experiments comparing the implementations with equal allocation are provided in Section 5. Section 6 gives some conclusions from the work.

2. SIMULATION OPTIMIZATION PROBLEM SETTING

We introduce the following notation.

χ : continuous search space.
X : a feasible solution in χ.
X_i : the ith feasible solution.
S(X) : the mean performance for the solution X.
ω_j(X) : the simulation noise at the jth simulation replication. We assume ω_j(X) is independently and normally distributed with zero mean and finite variance.
S_j(X) : the sampled performance of X estimated at the jth simulation replication, that is, S_j(X) = S(X) + ω_j(X) and E[S_j(X)] = S(X).
N : the number of simulation replications.
\bar{S}_N(X) : the sample mean after N simulation replications, that is, \bar{S}_N(X) = (1/N) \sum_{j=1}^{N} S_j(X).

The general optimization problem can be expressed as follows:

min_{X ∈ χ} S(X).   (1)

In the problem that we are interested in, the performance S(X) can only be estimated by running simulations and using the sample mean estimator \bar{S}_N(X). As N → ∞, \bar{S}_N(X) → S(X). While it is impossible to have an infinite number of simulation replications in practice, as N increases, \bar{S}_N(X) becomes a better estimator of S(X). There is a trade-off between devoting computational effort to searching or sampling the space χ for more candidate solutions X versus getting more accurate estimates of S(X) by taking more simulation replications at solution X. There is some literature discussing how to deal with this trade-off (e.g., see Lin and Lee [2006], Lee et al. [2006], Fu et al. [2006]). However, most discussions are quite broad and are not designed specifically for population-based or distribution-based methods. In this article, we focus on the framework of population-based and distribution-based methods and investigate how the simulation budget should be allocated more efficiently for one good exemplary method: the CE method. Although there are different versions of CE, the main objective of this article is to show that the efficiency of the CE method can be significantly enhanced via efficient simulation allocation, rather than to identify a best CE method.
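The trade-off above can be illustrated with a small sketch. The test function S(x) = x² and the function names below are our own illustrative assumptions, not from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_performance(x, noise_sd=1.0):
    """One noisy replication S_j(x) = S(x) + omega_j(x), for a
    hypothetical test problem with true mean performance S(x) = x**2."""
    return x**2 + rng.normal(0.0, noise_sd)

def sample_mean(x, n):
    """Sample-mean estimator bar-S_N(x) over n replications."""
    return np.mean([sample_performance(x) for _ in range(n)])

# More replications give a better estimate of S(x) at this point,
# but leave less budget to explore other candidate solutions.
est_few = sample_mean(2.0, n=5)     # noisy estimate of S(2) = 4
est_many = sample_mean(2.0, n=500)  # much tighter estimate of S(2) = 4
```

The standard error of the estimator shrinks as 1/√n, which is exactly why spending the whole budget at one point starves the search elsewhere.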

The normality assumption on ω_j(X) is typically satisfied in simulation, because the output is obtained from an average performance or batch means,


so that Central Limit Theorem effects usually hold [Bechhofer et al. 1995]. However, Glynn and Juneja [2004] show that assuming normality may result in a suboptimal computing budget allocation when the assumption is violated. Our numerical example in Section 5 shows that the proposed approach still works very well when the simulation noise follows a uniform distribution.

2.1 Cross-Entropy Method

The Cross-Entropy (CE) method was originally used for finding the optimal importance sampling measure in the context of estimating rare-event probabilities. It was later developed into a technique for solving optimization problems. In this section, we provide only a brief introduction to the CE method; see Rubinstein and Kroese [2004] for more technical details.

We begin with the deterministic optimization case. As mentioned earlier, the CE method works with a parameterized probability distribution. In every iteration of CE, we first generate a population of solutions from a probability density function (pdf) with a certain parameter. After all the solutions in this population have been evaluated, we select the elite solutions. These elite solutions are then used to update the parameters of this pdf, which in turn is used to generate the population for the next iteration.

In every iteration, the parameters of the distribution are updated by minimizing the Kullback-Leibler (KL) divergence (or the cross-entropy) between the parameterized pdf and the target optimal pdf. This is given by the following formula (see Rubinstein and Kroese [2004]):

v_{t+1} = \arg\max_v \sum_{i=1}^{k_t} I_{\{X_i ∈ Λ_t\}} \ln p(X_i, v),   (2)

where v_{t+1} is the parameter vector learned at iteration t, k_t is the number of solutions generated at iteration t, I is the indicator function, Λ_t is the elite set, and p(X_i, v) is the pdf of generating solution X_i when the parameter vector of the distribution is v. Eq. (2) can be interpreted as using the "elite" solutions to fit the sampling distribution by maximum likelihood estimation.

There are several ways to determine the elite set Λ_t. One common way is Λ_t = {X_i : S(X_i) < γ_t}, where S(X_i) is the performance of solution X_i and γ_t is the threshold determining the elite set. One can choose to have a fixed number of top solutions in the elite set Λ_t [Rubinstein and Kroese 2004], for example, the top-m solutions. Then γ_t can be set to (S(X_[m]) + S(X_[m+1]))/2, where S(X_[m]) is the performance of the solution ranked m. Alternatively, Hu et al. [2006] gradually tighten the threshold γ_t as the search proceeds (γ_t decreases as t increases in a minimization problem). In our numerical experiments, we take the first approach, choosing the top-m solutions as the elite set. However, our CEOCBA derivation applies to both cases when γ_t is given at each iteration.
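As a sketch of the top-m threshold choice described above (the function names and numbers are our own illustration):

```python
import numpy as np

def elite_threshold(sample_means, m):
    """gamma_t = (S_[m] + S_[m+1]) / 2 for a top-m elite set in a
    minimization problem, where S_[m] is the m-th smallest value."""
    s = np.sort(sample_means)
    return 0.5 * (s[m - 1] + s[m])

def elite_set(sample_means, m):
    """Indices of the solutions whose performance falls below gamma_t."""
    gamma = elite_threshold(sample_means, m)
    return np.flatnonzero(sample_means < gamma), gamma

means = np.array([3.1, 0.7, 2.4, 5.0, 1.2])
idx, gamma = elite_set(means, m=2)   # top-2 solutions: indices 1 and 4
```

Placing γ_t midway between the m-th and (m+1)-st order statistics makes the elite set exactly the top-m solutions.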

In the stochastic case, the parameter vector v_t is updated using Eq. (2), except that S(X_i) is replaced by its sample mean \bar{S}_{N_i}(X_i). To simplify the notation, we use \bar{S}_i to denote \bar{S}_{N_i}(X_i) in the remainder of the article. An extension or generalization of this "standard" CE method adds a weight function to (2), so


that, given γ_t at iteration t, the parameter is updated by solving Eq. (3):

v_{t+1} = \arg\max_v \sum_i w(\bar{S}_i) I_{\{\bar{S}_i < γ_t\}} \ln p(X_i, v).   (3)

In this article, we consider the weight function suggested in Hu et al. [2007], that is, w(z) = e^{−rz} (where r > 0 for minimization problems and r < 0 for maximization problems). Note that when the weight function w(\bar{S}_i) = 1, Eq. (3) reduces to standard CE.

To summarize, an algorithm for the CE method is as follows.

Algorithm. CE Method

INPUT: k_1, γ_1, pdf p, v_1.
INITIALIZATION: Initialize a parameterized pdf p(·, v_1). Set t = 1.
GENERATION: Randomly generate a set of solutions X_1, ..., X_{k_t} from p(·, v_t) at iteration t.
SIMULATION: Simulate N_i replications for each X_i to get objective function estimates S_1(X_i), S_2(X_i), ..., S_{N_i}(X_i). Compute the sample mean \bar{S}_i. The total number of replication runs at iteration t is T_t = \sum_{i=1}^{k_t} N_i.
UPDATING: Update the parameter vector v_{t+1} by solving (2) or (3).
STOP: If the stopping criteria are satisfied, then stop; otherwise set t = t + 1 and go back to the GENERATION step.
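The loop above can be sketched for a deterministic 1-D toy problem with a Gaussian sampling distribution. Everything here (test function, parameter values, the smoothing applied directly to μ and σ) is an illustrative assumption, not the article's experimental setup:

```python
import numpy as np

def ce_minimize(objective, mu=0.0, sigma=5.0, k=100, m=10,
                iters=30, alpha=0.7, seed=1):
    """Minimal standard-CE sketch (deterministic objective, 1-D Gaussian
    sampling distribution, top-m elite set, linear smoothing)."""
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        x = rng.normal(mu, sigma, size=k)            # GENERATION
        s = objective(x)                              # noise-free evaluation
        elite = x[np.argsort(s)[:m]]                  # top-m elite set
        mu_new, sig_new = elite.mean(), elite.std()   # UPDATING via MLE, Eq. (2)
        mu = alpha * mu_new + (1 - alpha) * mu        # linear smoothing
        sigma = alpha * sig_new + (1 - alpha) * sigma
    return mu

x_star = ce_minimize(lambda x: (x - 2.0) ** 2)        # minimizer is x = 2
```

The sampling distribution concentrates around the minimizer as the elite sets shrink, which is the qualitative behavior the article relies on.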

The choices of k_t and T_t will be discussed in the numerical experiments in Section 5. In the next two sections, we drop the iteration subscript on k, T, and γ to simplify notation. The parameterized distribution p(·, v) is generally chosen from the exponential family, although the derivation of the efficient computing budget allocation rule shown in the next section is independent of the choice of the distribution p. In the numerical experiments, we used the multivariate normal (Gaussian) distribution truncated to the feasible region of the problem. In this case, v_t = (μ_t, Σ_t) consists of a mean vector μ_t and a covariance matrix Σ_t. From Rubinstein [2004] and Hu et al. [2006], the parameter updates for standard CE (w(\bar{S}_i) = 1) are given as follows:

μ_t = (1/m) \sum_{i: \bar{S}_i < γ} X_i, and Σ_t = (1/m) \sum_{i: \bar{S}_i < γ} (X_i − μ_t)(X_i − μ_t)^T.

For extended CE,

μ_t = \frac{\sum_{i: \bar{S}_i < γ} e^{−r \bar{S}_i} X_i}{\sum_{i: \bar{S}_i < γ} e^{−r \bar{S}_i}} and Σ_t = \frac{\sum_{i: \bar{S}_i < γ} e^{−r \bar{S}_i} (X_i − μ_t)(X_i − μ_t)^T}{\sum_{i: \bar{S}_i < γ} e^{−r \bar{S}_i}}.
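A sketch of the extended-CE update formulas above for the multivariate Gaussian case (the array layout and the example numbers are our own assumptions):

```python
import numpy as np

def extended_ce_update(X, s_bar, gamma, r=0.1):
    """Weighted Gaussian parameter update of extended CE: mu and Sigma are
    the exp(-r * S_i)-weighted mean and covariance of the solutions whose
    sample mean falls below gamma.  X has shape (k, d)."""
    mask = s_bar < gamma
    Xe, w = X[mask], np.exp(-r * s_bar[mask])
    mu = (w[:, None] * Xe).sum(axis=0) / w.sum()
    D = Xe - mu
    Sigma = (w[:, None, None] * (D[:, :, None] * D[:, None, :])).sum(axis=0) / w.sum()
    return mu, Sigma

# With r = 0 the weights are all 1 and the update reduces to standard CE.
X = np.array([[1.0, 0.0], [3.0, 2.0], [0.0, 1.0], [9.0, 9.0]])
s = np.array([0.5, 0.8, 0.4, 7.0])
mu, Sigma = extended_ce_update(X, s, gamma=1.0, r=0.0)  # equal weights here
```

Setting r = 0 in the example makes both formulas coincide, which is a quick sanity check that the weighted update generalizes the standard one.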

In the updating step, one may apply a smoothing procedure to enhance numerical stability. In our numerical tests, we use the linear parameter smoothing


procedure provided by Rubinstein [1999] and Hu et al. [2006]. It is given by

v_{t+1} = α \tilde{v}_{t+1} + (1 − α) v_t,

where \tilde{v}_{t+1} is the original parameter estimate obtained from (2) or (3) and α ∈ (0, 1] is the smoothing parameter. Note that a large smoothing factor leads to fast convergence, while a small smoothing factor allows more exploration and hence increases the chance that the algorithm converges to the global optimum rather than prematurely converging to a local optimum [Rubinstein and Kroese 2004; Hu et al. 2006].

3. CEOCBA FORMULATION

In applying the CE method to stochastic problems, the accuracy of the simulation output has a significant impact on updating the parameter vector v. The traditional approach in the simulation step has been to allocate simulation replications equally to each candidate solution generated, that is, N_i = T_j/k_j for all i at iteration j (assuming k_j divides T_j). For this step, efficient ranking and selection procedures can be applied to enhance the computational efficiency of the CE algorithm: instead of simulating all solutions equally, a larger portion of the computing budget should be allocated to those candidate solutions that play a more important role in the updating step. As discussed shortly, we formulate an optimization problem for allocating the computing budget so that the expected mean-squared error of the CE weight function is minimized. Based on the solution to this optimization problem, we propose a new CE algorithm called CEOCBA.

The updating step given in Eq. (3) is crucial for the (extended) CE method, since the updated parameters ultimately guide where the new candidate solutions will be generated in the next iteration. In the stochastic case, the term w(\bar{S}_i)I_{\{\bar{S}_i < γ\}} in (3) depends on the sample mean \bar{S}_i and hence on the noise in the simulation output, so its variability is affected by the simulation budget allocated to X_i. The term ln p(X_i, v) depends on neither the simulation noise nor the simulation budget allocation. To reduce the potential impact of simulation error on the search process, we want to allocate the simulation replications so that the simulation estimate w(\bar{S}_i)I_{\{\bar{S}_i < γ\}} is close to the noise-free term w(S_i)I_{\{S_i < γ\}} for all candidate solutions. Ideally we want the difference between these two terms to be as small as possible, and so we choose to minimize the mean-squared error. The CEOCBA optimization problem can be defined as follows.

At each iteration, for a given computing budget T and a given γ, we would like to assign N_i simulation replications to solution X_i to minimize the expected mean-squared error:

min_{N_1, ..., N_k} E[ (1/k) \sum_{i=1}^{k} ( w(\bar{S}_i) I_{\{\bar{S}_i < γ\}} − w(S_i) I_{\{S_i < γ\}} )^2 ],

such that \sum_{i=1}^{k} N_i = T.   (4)
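The objective in (4) can be estimated by Monte Carlo for a fixed allocation. The following sketch assumes known means, variances, and γ, with our own illustrative numbers and the weight w(z) = e^{−rz}:

```python
import numpy as np

def empirical_wmse(S, sigma, N, gamma, r=0.1, reps=20000, seed=0):
    """Monte Carlo estimate of the expected mean-squared error in (4) for one
    allocation N = (N_1, ..., N_k), with bar-S_i ~ Normal(S_i, sigma_i^2 / N_i)
    and weight w(z) = exp(-r * z).  All quantities are assumed known."""
    rng = np.random.default_rng(seed)
    S, sigma, N = map(np.asarray, (S, sigma, N))
    s_bar = rng.normal(S, sigma / np.sqrt(N), size=(reps, len(S)))
    noisy = np.exp(-r * s_bar) * (s_bar < gamma)   # w(bar-S_i) I{bar-S_i < gamma}
    ideal = np.exp(-r * S) * (S < gamma)           # w(S_i) I{S_i < gamma}
    return np.mean((noisy - ideal) ** 2)

S, sigma, gamma = [0.0, 1.0, 3.0], [1.0, 1.0, 1.0], 2.0
mse_equal = empirical_wmse(S, sigma, [10, 10, 10], gamma)
mse_more  = empirical_wmse(S, sigma, [30, 30, 30], gamma)  # larger budget
```

Increasing every N_i shrinks the variance of each \bar{S}_i and hence the expected mean-squared error; the interesting question, answered in the next section, is how to split a fixed total T unevenly.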


In the next section, we derive an asymptotically optimal allocation for problem (4) and propose an easy-to-implement heuristic sequential allocation procedure.

4. ASSUMPTIONS AND DEVELOPMENT OF CEOCBA PROCEDURES

The CEOCBA problem given by (4) covers both the standard and extended versions of the CE method. The standard CE method essentially considers only the relative ordering of the candidate solutions, whereas the extended version takes the objective function values into account. Hence the derived computing budget allocations differ.

The CEOCBA formulation assumes that the noise in the objective function estimates is independently and normally distributed with zero mean, that is, ω(X_i) ∼ N(0, σ_i^2), so

S_j(X_i) ∼ N(S_i, σ_i^2) and \bar{S}_i ∼ N(S_i, σ_i^2 / N_i).

In the derivation of the asymptotic allocation rule, in addition to assuming a known γ, we also assume that the means S_i and the standard deviations σ_i are known.

Denote Φ(·) as the cdf of the standard normal distribution and f_{\bar{S}_i}(·) as the pdf of \bar{S}_i. The Lagrangian function for problem (4) can be defined as

F = \sum_{i=1}^{k} E_{\bar{S}_i}[ (w(\bar{S}_i) I_{\{\bar{S}_i < γ\}} − w(S_i) I_{\{S_i < γ\}})^2 ] − λ ( \sum_{i=1}^{k} N_i − T )

  = \sum_{i=1}^{k} \int_{−∞}^{∞} ( w(x) I_{\{x < γ\}} − w(S_i) I_{\{S_i < γ\}} )^2 f_{\bar{S}_i}(x) dx − λ ( \sum_{i=1}^{k} N_i − T ).   (5)

We then derive the asymptotic rule based on Eq. (5), assuming T → ∞ and treating N_i as continuous.

4.1 Standard CE

For the standard CE method, we have the following result.

THEOREM 1. An asymptotically (as T → ∞) globally optimal solution for the CEOCBA problem (4), where w(·) = 1, is given by

N_i / N_j = ( σ_i (γ − S_j) / ( σ_j (γ − S_i) ) )^2.   (6)

PROOF. See Appendix A.
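Normalizing the ratios in Eq. (6) so the replication counts sum to the budget T gives the allocation directly. The sketch below assumes known means, variances, and γ, with our own example numbers:

```python
import numpy as np

def ocba_ce_allocation(S, sigma, gamma, T):
    """Allocation implied by Eq. (6): the ratios N_i / N_j equal
    (sigma_i (gamma - S_j) / (sigma_j (gamma - S_i)))^2, i.e. N_i is
    proportional to (sigma_i / (gamma - S_i))^2, scaled to sum to T."""
    S, sigma = np.asarray(S, float), np.asarray(sigma, float)
    w = (sigma / (gamma - S)) ** 2   # relative allocation weights
    return T * w / w.sum()           # continuous values; round in practice

# Solutions whose mean is close to the threshold gamma receive the most budget.
N = ocba_ce_allocation(S=[0.0, 1.8, 5.0], sigma=[1.0, 1.0, 1.0], gamma=2.0, T=100)
```

Intuitively, a solution whose mean sits near γ is the one whose indicator I_{\{\bar{S}_i < γ\}} is most easily flipped by noise, so it warrants the most replications.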

The asymptotic allocation in Eq. (6) can be put in close correspondence to the asymptotic allocation derived in Chen et al. [2008]; that is, the allocations can be made equal by a special choice of the parameters. However, these allocations were derived under different settings using different objectives. Specifically, the asymptotic allocation in Chen et al. [2008] is obtained by maximizing the probability of correctly selecting the top-m candidates in a


pure ranking-and-selection setting, that is, outside the context of any particular optimization algorithm such as the CE method, whereas the asymptotic allocation in Theorem 1 is obtained by minimizing the expected mean-squared error of the CE weight function using the top-m candidate solutions.

The allocation rule in Theorem 1 is based on the assumption that the distribution of the simulation error is normal with known means and known variances and a given γ. As discussed earlier, there are many ways to choose γ. In this article, given the known means, we set γ equal to (S(X_[m]) + S(X_[m+1]))/2.

In practice, the means and variances are unknown, so we employ a heuristic approach. A sequential procedure is used to estimate these means and variances, as well as γ. Each candidate solution is initially simulated with n_0 replications in the first stage, and additional replications are allocated to individual solutions incrementally, Δ replications in each subsequent stage, until the simulation budget T is exhausted. When T → ∞, since every solution is allocated simulation runs infinitely often, we can show that the allocation rule using the sample averages and sample standard deviations converges to the rule using the true means and true variances. Even though these sample statistics and γ vary over iterations, the impact of these approximations decays asymptotically. In summary, we have the following heuristic algorithm.

Algorithm. CEOCBA Procedure

INITIALIZE  l ← 0; perform n0 simulation replications for all k candidate solutions; N_1^l = N_2^l = · · · = N_k^l = n0; T^l = k·n0.
LOOP WHILE T^l < T DO
  UPDATE  Calculate sample means S̄_i, sample variances σ_i² ≡ (1/(N_i^l − 1)) Σ_{j=1}^{N_i^l} (S_j(X_i) − S̄_i)², i = 1, . . . , k, and γ = (S̄_[m] + S̄_[m+1])/2 based on the simulation outputs, where S̄_[m] denotes the m-th order statistic.
  ALLOCATE  T^{l+1} = T^l + Δ and calculate the new budget allocation (N_1^{l+1}, N_2^{l+1}, . . . , N_k^{l+1}) which (approximately) satisfies N_i^{l+1}/N_j^{l+1} = (σ_i(γ − S̄_j)/(σ_j(γ − S̄_i)))², Σ_{i=1}^k N_i^{l+1} = T^{l+1}, and Σ_{i=1}^k (N_i^{l+1} − N_i^l)^+ = Δ, where y^+ = max(y, 0).
  SIMULATE  Perform an additional (N_i^{l+1} − N_i^l)^+ replications for candidate solution i, i = 1, . . . , k; l ← l + 1.
END OF LOOP
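To make the ALLOCATE step concrete, the following is a minimal illustrative sketch in Python. It is our own simplification, not the exact implementation used in the experiments: the function name `ceocba_step` and the proportional rescaling of the positive increments are assumptions.

```python
import numpy as np

def ceocba_step(means, stds, N, gamma, delta):
    """One ALLOCATE step of the sequential CEOCBA heuristic (sketch).

    means, stds : current sample means and sample standard deviations
    N           : current replication counts N^l
    gamma       : threshold separating elite and nonelite solutions
    delta       : additional budget for this stage
    """
    # Theorem 1 ratio: N_i proportional to (sigma_i / (gamma - S_i))^2.
    # gamma is the midpoint of two order statistics, so gamma != means[i].
    weights = (stds / (gamma - means)) ** 2
    T_next = N.sum() + delta
    target = T_next * weights / weights.sum()
    # Only award positive increments; oversimulated solutions get nothing.
    extra = np.maximum(target - N, 0.0)
    if extra.sum() > 0:
        extra = delta * extra / extra.sum()  # rescale so increments sum to delta
    return N + np.floor(extra + 0.5)         # round to nearest integer
```

For example, with equal standard deviations, the candidate whose sample mean lies closest to γ receives most of the new budget, since its allocation weight (σ/(γ − S̄))² is largest.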

Remark 1. The resulting N_i in the ALLOCATE step is a continuous number that must be rounded to an integer. In the numerical experiments in the next section, N_i is rounded to the nearest integer such that the summation of additional simulation replications over all solutions equals Δ. This is simply for ease of computing budget management in numerical testing, providing a fair comparison with other allocation procedures. Note that there may not always exist a solution that satisfies all three constraints. This occurs when at least one solution has been oversimulated, that is, N_i^{l+1} < N_i^l. In this case, we have to relax the constraint. For ease of control of the simulation experiment, we choose to maintain the constraint Σ_{i=1}^k N_i^{l+1} = T^{l+1} and apply a heuristic to round N_i^{l+1} for all i to the nearest integers. We have found numerically that the performance is not sensitive to how we round N_i, probably due to the robustness of the sequential procedure.
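One concrete way to realize the rounding mentioned in Remark 1 is largest-remainder rounding. The helper below is a hypothetical sketch of such a heuristic; the function name and the tie-breaking rule are our own choices, not necessarily those used in the experiments.

```python
def round_preserving_sum(extras, delta):
    """Round continuous extra allocations (summing to delta) to integers
    whose sum is exactly delta, via largest-remainder rounding."""
    floors = [int(x) for x in extras]          # truncate each allocation
    shortfall = delta - sum(floors)            # replications still unassigned
    # Give one more replication to the entries with the largest fractional parts.
    order = sorted(range(len(extras)),
                   key=lambda i: extras[i] - floors[i], reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return floors
```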

Remark 2. The allocation given in Theorem 1 assumes known means and variances. The preceding sequential algorithm estimates these quantities using the updated sample means and sample variances. As more simulation replications are iteratively allocated to each solution, the estimation improves. A good selection of the parameters n0 and Δ is problem specific (see Chen et al. [2000, 2008] for more extensive discussions). In general, to avoid poor estimation at the beginning, n0 should not be too small. Also, it is wise to avoid a large Δ to prevent a poor allocation before a correction can be made in the next iteration, which is particularly important in the early stages. Our numerical testing indicates that the performance of the proposed procedure is not sensitive to the choice of n0 and Δ if these guidelines are followed, and the impact of approximating the mean and variance by the sample mean and sample variance is not significant.

4.2 Extended CE

In this section, we present the asymptotically optimal allocation for extended CE under the exponential weight function w(x) = e^{−rx}, where r > 0 for minimization problems and r < 0 for maximization problems. The approach can be extended to other classes of functions.

THEOREM 2. If w(x) = e^{−rx} (r > 0 for minimization problems and r < 0 for maximization problems), an asymptotically (as T → ∞) globally optimal solution for CEOCBA problem (4) is given by

—N_i/N_j = (e^{−r·S_i} σ_i)/(e^{−r·S_j} σ_j), i, j ∈ {q : S_q < γ, q = 1, . . . , k},   (7)

—N_i/N_j = (σ_i(γ − S_j)/(σ_j(γ − S_i)))², i, j ∉ {q : S_q < γ, q = 1, . . . , k},   (8)

—(Σ_{i : S_i < γ} N_i)/(Σ_{j : S_j ≥ γ} N_j) ∼ O(e^{bT}/T²),   (9)

where b is a positive constant.

PROOF. See Appendix B.

Note that r ≠ 0 and (9) imply that the total budget allocated to the solutions in the elite set will be exponentially larger than the total budget allocated to the nonelite set. In addition, the total amount of budget allocated to the nonelite set will approach infinity as the total budget goes to infinity.


For the extended version, the CEOCBA allocation procedure is the same as that presented in Section 4.1, except that Eqs. (7) through (9) are used in place of Eq. (6) in the ALLOCATE step of the algorithm.

ALLOCATE  T^{l+1} = T^l + Δ and calculate the new budget allocation (N_1^{l+1}, N_2^{l+1}, . . . , N_k^{l+1}) such that
  (i) for i, j ∈ {q : S̄_q < γ, q = 1, . . . , k}: N_i^{l+1}/N_j^{l+1} = (e^{−r·S̄_i} σ_i)/(e^{−r·S̄_j} σ_j);
  (ii) for i, j ∉ {q : S̄_q < γ, q = 1, . . . , k}: N_i^{l+1}/N_j^{l+1} = (σ_i(γ − S̄_j)/(σ_j(γ − S̄_i)))²;
  (iii) (Σ_{i : S̄_i < γ} N_i^{l+1})/(Σ_{j : S̄_j ≥ γ} N_j^{l+1}) = e^{b·T^{l+1}}/(T^{l+1})²;
with Σ_{i=1}^k N_i^{l+1} = T^{l+1} and Σ_{i=1}^k (N_i^{l+1} − N_i^l)^+ = Δ.

The extended CE version introduces a new parameter b. For implementation, we recommend 0 < b < 0.1 to avoid poor estimation of the nonelite set. A value of b chosen too small may result in spending too much computing budget on the nonelite set, whereas a value chosen too large may have a negative impact on determination of the elite set. In our numerical experiments, we set b = 0.01.
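As an illustration, the extended allocation of Eqs. (7) through (9) can be sketched in Python as follows. The function name and the normalization into per-solution fractions of the stage budget are our own devices; this is a sketch under the stated assumptions, not the exact implementation used in the experiments.

```python
import math

def extended_weights(means, stds, gamma, r, b, T):
    """Per-solution budget fractions implied by Eqs. (7)-(9) (sketch).

    Within the elite set {i : mean_i < gamma}, N_i is proportional to
    exp(-r * mean_i) * sigma_i; within the nonelite set, N_i is
    proportional to (sigma_i / (gamma - mean_i))**2; the totals of the
    two groups are split in the ratio exp(b*T) : T**2.
    Assumes both groups are nonempty.
    """
    k = len(means)
    elite = {i for i in range(k) if means[i] < gamma}
    w = [0.0] * k
    for i in range(k):
        if i in elite:
            w[i] = math.exp(-r * means[i]) * stds[i]
        else:
            w[i] = (stds[i] / (gamma - means[i])) ** 2
    elite_total = sum(w[i] for i in elite)
    non_total = sum(w[i] for i in range(k) if i not in elite)
    ratio = math.exp(b * T) / T ** 2          # elite total : nonelite total
    elite_share = ratio / (1.0 + ratio)
    return [elite_share * w[i] / elite_total if i in elite
            else (1.0 - elite_share) * w[i] / non_total
            for i in range(k)]
```

As T grows, `elite_share` approaches 1, consistent with the elite set eventually receiving an exponentially larger share of the budget.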

5. NUMERICAL EXPERIMENTS

In this section, we compare the computational performance of the following four algorithms on several examples.

—Standard CE with Equal Allocation;
—Standard version of the CEOCBA procedure in Section 4.1;
—Extended CE with Equal Allocation; and
—Extended version of the CEOCBA procedure in Section 4.2.

5.1 Algorithm Parameter Setting

The algorithm includes several parameters whose values must be chosen ahead of time. Some have been well studied in the CE literature (refer to Rubinstein and Kroese [2004] and Hu et al. [2006]), while others are explored in the ranking-and-selection or OCBA literature (refer to Chen et al. [2000, 2008]). In general, a good setting of these parameters is problem specific. As summarized in the following table, we give the values used in our numerical examples and offer some general suggestions for typical problems; further details can be found in the corresponding references.


Table I. Summary of Major Parameters Used in the CEOCBA Algorithms

Parameter | Chosen Value | General Comments
k_t | k_t = ⌈1.04 · k_{t−1}⌉, with k_1 = 500 | Preferably nondecreasing in the iteration number. Should not be too small for high-dimensional problems.
T_t | T_t = ⌈1.1 · T_{t−1}⌉, with T_1 = 10000 | Generally nondecreasing in the iteration number to ensure convergence. T should be bigger than n0 · k.
m | 10% of k | Small m (relative to k) may result in premature convergence to local optima, while large m may cause slower convergence.
α | 0.5 | α ∈ (0, 1]. α = 1 implies no smoothing, which may cause numerical instability. Small α generally results in slower convergence.
γ | (S(X_[m]) + S(X_[m+1]))/2 | In the CE literature, fixing the size of the elite set (m) is a common choice. From the computing budget allocation perspective, we do not want γ to be too close to either S(x_[m]) or S(x_[m+1]), which can cause allocation instability.
n0 | 10 | n0 should be between 5 and 20; performance is not sensitive within this range. If n0 is too small, numerical instability may arise at the beginning of the simulation due to poor estimates of the sample statistics.
Δ | 100 | Δ should not be too big in the sequential procedure. Not sensitive.
b | 0.01 | See Section 4.2.

5.2 Numerical Experiments

The following four well-known continuous deterministic optimization functions were tested. Graphs of the two-dimensional versions are shown in Figures 1 through 4. For the stochastic portion, the standard deviation of the noise was set at σ_i = 10 in all four problems, that is, S(X_i) ∼ N(S_i, 10²).

For initialization at t = 1, μ_t is randomly generated using a uniform distribution over the entire feasible region, which is bounded. Σ_t is initially set to 100I, where I is the identity matrix (i.e., the variances and covariances are 100 and 0, respectively).
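The initialization just described amounts to only a few lines. The sketch below (with hypothetical helper names) also shows the smoothed parameter update with factor α used by the CE iterations:

```python
import numpy as np

def init_ce(dim, lower, upper, rng):
    """Initialize the CE sampling distribution as described above:
    mu uniform over the bounded feasible region, covariance 100 * I."""
    mu = rng.uniform(lower, upper, size=dim)
    sigma = 100.0 * np.eye(dim)   # variances 100, covariances 0
    return mu, sigma

def smooth_update(old, new, alpha=0.5):
    """Smoothed CE parameter update; alpha = 1 would mean no smoothing."""
    return alpha * new + (1.0 - alpha) * old
```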

(1) Goldstein-Price function: (2D)

S(X) = [1 + (x1 + x2 + 1)²(19 − 14x1 + 3x1² − 14x2 + 6x1x2 + 3x2²)] × [30 + (2x1 − 3x2)²(18 − 32x1 + 12x1² + 48x2 − 36x1x2 + 27x2²)]

where X = (x1, x2), −3 ≤ xi ≤ 3, i = 1, 2.

The unique global minimum is X* = (0, −1), with minimum value S(X*) = 3. There are also four local minima in the given feasible region, and the region is relatively flat around these minima.
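As a sanity check on the formula, a direct implementation (the function name is our own) reproduces the stated minimum value:

```python
def goldstein_price(x1, x2):
    """2D Goldstein-Price function; global minimum S(0, -1) = 3."""
    a = 1 + (x1 + x2 + 1) ** 2 * (19 - 14 * x1 + 3 * x1 ** 2
                                  - 14 * x2 + 6 * x1 * x2 + 3 * x2 ** 2)
    b = 30 + (2 * x1 - 3 * x2) ** 2 * (18 - 32 * x1 + 12 * x1 ** 2
                                       + 48 * x2 - 36 * x1 * x2 + 27 * x2 ** 2)
    return a * b
```

Evaluating at the global minimizer gives goldstein_price(0, -1) = 3, and nearby points give strictly larger values.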

(2) Rosenbrock function: (5D)

S(X) = Σ_{i=1}^{4} [100(x_{i+1} − x_i²)² + (x_i − 1)²] + 1

where X = (x1, . . . , x5), −10 ≤ xi ≤ 10, i = 1, . . . , 5.


Fig. 1. 2D Goldstein-Price function.

Fig. 2. 2D Rosenbrock function.

Fig. 3. 2D Griewank function.

The unique global minimum is X* = (1, 1, 1, 1, 1), with minimum value S(X*) = 1. This function is famous for its banana-shaped valley, which makes finding the minimum particularly difficult.

(3) Griewank function: (2D)

S(X) = (1/40)(x1² + x2²) − cos(x1) cos(x2/√2) + 2

where X = (x1, x2), −10 ≤ xi ≤ 10, i = 1, 2.


Fig. 4. 2D Pinter function.

The unique global minimum is X* = (0, 0), with minimum value S(X*) = 1. This is a highly symmetric function with many local minima.

(4) Pinter function: (5D)

S(X) = Σ_{i=1}^{5} i·x_i² + Σ_{i=1}^{5} 20i·sin²(x_{i−1} sin x_i − x_i + sin x_{i+1}) + Σ_{i=1}^{5} i·log10(1 + i(x_{i−1}² − 2x_i + 3x_{i+1} − cos x_i + 1)²) + 1

where X = (x1, . . . , x5), with the conventions x0 = x5 and x6 = x1, −10 ≤ xi ≤ 10, i = 1, . . . , 5.

The unique global minimum is X* = (0, 0, . . . , 0), with minimum value S(X*) = 1. Again, this function has many local minima.

The results are given in Figures 5 through 8, where the horizontal axis gives the total computing budget (number of simulation replications) and the vertical axis gives the average of S(X*) over 100 independent macroreplication experiments, where X* is the best solution found thus far based on sample averages of the simulated objective function. The solid lines indicate the performance using standard CE, whereas dashed lines show the performance using extended CE. Thick lines indicate the performance with integration of OCBA, whereas thin lines show the performance without OCBA.

Note that the numerical setting is favorable for equal allocation, since the objective function estimation variances are equal across the entire domain in all of the examples. Nonetheless, the CE procedure that integrates OCBA outperforms its equal-allocation counterpart in all cases. On the other hand, extended CE performs slightly better than standard CE in some but not all examples.

Fig. 5. Numerical results for 2D Goldstein-Price function.

Fig. 6. Numerical results for 5D Rosenbrock function.

(5) Modified Griewank function: (2D)

In this experiment, we consider a different case in which we increase the difference between the global minimum and the local minima. In particular, 1 − e^{−(x1² + x2²)} is added to the 2D Griewank function of experiment 3, leading to a function whose global minimum is still located at (0, 0), as in the original Griewank function. Note that 1 − e^{−(x1² + x2²)} quickly increases from 0 to 1 as we move away from the global minimum (0, 0). Thus the entire function is increased by 1 except in the small area near the global minimum. The modified function is

S(X) = (1/40)(x1² + x2²) − cos(x1) cos(x2/√2) − e^{−(x1² + x2²)} + 3,

where X = (x1, x2), −10 ≤ xi ≤ 10, i = 1, 2. Its graph is shown in Figure 9.

The numerical results are shown in Figure 10. The efficiency gain using

CEOCBA is more significant in this case, because when the difference between the global minimum value and the other local minima values increases, the differences in simulation allocations among different candidate solutions also increase, so the optimal allocation is even farther from equal allocation, allowing CEOCBA to achieve better efficiency.

Fig. 7. Numerical results for 2D Griewank function.

Fig. 8. Numerical results for 5D Pinter function.
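For reference, direct implementations of the two Griewank variants (the function names are our own) confirm that both are minimized at (0, 0) with value 1 and that the modification raises the surface by exactly 1 − e^{−(x1² + x2²)}:

```python
import math

def griewank(x1, x2):
    """2D Griewank variant of experiment 3; global minimum S(0, 0) = 1."""
    return (x1 ** 2 + x2 ** 2) / 40 - math.cos(x1) * math.cos(x2 / math.sqrt(2)) + 2

def modified_griewank(x1, x2):
    """Experiment 5 variant; still minimized at (0, 0) with value 1."""
    return (x1 ** 2 + x2 ** 2) / 40 - math.cos(x1) * math.cos(x2 / math.sqrt(2)) \
           - math.exp(-(x1 ** 2 + x2 ** 2)) + 3
```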

Fig. 9. Modified 2D Griewank function.

Fig. 10. Numerical results for modified 2D Griewank function.

(6) Griewank function with non-Gaussian noise.

Normality is assumed in the derivation of the CEOCBA allocation. To test the robustness of the allocation, we consider a nonnormal distribution, using the same Griewank function as in experiment 3 except that the simulation noise is changed to the uniform distribution U(−17.32, 17.32), which has the same mean and variance as the normal noise in experiment 3. Figure 11 contains the simulation results for the four allocation procedures. We can see that the relative performances of the different procedures are very similar to the results in experiment 3. The CE procedure that integrates OCBA outperforms its equal-allocation counterpart, as both standard CE and extended CE with OCBA are almost three times faster than their counterparts without OCBA.
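The uniform noise bounds are not arbitrary: a U(−c, c) distribution has mean 0 and variance c²/3, so c = 10√3 ≈ 17.32 matches the N(0, 10²) noise of experiment 3. A quick check:

```python
import math

c = 17.32                 # half-width of the uniform noise U(-c, c)
var = c ** 2 / 3          # variance of U(-c, c)
# c was chosen as 10 * sqrt(3), so var is approximately 100,
# matching the N(0, 10^2) noise of experiment 3.
```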


Fig. 11. Numerical results for 2D Griewank function with uniform distributed additive noise.

Fig. 12. Numerical results for 50D Griewank function.

(7) Griewank function: (50D)

We increase the dimension in experiment 3 from 2 to 50. Since this is a harder problem, we increase the initial parameters to k1 = 1000 and T1 = 20000. Other parameters are unchanged. The results are given in Figure 12. Again, CEOCBA outperforms CE without OCBA.


6. CONCLUSIONS

The CE method has been shown to be promising in solving difficult global optimization problems, but its main focus has been on deterministic optimization problems. For the stochastic setting, there has been no work on the allocation of simulation replications, which in real applications is likely to be the dominant computational cost. In this article, we have presented a method that can improve the efficiency of the CE method when applied to simulation optimization problems. The integrated procedure combines the objective of minimizing the KL divergence from a parameterized distribution that generates the candidate solutions in the CE method with that of minimizing the total computing budget per iteration. Numerical testing on the standard and extended CE methods indicates that the new integrated approach is quite promising, resulting in substantial computational efficiency gains over the CE method with equal allocation. However, we should point out that the CEOCBA approach cannot improve the search portion of the algorithm, so for a problem whose deterministic form poses difficulties for the search procedure, integrating the OCBA approach is not likely to help in the stochastic version. The development of an effective approach that can both optimize the simulation budget allocation at each iteration and improve the CE search efficiency remains an open question. The research issues include how to utilize the sample information in setting γ, T, and k dynamically.

REFERENCES

ALLON, G., KROESE, D., RAVIV, T., AND RUBINSTEIN, R. 2005. Application of the cross-entropy method to the buffer allocation problem in a simulation-based environment. Ann. Oper. Res. 134, 137–151.

ANDRADOTTIR, S. 1998. Simulation optimization. In Handbook of Simulation: Principles, Methodology, Advances, Applications, and Practice, J. Banks, Ed., John Wiley and Sons, New York, Chapter 9.

ANDRADOTTIR, S. 2006. An overview of simulation optimization with random search. In Handbooks in Operations Research and Management Science: Simulation, S. G. Henderson and B. L. Nelson, Eds., Elsevier, Chapter 20, 617–632.

BARTON, R. R. AND MECKESHEIMER, M. 2006. Metamodel-based simulation optimization. In Handbooks in Operations Research and Management Science: Simulation, S. G. Henderson and B. L. Nelson, Eds., Elsevier, Chapter 18, 535–574.

BECHHOFER, R. E., SANTNER, T. J., AND GOLDSMAN, D. M. 1995. Design and Analysis of Experiments for Statistical Selection, Screening, and Multiple Comparisons. John Wiley and Sons.

BOESEL, J., NELSON, B. L., AND KIM, S.-H. 2003. Using ranking and selection to 'clean up' after simulation optimization. Oper. Res. 51, 814–825.

BRANKE, J., CHICK, S. E., AND SCHMIDT, C. 2007. Selecting a selection procedure. Manag. Sci. 53, 12, 1916–1932.

CHEN, H. C., CHEN, C. H., DAI, L., AND YUCESAN, E. 1997. New development of optimal computing budget allocation for discrete event simulation. In Proceedings of the Winter Simulation Conference. 334–341.

CHEN, C.-H., LIN, J., YUCESAN, E., AND CHICK, S. E. 2000. Simulation budget allocation for further enhancing the efficiency of ordinal optimization. Discr. Event Dynam. Syst. 10, 3, 251–270.

CHEN, C. H., HE, D., FU, M. C., AND LEE, L. H. 2008. Efficient simulation budget allocation for selecting an optimal subset. INFORMS J. Comput. 20, 579–595.

CHEW, E. P., LEE, L. H., TENG, S. Y., AND KOH, C. H. 2008. Differentiated service inventory optimization using nested partitions and MOCBA. Comput. Oper. Res. To appear.

CHEPURI, K. AND HOMEM-DE-MELLO, T. 2005. Solving the vehicle routing problem with stochastic demands using the cross entropy method. Ann. Oper. Res. 134, 153–181.

CHICK, S., BRANKE, J., AND SCHMIDT, C. 2007. New greedy myopic and existing asymptotic sequential selection procedures: Preliminary empirical results. In Proceedings of the Winter Simulation Conference. 289–296.

CHICK, S., BRANKE, J., AND SCHMIDT, C. 2008. New myopic sequential sampling procedures. INFORMS J. Comput. To appear.

CHICK, S. AND INOUE, K. 2001a. New two-stage and sequential procedures for selecting the best simulated system. Oper. Res. 49, 1609–1624.

CHICK, S. AND INOUE, K. 2001b. New procedures to select the best simulated system using common random numbers. Manag. Sci. 47, 8, 1133–1149.

FU, M. C. 1994. Optimization via simulation: A review. Ann. Oper. Res. 53, 199–248.

FU, M. C. 2002. Optimization for simulation: Theory vs. practice. INFORMS J. Comput. 14, 3, 192–215.

FU, M. C., HU, J., AND MARCUS, S. I. 2006. Model-based randomized methods for global optimization. In Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems. 355–363.

FU, M., CHEN, C. H., AND SHI, L. 2008. Some topics for simulation optimization. In Proceedings of the Winter Simulation Conference. 27–38.

FU, M. C., HU, J. Q., CHEN, C. H., AND XIONG, X. 2007. Simulation allocation for determining the best design in the presence of correlated sampling. INFORMS J. Comput. 19, 1, 101–111.

GLYNN, P. W. AND JUNEJA, S. 2004. A large deviations perspective on ordinal optimization. In Proceedings of the Winter Simulation Conference. IEEE Press, 577–585.

HE, D., CHICK, S. E., AND CHEN, C. H. 2007. The opportunity cost and OCBA selection procedures in ordinal optimization. IEEE Trans. Syst., Man, Cybernet. C 37, 5, 951–961.

HONG, L. J. AND NELSON, B. L. 2006. Discrete optimization via simulation using COMPASS. Oper. Res. 54, 115–129.

HU, J., FU, M. C., AND MARCUS, S. I. 2006. A model reference adaptive search algorithm for stochastic global optimization. Working paper. http://www.rhsmith.umd.edu/faculty/mfu/fu files/HFM06b.pdf.

HU, J., FU, M. C., AND MARCUS, S. I. 2007. A model reference adaptive search algorithm for global optimization. Oper. Res. 55, 3, 549–568.

KOTECHA, J. H. AND DJURIC, P. M. 1999. Gibbs sampling approach for generation of truncated multivariate Gaussian random variables. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). IEEE Computer Society, 1757–1760.

KROESE, D. P. AND HUI, K.-P. 2006. Applications of the cross-entropy method in reliability. In Computational Intelligence in Reliability Engineering, G. Levitin, Ed., Springer, Chapter 3.

KIM, S.-H. AND NELSON, B. L. 2006. Selecting the best system. In Handbooks in Operations Research and Management Science: Simulation, S. G. Henderson and B. L. Nelson, Eds., Elsevier, Chapter 18.

LARRANAGA, P. AND LOZANO, J. A. 2001. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Springer.

LAW, A. M. AND KELTON, D. M. 1999. Simulation Modeling and Analysis, 3rd Ed. McGraw-Hill Higher Education.

LIN, X. AND LEE, L. H. 2006. A new approach to discrete stochastic optimization problems. Eur. J. Oper. Res. 172, 761–782.

LEE, L. H., CHEW, E. P., AND MANIKAM, P. 2006. A general framework on the simulation-based optimization under fixed computing budget. Eur. J. Oper. Res. 174, 1828–1841.

NELSON, B., SWANN, J., GOLDSMAN, D., AND SONG, W. 2001. Simple procedures for selecting the best simulated system when the number of alternatives is large. Oper. Res. 49, 6, 950–963.

OLAFSSON, S. 2006. Metaheuristics. In Handbooks in Operations Research and Management Science: Simulation, S. G. Henderson and B. L. Nelson, Eds., Elsevier, Chapter 21, 633–654.

RUBINSTEIN, R. Y. 1999. The cross-entropy method for combinatorial and continuous optimization. Method. Comput. Appl. Probab. 2, 127–190.

RUBINSTEIN, R. Y. AND SHAPIRO, A. 1993. Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization via the Score Function Method. John Wiley and Sons, New York.

RUBINSTEIN, R. Y. AND KROESE, D. P. 2004. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte Carlo Simulation and Machine Learning. Springer.

RUDLOF, S. AND KOPPEN, M. 1996. Stochastic hill climbing with learning by vectors of normal distributions. In Proceedings of the 1st Online Workshop on Soft Computing. 60–70.

SHI, L. AND OLAFSSON, S. 2000. Nested partitions method for global optimization. Oper. Res. 48, 390–407.

SPALL, J. C. 2003. Introduction to Stochastic Search and Optimization. John Wiley and Sons, New York.

SWISHER, J. R., JACOBSON, S. H., AND YUCESAN, E. 2003. Discrete-event simulation optimization using ranking, selection, and multiple comparison procedures: A survey. ACM Trans. Model. Comput. Simul. 13, 134–154.

Received November 2007; revised January 2009; accepted March 2009
