
Selection Procedures for Simulations with Multiple Constraints under Independent and Correlated Sampling

CHRISTOPHER HEALEY, Schneider Electric
SIGRUN ANDRADOTTIR and SEONG-HEE KIM, Georgia Institute of Technology

We consider the problem of selecting the best feasible system with constraints on multiple secondary performance measures. We develop fully sequential indifference-zone procedures to solve this problem that guarantee a nominal probability of correct selection. In addition, we address two issues critical to the efficiency of these procedures: namely, the allocation of error between feasibility determination and selection of the best system, and the use of Common Random Numbers (CRN). We provide a recommended error allocation as a function of the number of constraints, supported by an experimental study and an approximate asymptotic analysis. The validity and efficiency of the new procedures with independent sampling and CRN are demonstrated through both analytical and experimental results.

Categories and Subject Descriptors: I.6 [Simulation and Modeling]; I.6.6 [Simulation Output Analysis]; I.6.8 [Types of Simulation]

General Terms: Simulation, Ranking and Selection

Additional Key Words and Phrases: Constraints, common random numbers, fully sequential algorithms, multiple performance measures

ACM Reference Format:
Christopher Healey, Sigrun Andradottir, and Seong-Hee Kim. 2014. Selection procedures for simulations with multiple constraints under independent and correlated sampling. ACM Trans. Model. Comput. Simul. 24, 3, Article 14 (March 2014), 25 pages.
DOI: http://dx.doi.org/10.1145/2567921

1. INTRODUCTION

Evaluating the performance of stochastic systems can be difficult, requiring simulation of the systems and analysis of the resulting outputs. Ranking and Selection (R&S) is a statistical tool for identifying the system with the largest or smallest mean performance measure out of a number of alternatives. R&S procedures must be constructed with a focus on both efficiency (the number of observations required to find the best system), as sampling may be expensive, as well as validity (the probability of correctly selecting the best system, PCS), as outputs are stochastic.

Procedures for R&S take a few different forms, namely indifference-zone (IZ) approaches [Dudewicz and Dalal 1975; Rinott 1978; Kim and Nelson 2001, 2006], Bayesian approaches [Chick 2006; Frazier and Kazachkov 2011], and optimal computing budget allocation (OCBA) methods [Chen 1996; Chen et al. 2000]. We are particularly interested in fully sequential procedures implementing the IZ approach as introduced in, for example, Paulson [1964], Hartmann [1991], and Kim and Nelson [2001]. Such procedures guarantee a desired level of PCS, while accomplishing efficiency through screening performed after each stage of sampling, which can involve as little as one additional data point for every system in contention.

This research was supported by NSF Grants CMMI-0400260 and CMMI-0644837. The second author was also supported by NSF Grant CMMI-0856600.
Authors’ addresses: C. Healey, 85 Rangeway Rd., North Billerica, MA, Schneider Electric; S. Andradottir and S.-H. Kim, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected]
© 2014 ACM 1049-3301/2014/03-ART14 $15.00

ACM Transactions on Modeling and Computer Simulation, Vol. 24, No. 3, Article 14, Publication date: March 2014.

There has been recent interest in constrained R&S procedures for selecting the best system that also satisfies constraints on one or more secondary performance measures. Santner and Tamhane [1984] introduced a procedure to select the best system under a constraint on variance. Morrice and Butler [2006] utilized multiple attribute utility theory to develop a two-stage procedure to select the best system with constraints, whereas Frazier and Kazachkov [2011] model this problem in a Bayesian context and Frazier et al. [2011] allowed for the use of common random numbers (CRN). Pujowidianto et al. [2009], Hunter et al. [2011], and Hunter and Pasupathy [2013] proposed other procedures for constrained R&S under multiple constraints within the OCBA approach, and Kabirian and Olafsson [2009] suggested a heuristic IZ approach for the selection of the best system while considering the probability that several stochastic constraints are feasible. In addition, a Pareto approach for addressing multiple performance measures is proposed by Lee et al. [2010].

This article is most closely related to the work of Andradottir and Kim [2010] and Healey et al. [2013]. Andradottir and Kim [2010] introduced a fully sequential, IZ framework for constrained R&S consisting of two phases: feasibility check and selection of the best (comparison). These phases may be addressed either sequentially (the feasibility of each system is determined before comparison begins) or simultaneously (the feasibility check and comparison screening occur simultaneously after each additional sample). Andradottir and Kim [2010] and Healey et al. [2013] proposed and analyzed several fully sequential IZ R&S procedures within this framework for independent systems with one constraint. For procedures that try to minimize switching due to the possibly high cost of stopping and restarting complex simulations, see Healey et al. [2014].

In this article, we elaborate on the framework of Andradottir and Kim [2010] and extend fully sequential procedures to select the best system under any number of constraints and possible correlation across systems. This is a substantial extension of previous research that has only addressed independent systems and one constraint. We show how to bring valid feasibility check techniques for multiple constraints [Szechtman and Yucesan 2008; Batur and Kim 2010] and valid comparison techniques together to achieve statistically valid R&S procedures for multiple constraints.

R&S procedures should not allow the handling of multiple constraints to shift emphasis unduly toward feasibility verification. Thus, we consider how error should be allocated between the feasibility check and comparison phases of the procedures. With the support of experimental and approximate asymptotic results, we devise general, robust, and efficient error allocation rules as functions of the number of constraints for both simultaneously running and sequentially running constrained selection procedures.

One topic of interest is the impact of multiple constraints on computational efficiency. Valid procedures for constrained R&S may require more observations to select the best feasible system than standard R&S due to a lengthy feasibility verification and the splitting of error between feasibility check and comparison. But within constrained R&S, there has been no study that we know of concerning the difficulty of satisfying multiple constraints within valid procedures (see, however, Kabirian and Olafsson [2009] for related results). For example, what is the difference in the number of samples needed to find the best feasible system under one constraint versus five constraints? We conduct an experimental study and show how many more (or, surprisingly, fewer) observations a constrained R&S procedure can require when considering multiple constraints while still guaranteeing a nominal PCS.



Our extension for correlation across systems is also significant, as it allows for the use of CRN. CRN have been shown to reduce the number of required samples in R&S procedures (e.g., see Nelson and Matejcik [1995], Chick and Inoue [2001], and Kim and Nelson [2001]). We investigate when and how CRN should be used within constrained R&S procedures to reduce the observations necessary to make valid selection of the best feasible system.

In summary, the main contributions of the article involve presenting R&S procedures for multiple constraints, addressing the validity of these procedures, and resolving issues associated with the efficient implementation of the procedures, including error allocation, cost of additional constraints, and effects of CRN.

The article is organized as follows. Section 2 provides necessary background material. In Section 3, we present our procedures for multiple constraints and prove their validity in Section 4. In Section 5, we address efficient implementation, specifically appropriate error allocation and the use of CRN. We analyze experimental results in Section 6 and conclude our article in Section 7. Finally, the proofs of our results are provided in the Online Appendix.

2. BACKGROUND

This section details the background needed to formulate and analyze the general constrained R&S problem and procedures for solving it. In Section 2.1, we describe the problem formulation and notation. Section 2.2 provides necessary assumptions. We also include two feasibility check procedures for multiple constraints in Section 2.3 that will be implemented in our general R&S procedures.

2.1. Formulation and Notation

Constrained R&S attempts to select the best system with respect to the mean of a primary performance measure in the presence of constraints on one or more secondary performance measures. Let $(X_{in}, Y_{i1n}, \ldots, Y_{isn})$ be the $n$th observation of the $i$th system for the primary performance measure and $s$ secondary performance measures. The set of all possible systems is denoted $S = \{1, \ldots, k\}$.

We let $x_i = E[X_{in}]$ and $y_{i\ell} = E[Y_{i\ell n}]$ be the expected values of the primary and secondary performance measures for each system $i \in S$ and constraint $\ell = 1, \ldots, s$. Our objective is to select the system with the best primary performance measure that also satisfies all constraints, $q_\ell$:

$$\arg\max_{i \in S} x_i \quad \text{s.t. } y_{i\ell} \le q_\ell \text{ for all } \ell = 1, \ldots, s.$$

This objective is accomplished through an IZ approach, extended to include both the comparison of primary performance measures and the feasibility check of multiple secondary performance measures.

For the primary performance measure, we let $\delta$, the IZ parameter, be the smallest distance that we consider significant. We are essentially indifferent among the feasible systems whose primary performance measures are within $\delta$ of each other. If $x_i$ is found to be greater than $x_j$, then we say that system $i$ is superior to system $j$ (or equivalently system $j$ is inferior to system $i$).

We also employ the IZ approach for each of the secondary performance measures, but in this case, the smallest significant distance is $\varepsilon_\ell$, the tolerance level associated with constraint $\ell$. Any system with $y_{i\ell} \le q_\ell - \varepsilon_\ell$ for all $\ell = 1, \ldots, s$ is considered desirable. The set of all desirable systems is denoted $S_D$. Systems that have at least one mean secondary performance measure greater than or equal to $q_\ell + \varepsilon_\ell$ for some $\ell$ are unacceptable and infeasible, placing them in the set $S_U$. Systems that fall within the tolerance level of $q_\ell$ for some $\ell$, so that $q_\ell - \varepsilon_\ell < y_{i\ell} < q_\ell + \varepsilon_\ell$, and below the tolerance level for the remaining constraints are acceptable and are placed in the set $S_A$. The goal is to identify a desirable or acceptable system whose primary performance measure is no worse than an IZ away from that of the best desirable system, which defines a correct selection (CS) in our problem.
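The three-way classification of systems can be sketched in a few lines of Python. This is only an illustration, assuming the true secondary means are known (in practice they are unknown, which is why the sequential procedures of later sections are needed); the function name `classify_system` and its argument names are hypothetical and do not come from the article.

```python
def classify_system(y, q, eps):
    """Classify one system from its true secondary means.

    y, q, eps are equal-length sequences holding, for each constraint,
    the secondary mean, the constraint threshold, and the tolerance level.
    All names are illustrative assumptions.
    """
    triples = list(zip(y, q, eps))
    # Unacceptable: at least one mean is at or above q + eps.
    if any(y_l >= q_l + e_l for y_l, q_l, e_l in triples):
        return "unacceptable"
    # Desirable: every mean is at or below q - eps.
    if all(y_l <= q_l - e_l for y_l, q_l, e_l in triples):
        return "desirable"
    # Otherwise at least one mean lies strictly inside (q - eps, q + eps)
    # and the rest are at or below q - eps.
    return "acceptable"
```

With thresholds `q = [1.0, 1.0]` and tolerances `eps = [0.5, 0.5]`, a system with means `[0.8, 0.0]` falls inside the tolerance band of the first constraint only, so it would be placed in the acceptable set.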

To ensure validity of the procedures, some additional notation must be described:

$n_0$ = the first-stage sample size;

$S^2_{X_{ij}}$ = the sample variance of $\{X_{i1} - X_{j1}, \ldots, X_{in_0} - X_{jn_0}\}$;

$S^2_{Y_{i\ell}}$ = the sample variance of $\{Y_{i\ell 1}, \ldots, Y_{i\ell n_0}\}$ (the $\ell$th constraint of system $i$);

$\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_s)^T$, $\varepsilon_\ell \in \mathbb{R}^+$; $q = (q_1, q_2, \ldots, q_s)^T$, $q_\ell \in \mathbb{R}$; $a = (a_1, a_2, \ldots, a_s)^T$, $a_\ell \in \mathbb{R}^+$;

$Y_{in} = (Y_{i1n}, Y_{i2n}, \ldots, Y_{isn})^T$; $q^a = a^T q$; $\varepsilon^a = a^T \varepsilon$; $Y^a_{in} = a^T Y_{in}$;

$S^2_{Y^a_i}$ = the sample variance of $\{Y^a_{i1}, \ldots, Y^a_{in_0}\}$;

$R(r; \gamma, \zeta, \kappa) = \max\left\{0, \dfrac{\zeta\kappa}{2\gamma} - \dfrac{\gamma}{2} r\right\}$, for $\gamma, \zeta, \kappa \in \mathbb{R}^+$ and $\gamma \neq 0$;

$b$ = the identity of the best desirable system;

CS = the event that a desirable or acceptable system whose primary performance measure is no worse than an IZ away from that of the best desirable system is selected; if $S_D = \emptyset$ and $S_A \neq \emptyset$, the selection of any system in $S_A$ or elimination of all systems would be correct; if $S_D \cup S_A = \emptyset$, all systems should be eliminated;

$CS_i$ = the event that a correct selection is made in comparison between inferior system $i$ and the best desirable system, $b$, given $x_b \ge x_i + \delta$ for all $i \in S_D \cup S_A$;

$CD_i$ = the event that a correct feasibility decision is made on system $i \in S$ (when $i \in S_A$, either a feasible or infeasible decision is correct);

$\alpha$ = the overall nominal error for a procedure under consideration;

$\alpha_1$ = the overall nominal error for feasibility check in a sequentially running procedure;

$\alpha_2$ = the overall nominal error for comparison in a sequentially running procedure;

$\beta_1$ = the nominal error of feasibility check for one performance measure of one system;

$\beta_2$ = the nominal error of comparison between two systems; and

$e = s\beta_1/\beta_2$.
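As a small illustration of the continuation-region function $R(r; \gamma, \zeta, \kappa)$ defined in the notation, the following Python sketch evaluates it directly. The function name `continuation_bound` is a hypothetical choice for this example. In the procedures, $\gamma$ plays the role of an indifference or tolerance parameter, $\zeta$ of a squared screening constant, and $\kappa$ of a sample variance.

```python
def continuation_bound(r, gamma, zeta, kappa):
    """Evaluate R(r; gamma, zeta, kappa) = max{0, zeta*kappa/(2*gamma) - gamma*r/2}.

    The bound shrinks linearly in the sample size r and is truncated at
    zero, which forces a decision once r is large enough.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return max(0.0, zeta * kappa / (2.0 * gamma) - gamma * r / 2.0)
```

Because the bound decreases linearly in `r`, it reaches zero once `r >= zeta * kappa / gamma**2`, after which any screening inequality based on it must resolve one way or the other.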

2.2. Assumptions for Validity

In this section, we provide assumptions about the data, systems, and procedures. Each of our results will require a specified subset of these assumptions.

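ASSUMPTION 1. For each $i = 1, 2, \ldots, k$,

$$\begin{bmatrix} X_{in} \\ Y_{i1n} \\ \vdots \\ Y_{isn} \end{bmatrix} \overset{iid}{\sim} N_{s+1}\left( \begin{bmatrix} x_i \\ y_{i1} \\ \vdots \\ y_{is} \end{bmatrix}, \Sigma_i \right), \quad n = 1, 2, \ldots,$$

where $\overset{iid}{\sim}$ denotes independent and identically distributed, $N_{s+1}$ denotes $(s+1)$-dimensional multivariate normal, and $\Sigma_i$ is the $(s+1) \times (s+1)$ covariance matrix of the vector $(X_{in}, Y_{i1n}, \ldots, Y_{isn})$.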

Normally distributed data is a common, not particularly restrictive, assumption. Law and Kelton [2000] explain how normality can be achieved through within-replications averages or batch means. Commonly, primary and secondary performance measures will be correlated. Moreover, if CRN are used to simulate different systems, $(X_{in}, Y_{i1n}, \ldots, Y_{isn})$ and $(X_{jn}, Y_{j1n}, \ldots, Y_{jsn})$ will typically be correlated. Therefore, we allow correlation across systems and performance measures.

ASSUMPTION 2. For any $i \in S_D \cup S_A$ with $i \neq b$, $x_b \ge x_i + \delta$.

This assumption ensures that all systems that could be (correctly) deemed feasible are inferior to $b$ by at least one IZ. This is a commonly used assumption in the R&S literature and holds whenever $b$ is unique, $S_A$ only contains systems that are inferior to $b$, and $\delta$ is sufficiently small. Under Assumption 2, CS implies the selection of system $b$.

ASSUMPTION 3. If the systems are simulated independently, the feasibility check phase guarantees $\Pr\{\cap_{i \in S'} CD_i\} \ge (1 - s\beta_1)^t$ for any $1 \le t \le k$ and any subset $S' \subseteq S$ with cardinality $t$.

ASSUMPTION 4. If the systems are simulated under CRN, the feasibility check phase guarantees $\Pr\{\cap_{i \in S'} CD_i\} \ge 1 - ts\beta_1$ for any $1 \le t \le k$ and any subset $S' \subseteq S$ with cardinality $t$.

We assume that the feasibility check procedure can correctly determine the feasibility of any number of systems with $s$ constraints with a certain probability. Systems simulated under CRN require different bounds than independently simulated systems. Additional discussion on when Assumptions 3 and 4 hold is provided in Section 2.3.

ASSUMPTION 5. If the systems are simulated independently, the comparison phase guarantees $\Pr\{\cap_{i \in S'} CS_i\} \ge (1 - \beta_2)^t$ for any $1 \le t \le k - 1$ and any subset $S'$ of $\{i \in \{1, \ldots, k\} : x_i \le x_b - \delta\}$ with cardinality $t$.

ASSUMPTION 6. If the systems are simulated under CRN, the comparison phase guarantees $\Pr\{\cap_{i \in S'} CS_i\} \ge 1 - t\beta_2$ for any $1 \le t \le k - 1$ and any subset $S'$ of $\{i \in \{1, \ldots, k\} : x_i \le x_b - \delta\}$ with cardinality $t$.

Given that we start with a set of systems inferior to system $b$, we require that pairwise comparison of systems in this set concludes with a selection of $b$ as the best with a certain probability. Again, the use of CRN requires different bounds than when considering independent systems. Several IZ-based comparison procedures, such as KN of Kim and Nelson [2001], satisfy Assumption 5, but not all procedures satisfy Assumption 6.

ASSUMPTION 7. Observation $n$ of system $i$ (i.e., $X_{in}$ and $Y_{i\ell n}$ for $\ell = 1, \ldots, s$) should not depend on the order the systems are sampled.

This assumption is critical to the proof of any procedure that implements dormancy (Healey et al. [2013]; see also Section 3.3). It assures that procedures with and without dormancy produce identical results, and it can be satisfied by assigning each system its own random number stream.
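One way to picture Assumption 7 is the following Python sketch, which gives every system its own generator so that the $n$th observation of a system is the same no matter how sampling is interleaved. The seeding scheme, the function names `make_streams` and `sample_in_order`, and the use of standard normal output are all assumptions made for illustration; a real simulation would drive each system's model from its stream instead.

```python
import random


def make_streams(k, base_seed=2014):
    """Create one dedicated random number generator per system.

    The seeding rule (base_seed * 1000 + i) is a hypothetical choice; any
    scheme that gives each system a separate, reproducible stream works.
    """
    return [random.Random(base_seed * 1000 + i) for i in range(k)]


def sample_in_order(order, k, base_seed=2014):
    """Draw one observation per entry of `order` from that system's stream.

    Returns a dict mapping each system index to its observations in the
    order they were generated for that system.
    """
    streams = make_streams(k, base_seed)
    observations = {i: [] for i in range(k)}
    for i in order:
        # Stand-in for running one replication of system i.
        observations[i].append(streams[i].gauss(0.0, 1.0))
    return observations
```

Calling `sample_in_order([0, 1, 2, 0, 1, 2], 3)` and `sample_in_order([2, 2, 1, 0, 1, 0], 3)` yields the same observations for every system, which is the property a dormancy-based procedure relies on when it pauses and later resumes sampling from a system.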

2.3. Feasibility Check Procedures for Multiple Constraints

For the feasibility check phase, we feature the fully sequential procedures, $F_B^I$ and $F_A^I$, of Batur and Kim [2010]. $F_B^I$ is a fully sequential feasibility check procedure for one or more constraints whose validity is established through the use of Bonferroni bounds. $F_A^I$ features an artificial constraint, obtained by aggregation (or linear combination) of all secondary performance measures and their constrained levels. These procedures share a common setup, with additional steps to accommodate the aggregation in $F_A^I$. To account for every system’s status, we utilize a set $M$ of systems with undetermined feasibility, a set of systems deemed feasible $F$, a set $K_i$ that tracks the individual performance measures that have been deemed feasible for system $i$, for all $i \in S$, and a set $A$ containing all systems whose feasibility according to the aggregate constraint has not been determined. We also denote the cardinality of a set as $|\cdot|$.

Section 2.3.1 provides a detailed implementation of $F_B^I$. Section 2.3.2 features a similar description of $F_A^I$ and a proof that the procedure satisfies Assumption 3.

2.3.1. Basic Feasibility Check for Multiple Constraints: $F_B^I$. This approach involves sequential screening on every constrained performance measure. If a constraint is found to be violated, the system is declared infeasible. A system is declared feasible only if all constraints have been deemed feasible. Batur and Kim [2010] proved that with $\beta_1 = \alpha/(ks)$ for correlated systems and $\beta_1 = [1 - (1 - \alpha)^{1/k}]/s$ for independent systems, $F_B^I$ guarantees that the event $S_D \subset F \subset S_D \cup S_A$ occurs with probability at least $1 - \alpha$ when Assumption 1 holds. It also satisfies Assumptions 3 and 4 in this situation, a result of the proofs of Lemma 1 and Corollary 1 of Batur and Kim [2010]. We present $F_B^I$ where the screening parameter is set to $c = 1$.

Procedure [$F_B^I$]

Setup: Select a first-stage sample size, $n_0 \ge 2$. Choose $\varepsilon_\ell$ and $q_\ell$ for $\ell = 1, 2, \ldots, s$. Let $\eta_1 = \frac{1}{2}\left((2\beta_1)^{-2/(n_0 - 1)} - 1\right)$ and $h_1^2 = 2\eta_1(n_0 - 1)$.

Initialization: Obtain $n_0$ observations from each constrained performance measure $\ell = 1, 2, \ldots, s$ from every system $i = 1, 2, \ldots, k$. For all $i$ and $\ell$, compute $S^2_{Y_{i\ell}}$. Set the observation counter $r_i = n_0$ and $K_i = \emptyset$ for $i = 1, 2, \ldots, k$. Let $M$ contain all systems and $F = \emptyset$.

Feasibility Check: For all $i \in M$ and any $\ell \notin K_i$, if $\sum_{n=1}^{r_i} (Y_{i\ell n} - q_\ell) \ge R(r_i; \varepsilon_\ell, h_1^2, S^2_{Y_{i\ell}})$, then remove $i$ from $M$. Else if $\sum_{n=1}^{r_i} (Y_{i\ell n} - q_\ell) \le -R(r_i; \varepsilon_\ell, h_1^2, S^2_{Y_{i\ell}})$, then add $\ell$ to $K_i$. If $|K_i| = s$, remove $i$ from $M$ and add $i$ to $F$.

Stopping Rule: If $|M| = 0$, then stop and return the set $F$ as feasible systems. Otherwise, for all systems $i \in M$, take one additional observation $Y_{i, r_i + 1}$ and set $r_i = r_i + 1$. Then go to Feasibility Check.
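The four steps of the basic feasibility check can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: the interface (a user-supplied callable `sample(i)` returning one vector of $s$ secondary observations for system `i`), the 0-based indexing, and all function names are hypothetical, and the screening parameter is fixed at $c = 1$ as in the procedure statement.

```python
from statistics import variance


def continuation_bound(r, gamma, zeta, kappa):
    """R(r; gamma, zeta, kappa) = max{0, zeta*kappa/(2*gamma) - gamma*r/2}."""
    return max(0.0, zeta * kappa / (2.0 * gamma) - gamma * r / 2.0)


def feasibility_check_fb(sample, k, q, eps, beta1, n0=10):
    """Sketch of the basic feasibility check; returns the set F of systems
    declared feasible. `sample(i)` is an assumed callable returning a list
    of s secondary observations for system i (0-based)."""
    s = len(q)
    # Setup: screening constants.
    eta1 = 0.5 * ((2.0 * beta1) ** (-2.0 / (n0 - 1)) - 1.0)
    h1_sq = 2.0 * eta1 * (n0 - 1)
    # Initialization: first-stage data, variances, and running sums.
    first = {i: [sample(i) for _ in range(n0)] for i in range(k)}
    s2 = {i: [variance([obs[l] for obs in first[i]]) for l in range(s)]
          for i in range(k)}
    sums = {i: [sum(obs[l] - q[l] for obs in first[i]) for l in range(s)]
            for i in range(k)}
    r = {i: n0 for i in range(k)}
    K = {i: set() for i in range(k)}
    M, F = set(range(k)), set()
    while M:
        # Feasibility Check on every undetermined system and constraint.
        for i in list(M):
            for l in range(s):
                if l in K[i]:
                    continue
                bound = continuation_bound(r[i], eps[l], h1_sq, s2[i][l])
                if sums[i][l] >= bound:
                    M.discard(i)  # constraint l violated: infeasible
                    break
                if sums[i][l] <= -bound:
                    K[i].add(l)  # constraint l deemed feasible
            if i in M and len(K[i]) == s:
                M.discard(i)
                F.add(i)
        # Stopping Rule: one more observation for each surviving system.
        for i in M:
            y = sample(i)
            for l in range(s):
                sums[i][l] += y[l] - q[l]
            r[i] += 1
    return F
```

Because the continuation bound eventually reaches zero, every constraint of every system is resolved after finitely many observations, so the loop terminates. Keeping running sums instead of the full sample history is a design choice of this sketch; only the first-stage variances and the cumulative deviations from $q_\ell$ are needed.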

2.3.2. Accelerated Feasibility Check for Multiple Constraints: $F_A^I$. If two or more constrained performance measures are involved in the feasibility check, then it is possible to accelerate the feasibility determination for systems that are infeasible for multiple constraints. In particular, Batur and Kim [2010] introduce an artificial, aggregate constraint to the feasibility check. This aggregate constraint adds some complexity but can quickly eliminate systems that violate multiple constraints. The new constraint is a linear function of all secondary performance measure samples, with positive weights $a_1, a_2, \ldots, a_s$ for each constraint $1, 2, \ldots, s$, respectively, and can only be used to declare systems infeasible. Batur and Kim [2010] suggest the values $a_\ell = \prod_{\nu = 1, \nu \neq \ell}^{s} \varepsilon_\nu$, for $\ell = 1, 2, \ldots, s$, to minimize the area where systems may be unacceptable for the original constraints and acceptable for the aggregate constraint.
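The suggested weights and the resulting aggregate quantities $q^a = a^T q$, $\varepsilon^a = a^T \varepsilon$, and $Y^a_{in} = a^T Y_{in}$ are simple to compute. The following Python sketch shows one way to do so; the helper names `aggregate_weights` and `aggregate` are hypothetical.

```python
from math import prod


def aggregate_weights(eps):
    """Return a_l = product of eps_nu over all nu != l, for each constraint l."""
    return [prod(e for nu, e in enumerate(eps) if nu != l)
            for l in range(len(eps))]


def aggregate(values, weights):
    """Return the weighted sum a^T v, used for q^a, eps^a, and Y^a."""
    return sum(a * v for a, v in zip(weights, values))
```

With these weights every term $a_\ell \varepsilon_\ell$ equals the product of all tolerance levels, so $\varepsilon^a$ is $s$ times that product. For example, tolerances `[0.5, 2.0, 4.0]` give weights `[8.0, 2.0, 1.0]`, and each of the three terms of $a^T \varepsilon$ equals 4.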

Batur and Kim [2010] show that when $\beta_1 = \alpha/(k(s + 1))$, $F_A^I$ for correlated systems guarantees that the event $S_D \subset F \subset S_D \cup S_A$ occurs with probability at least $1 - \alpha$ when Assumption 1 holds. The proof of Lemma 2 of Batur and Kim [2010] shows that $F_A^I$ satisfies Assumption 4 in this situation. At the end of the section, we strengthen Corollary 2 of Batur and Kim [2010] and prove that $F_A^I$ satisfies Assumption 3 for independently simulated systems. Note that Batur and Kim [2010] recommended defining $\beta_1$ heuristically, in terms of $s$ instead of $s + 1$ constraints (so that $\beta_1 = \alpha/(ks)$), to ensure that $F_A^I$ always performs more efficiently than $F_B^I$, while showing only a small, practically insignificant loss in PCS. Our experiments will feature this aggressive definition of $\beta_1$. We present an instance of $F_A^I$ when the screening parameter is set to $c = 1$.



Procedure [$F_A^I$]

Setup: Same as in $F_B^I$.

Initialization: Same as in $F_B^I$, except we also compute $S^2_{Y^a_i}$ for all $i$ and let $A = S$.

Feasibility Check: Same as in $F_B^I$ except for the following addition: if $i \in M \cap A$ and $\sum_{n=1}^{r_i} (Y^a_{in} - q^a) \ge R(r_i; \varepsilon^a, h_1^2, S^2_{Y^a_i})$, then remove $i$ from $M$ and $A$. For $i \in M \cap A$ with $\sum_{n=1}^{r_i} (Y^a_{in} - q^a) \le -R(r_i; \varepsilon^a, h_1^2, S^2_{Y^a_i})$, remove $i$ from $A$.

Stopping Rule: Same as in $F_B^I$, except for the following addition: if taking an additional observation from system $i \in M \cap A$, calculate $Y^a_{i, r_i + 1}$.

Let $CD_{i\ell}$ and $ICD_{i\ell}$ denote the events of a correct and an incorrect decision of the feasibility of constraint $\ell$ of system $i$, respectively. Similarly, let $CD^a_i$ and $ICD^a_i$ denote the events of a correct and an incorrect decision of the feasibility of the aggregate constraint of system $i$, respectively (as the aggregate constraint is only used to make infeasibility decisions, $CD^a_i$ is the event that system $i$ is not declared infeasible by the aggregate constraint for $i \in S_D$ and is a probability one event for $i \in S_A \cup S_U$). Suppose that Assumption 1 holds. Andradottir and Kim [2010] have shown that $\Pr\{CD_{i\ell}\} = 1 - \Pr\{ICD_{i\ell}\} \ge 1 - \beta_1$ and $\Pr\{CD^a_i\} = 1 - \Pr\{ICD^a_i\} \ge 1 - \beta_1$. Batur and Kim [2010] show that if systems are simulated independently and $\beta_1$ satisfies $(1 - s\beta_1)^k + (1 - \beta_1)^k = 2 - \alpha$, then $\Pr\{\cap_{i \in S} CD_i\} \ge 1 - \alpha$ (see Corollary 2 and Remark 1 in Batur and Kim [2010]). We now strengthen this result and show that if $\beta_1 = (1 - (1 - \alpha)^{1/k})/(s + 1)$, Assumption 3 is satisfied. The proof of Theorem 2.1 is given in the Online Appendix.

THEOREM 2.1. If Assumption 1 holds with independently simulated systems and $0 < \beta_1 < \frac{1}{s+1}$ is chosen such that $(1 - (s + 1)\beta_1)^k = 1 - \alpha$, then $F_A^I$ satisfies $\Pr\{\cap_{i \in S'} CD_i\} \ge 1 - \alpha$ for any $1 \le t \le k$ and any subset $S' \subseteq S$ with cardinality $t$.
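The choices of $\beta_1$ quoted in this section for the two feasibility check procedures can be collected in a short Python sketch. The function names `beta1_fb` and `beta1_fa` and the boolean flag `independent` are hypothetical; the formulas are the ones stated above for independent and correlated systems, with the non-heuristic $s + 1$ version used for the aggregate procedure.

```python
def beta1_fb(alpha, k, s, independent):
    """Per-constraint error for the basic check.

    Independent systems: (1 - (1 - alpha)^(1/k)) / s.
    Correlated systems (CRN): alpha / (k * s).
    """
    if independent:
        return (1.0 - (1.0 - alpha) ** (1.0 / k)) / s
    return alpha / (k * s)


def beta1_fa(alpha, k, s, independent):
    """Per-constraint error for the accelerated check, which treats the
    aggregate constraint as one extra constraint (s + 1 in place of s)."""
    return beta1_fb(alpha, k, s + 1, independent)
```

For independent systems the value returned by `beta1_fa` solves $(1 - (s + 1)\beta_1)^k = 1 - \alpha$, the condition in Theorem 2.1. The independent-systems value is never smaller than the correlated-systems value, since $(1 - \alpha/k)^k \ge 1 - \alpha$ by Bernoulli's inequality.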

3. GENERAL CONSTRAINED R&S PROCEDURES

In this section, we present three procedures for constrained R&S with multiple constraints. The procedures generalize approaches of Andradottir and Kim [2010] and Healey et al. [2013] that were originally formulated to compare independent systems with a single constrained performance measure. Our generalized algorithms incorporate a fully sequential feasibility check for any number of constraints, and two of them allow for the valid incorporation of CRN.

In Section 3.1, we describe a sequentially running procedure. Sections 3.2 and 3.3 feature simultaneously running procedures. In all of the procedures, $F$ ($M$) refers to the set of systems that are under consideration for selection (have undetermined feasibility).

3.1. A Sequentially Running Procedure: HAK

In this section, we extend the AK procedure of Andradottir and Kim [2010]. This procedure performs feasibility check and comparison in sequence, first completing feasibility check for all systems, then proceeding to select the best out of the surviving systems. Since feasibility check may be completed at different sample sizes for each system, the SSM procedure of Pichitlamken et al. [2006] is used to perform comparison.

Although the AK procedure is heuristic, Andradottir and Kim [2010] show that any degradation in PCS is very limited and its performance can be competitive. In particular, this procedure can be very efficient if feasibility is quickly determined and several infeasible systems are eliminated, and hence it is a useful algorithm to extend to multiple constraints. Andradottir and Kim [2010] present a similar, less efficient sequentially running procedure that utilizes restarting to make a valid selection of the best feasible system. This procedure can also be extended to include multiple constraints for independent and correlated systems, but the details fall outside the scope of this article. We will discuss how to choose the confidence levels for feasibility check $1 - \alpha_1$ and comparison $1 - \alpha_2$ in Section 5.1.

Procedure [HAK]

Setup: Select the overall confidence level $1/k \le 1 - \alpha < 1$ and then choose the confidence levels for feasibility check $1 - \alpha_1$ and comparison $1 - \alpha_2$, where $\alpha_1 + \alpha_2 = \alpha$. Use the Setup of the chosen feasibility check procedure, specifying $\beta_1 = (1 - (1 - \alpha_1)^{1/k})/s$ for independent systems and $\beta_1 = \alpha_1/(ks)$ for correlated systems.

Initialization: Use the Initialization of the chosen feasibility check procedure. In addition, obtain $n_0$ observations $X_{in}$ from each system $i = 1, 2, \ldots, k$. For all $i$ and $j \neq i$, compute $S^2_{X_{ij}}$.

Feasibility Check: Same as in the chosen feasibility check procedure.

Feasibility Stopping Rule: Same as in the chosen feasibility check procedure. In addition, for any system $i$ receiving an additional sample, take $X_{i, r_i + 1}$.

Setup for Comparison: If $|F| = 0$, conclude that there exist no feasible systems. If $|F| = 1$, then stop and select the system whose index is in $F$ as the best. Otherwise, select $\delta > 0$. Let $\eta_2 = \frac{1}{2}\left((2\beta_2)^{-2/(n_0 - 1)} - 1\right)$, where $\beta_2 = \alpha_2/(|F| - 1)$ and $h_2^2 = 2\eta_2(n_0 - 1)$. Set $r = n_0$, where $r$ denotes an index used in pairwise system comparisons.

Comparison: Considering any $i, j \in F$ such that $i \neq j$, if $\frac{r}{r_i} \sum_{n=1}^{r_i} X_{in} \le \frac{r}{r_j} \sum_{n=1}^{r_j} X_{jn} - R(r; \delta, h_2^2, S^2_{X_{ij}})$, then eliminate $i$ from $F$.

Comparison Stopping Rule: If $|F| = 1$, then stop and select the system whose index is in $F$ as the best. Otherwise, for each system $i \in F$ with $r_i = r$, take one additional observation $X_{i, r_i + 1}$, set $r_i = r_i + 1$ and $r = r + 1$. Then go to Comparison.
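The Comparison step of the sequentially running procedure rescales each system's sum to a common index $r$ because systems may leave the feasibility check with different sample sizes. The Python sketch below evaluates that single screening inequality for one ordered pair of systems. The function names and argument layout are hypothetical, and the sketch covers only this inequality, not the surrounding loop or the stopping rule.

```python
def continuation_bound(r, gamma, zeta, kappa):
    """R(r; gamma, zeta, kappa) = max{0, zeta*kappa/(2*gamma) - gamma*r/2}."""
    return max(0.0, zeta * kappa / (2.0 * gamma) - gamma * r / 2.0)


def eliminates(x_i, x_j, r, delta, h2_sq, s2_xij):
    """Return True if system i is eliminated by system j at index r.

    x_i and x_j hold all primary observations collected so far for the
    two systems (lengths r_i and r_j may differ). Each sum is rescaled
    by r / r_i or r / r_j so the two sides are comparable.
    """
    r_i, r_j = len(x_i), len(x_j)
    scaled_i = r / r_i * sum(x_i)
    scaled_j = r / r_j * sum(x_j)
    return scaled_i <= scaled_j - continuation_bound(r, delta, h2_sq, s2_xij)
```

When both systems have exactly `r` observations, the scaling factors equal one and the test reduces to the usual comparison of partial sums used by the simultaneously running procedures.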

3.2. A Simultaneously Running Procedure: HAK+

Andradottir and Kim [2010] introduced the AK+ procedure that performs feasibility check and comparison simultaneously after each additional stage of sampling. Systems are eliminated from contention after being found either infeasible or inferior to a feasible system. We now present our extension HAK+. This simultaneously running approach will show an improvement over HAK in configurations where feasibility check is more difficult relative to comparison. In Section 4, we will prove HAK+ to be valid for independently simulated systems and correlated systems. The approach for choosing valid values of $\beta_1$ and $\beta_2$ is different for independently simulated or correlated systems, as we will detail further in Section 4.

Procedure [HAK+]

Setup: Select the overall confidence level $1/k \le 1 - \alpha < 1$ and $\delta$. Use the Setup of the chosen feasibility procedure. Let $\eta_2 = \frac{1}{2}\left((2\beta_2)^{-2/(n_0 - 1)} - 1\right)$.

Initialization: Use the Initialization of the chosen feasibility procedure. In addition, let $SS_i$ be the set of systems superior to system $i$ in terms of $x_i$ and initialize $SS_i$ to be the empty set for all $i$. Let $h_2^2 = 2\eta_2(n_0 - 1)$. Obtain $n_0$ observations $X_{in}$ from each system $i = 1, 2, \ldots, k$. For all $i$ and $j \neq i$, compute $S^2_{X_{ij}}$. Set the observation counter $r = n_0$.

Feasibility Check: Same as in the chosen feasibility procedure. If found feasible, move $i$ from $M$ to $F$, and for all $j \in (M \cup F)$ with $i \in SS_j$, eliminate $j$ from $M$ or $F$ and delete $SS_j$. If found infeasible, eliminate $i$ from $M$ and any existing $SS_j$ and delete $SS_i$.

Comparison: For each $i, j \in (M \cup F)$ such that $j \neq i$, $j \notin SS_i$, $i \notin SS_j$, and $\sum_{n=1}^{r} X_{in} \le \sum_{n=1}^{r} X_{jn} - R(r; \delta, h_2^2, S^2_{X_{ij}})$, if $j \in F$, then eliminate $i$ from $M$ or $F$, delete $SS_i$, and remove $i$ from any $SS_{j'}$; otherwise, if $j \notin F$, then add index $j$ to $SS_i$.

Stopping Rule: If $|M| = 0$ and $|F| = 1$, then stop and select the system whose index is in $F$ as the best. If $|M| = 0$ and $|F| = 0$, then stop and report that there is no feasible system. Otherwise, for all systems $i \in M \cup F$ with either $i \in M$ or $|SS_i| < |M|$, take one additional observation $(X_{i, r_i + 1}, Y_{i, r_i + 1})$, set $r = r + 1$, and then $r_i = r$. Then go to Feasibility Check.

3.3. A Simultaneously Running Procedure with Dormancy: MDR

The dormant with recall procedure, DR, of Healey et al. [2013] is a more aggressive simultaneous R&S procedure for a single constraint. Like AK+, it can safely eliminate a system if it is found infeasible or inferior to another feasible system. The dormancy framework halts sampling from all systems found inferior to any system in contention with feasibility yet undetermined. This allows the procedure to avoid sampling from inferior systems and to compare and test for feasibility of the most promising systems first. A dormant system returns to contention if its superior system is eliminated.

The starting and stopping of sampling for dormant systems creates uneven sample sizes, a difficulty overcome in DR by storing past observations. This allows the procedure to compare systems at an equal number of samples via the KN procedure of Kim and Nelson [2001]. In this section, we extend the statistically valid DR procedure to multiple constraints, resulting in the MDR procedure. Healey et al. [2013] also present the heuristic dormant with averages and dormant with catch-up procedures, which can be extended in a similar fashion, but this falls outside the scope of the current article. As for HAK+, we will prove MDR to be valid for both independent and correlated systems. Valid choices of $\beta_1$ and $\beta_2$ are discussed in Section 4.

Procedure [MDR]

Setup: Same as in HAK+.

Initialization: Same as in HAK+, except we also set D = ∅, where D is the set of dormant systems.

Feasibility Check: Same as in the chosen feasibility check procedure, except feasibility is only checked for i ∈ M\D with $r_i = r$. If i is feasible, move i from M to F. For all j ∈ M ∪ F with $i \in SS_j$, eliminate j from M or F, delete $SS_j$, and remove j from D, if applicable. Else, if i is found infeasible, eliminate i from M and any existing $SS_j$ and delete $SS_i$. If $i \in SS_j$ and j ∈ D, remove j from D and let $r = \min\{r, r_j\}$.

Comparison: For each i, j ∈ (M ∪ F)\D such that j ≠ i, $r_i$ or $r_j$ is equal to r, and

$$\sum_{n=1}^{r} X_{in} \le \sum_{n=1}^{r} X_{jn} - R(r; \delta, h_2^2, S^2_{X_{ij}}),$$

if j ∈ F, then eliminate i from M or F, delete $SS_i$, and for all j′ ∈ D with $i \in SS_{j'}$, eliminate i from $SS_{j'}$, remove j′ from D, and let $r = \min\{r, r_{j'}\}$; otherwise, if j ∉ F, then add index j to $SS_i$ and i to D. (Note that each system in D has been declared inferior to exactly one system in M.)

Stopping Rule: If |M| = 0 and |F| = 1, then stop and select the system whose index is in F as the best. If |M| = 0 and |F| = 0, then stop and report that there is no feasible system. Otherwise, for all systems i ∈ (M ∪ F)\D such that $r_i = r$, take one additional observation $(X_{i,r_i+1}, Y_{i,r_i+1})$ and set $r_i = r_i + 1$. Set r = r + 1. Then go to Feasibility Check.
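The elimination test in the Comparison step can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names are hypothetical, the continuation boundary is written in the form used in Section 5.1.4, R(r) = η(n0 − 1)S²/δ − δr/2, and truncating it at zero is an assumption borrowed from the usual triangular continuation region.

```python
def continuation_boundary(r, delta, eta, n0, s2):
    """Half-width R(r) of the triangular continuation region.

    Assumed form: eta*(n0-1)*s2/delta - delta*r/2, truncated at zero
    (the truncation is an assumption of this sketch).
    """
    return max(0.0, eta * (n0 - 1) * s2 / delta - delta * r / 2.0)


def is_declared_inferior(obs_i, obs_j, r, delta, eta, n0, s2_ij):
    """Return True if system i is declared inferior to system j at stage r.

    obs_i and obs_j hold the stored primary observations; only the first r
    of each are used, so both systems are compared at equal sample sizes.
    """
    boundary = continuation_boundary(r, delta, eta, n0, s2_ij)
    return sum(obs_i[:r]) <= sum(obs_j[:r]) - boundary
```

In MDR, a True result leads to elimination of i only when j is already in F; otherwise i becomes dormant and j is recorded in the superior set of i.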

4. VALIDITY OF ALGORITHMS

We present HAK+ and MDR as statistically valid algorithms for general constrained R&S of independent or correlated systems in Sections 4.1 and 4.2, respectively. The proofs are presented while implementing $F^I_B$ as the feasibility check procedure.

Remark 4.1. The use of $F^I_A$ only requires an additional constraint (i.e., s + 1 constraints rather than s constraints), as is clear from Theorem 2.1 for independently simulated systems and from Lemma 2 of Batur and Kim [2010] for correlated systems. Thus, Lemmas 4.2 and 4.6 and Theorems 4.4, 4.5, 4.8, and 4.9 hold for $F^I_A$, as long as s is replaced by s + 1 in the statement of these results.

4.1. Validity of HAK+ and MDR for Independent Systems

To prove the validity of HAK+ and MDR, we begin with the following lemma.

LEMMA 4.2. Under Assumptions 2, 3, and 5, a simultaneously running procedure for independently simulated systems guarantees

$$\Pr\{CS\} \ge (1 - s\beta_1)^{j} + (1 - s\beta_1) + (1 - \beta_2)^{k-j-1} - 2 \tag{1}$$

when $|S_U| = j < k$ and $\Pr\{CS\} \ge (1 - s\beta_1)^{k}$ when $|S_U| = k$.

Lemma 4.2 does not specify how to choose β1 and β2. There are many valid values of β1 and β2 that cause the Right-Hand Side (RHS) of (1) to be greater than 1 − α, but we would prefer the largest possible values for β1 and β2 to make our procedures efficient. Since $|S_U|$ may not be known at the time of initialization, we must also address how the RHS of (1) changes in j.

Remark 4.3. The lower bound $(1 - s\beta_1)^{k}$ on Pr{CS} in Lemma 4.2 when $|S_U| = k$ satisfies $(1 - s\beta_1)^{k} = (1 - s\beta_1)^{k-1} - (1 - s\beta_1)^{k-1} s\beta_1 \ge (1 - s\beta_1)^{k-1} - s\beta_1$, and $(1 - s\beta_1)^{k-1} - s\beta_1$ is the value of the RHS of (1) when j = k − 1. Therefore, the smallest lower bound on Pr{CS} in Lemma 4.2 is always achieved for j < k.

THEOREM 4.4. Under Assumptions 1 and 2 with independently simulated systems, HAK+ implemented with $F^I_B$ guarantees Pr{CS} ≥ 1 − α when

$$(1 - s\beta_1)^{j} + (1 - s\beta_1) + (1 - \beta_2)^{k-j-1} - 2 \ge 1 - \alpha \quad \text{for all } j \in \{0, 1, \ldots, k-1\}. \tag{2}$$

THEOREM 4.5. Under Assumptions 1, 2, and 7 with independently simulated systems, MDR implemented with $F^I_B$ guarantees Pr{CS} ≥ 1 − α when (2) is satisfied.

We provide one method for choosing the values β1 and β2. The key to this approach is the choice of e = sβ1/β2, the ratio of error for a complete feasibility check for one system to the error of a comparison between two systems. For any choice of e, we can simplify the RHS of (1) and find a valid value of β2. In particular, (1) now yields

$$\Pr\{CS\} \ge (1 - e\beta_2)^{j} + (1 - e\beta_2) + (1 - \beta_2)^{k-j-1} - 2, \tag{3}$$

for $j = |S_U| < k$. Since $j = |S_U|$ is unknown, we must find values of β2 ∈ [0, min{1, 1/e}] such that the RHS of (3) is no smaller than 1 − α for all j ∈ {0, 1, ..., k − 1}. Note that for a fixed value of j ∈ {0, 1, ..., k − 1}, the RHS of (3) monotonically decreases from 1 to below 0 as β2 increases from 0 to min{1, 1/e}. Thus, for any value of j ∈ {0, 1, ..., k − 1}, there exists a value of β2 such that the RHS of (3) is equal to 1 − α, which can be solved numerically.



Given that for all j ∈ {0, 1, ..., k − 1}, a value of β2 can be found to set the RHS of (3) equal to 1 − α, one can iterate through all values of j ∈ {0, 1, ..., k − 1} to find the minimum β2. The minimum β2 ensures that the lower bound on Pr{CS} is at least 1 − α for all j ∈ {0, 1, ..., k} (the case j = k is covered by Remark 4.3); β1 is then calculated via the ratio e. This is one approach to supply values of β1 and β2 that satisfy Theorems 4.4 and 4.5. The choice of the parameter e will be addressed in Section 5.1.

We note that if e = 1, then sβ1 = β2 and the value of j ∈ [0, k − 1] that minimizes the RHS of (3) is j* = (k − 1)/2. Therefore, a value of β2 that guarantees the nominal PCS can be found by solving the equation $\beta_2 + 2[1 - (1 - \beta_2)^{(k-1)/2}] = \alpha$ (the left-hand side of this equation increases from 0 to 3 as β2 increases from 0 to 1, so there is always a solution). Figure A.1 in the Online Appendix shows how the RHS of (3) changes as a function of j and e when k = 25 and β2 = 0.002.
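The numerical search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the bisection tolerance are hypothetical, and bisection is simply one convenient root finder that relies on the RHS of (3) being nonincreasing in β2 on [0, min{1, 1/e}].

```python
def rhs3(beta2, j, k, e):
    """Right-hand side of (3) for a given beta2, j, k, and ratio e."""
    return (1 - e * beta2) ** j + (1 - e * beta2) + (1 - beta2) ** (k - j - 1) - 2


def solve_beta2(alpha, k, e, tol=1e-12):
    """Largest beta2 (up to tol) with rhs3 >= 1 - alpha for every j in 0..k-1."""
    upper = min(1.0, 1.0 / e)
    best = upper
    for j in range(k):
        lo, hi = 0.0, upper          # rhs3(0) = 1 >= 1 - alpha always holds
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if rhs3(mid, j, k, e) >= 1 - alpha:
                lo = mid             # mid is still valid for this j
            else:
                hi = mid
        best = min(best, lo)         # keep the most restrictive j
    return best


def error_parameters(alpha, k, s, e):
    """Return (beta1, beta2) with beta1 recovered from e = s*beta1/beta2."""
    beta2 = solve_beta2(alpha, k, e)
    return e * beta2 / s, beta2
```

For e = 1 and odd k, the value returned by `solve_beta2` should agree with the root of the single equation given above, since j* = (k − 1)/2 is then an integer.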

4.2. Validity of HAK+ and MDR for Correlated Systems

Correlation of data across systems requires a slightly different proof approach. Although the feasibility check procedures of Batur and Kim [2010] guarantee a desired probability of correct feasibility decision under correlation, the same is not true of all comparison techniques under correlation. Fortunately, the underlying comparison procedure of HAK+ and MDR is KN of Kim and Nelson [2001], which is valid under correlation with certain parameter adjustments.

We present a lemma that will help prove the validity of HAK+ and MDR.

LEMMA 4.6. Under Assumptions 2, 4, and 6, a simultaneous procedure for correlated systems under s constraints guarantees

$$\Pr\{CS\} \ge 1 - (j + 1)s\beta_1 - (k - j - 1)\beta_2 \tag{4}$$

when $|S_U| = j < k$ and $\Pr\{CS\} \ge 1 - ks\beta_1$ when $|S_U| = k$.

Remark 4.7. The lower bound $1 - ks\beta_1$ on Pr{CS} in Lemma 4.6 when $|S_U| = k$ is the value of the RHS of (4) when j = k − 1. Therefore, the smallest lower bound on Pr{CS} in Lemma 4.6 is always achieved for j < k.

THEOREM 4.8. Under Assumptions 1 and 2 with correlated systems such that $(X_{1n}, X_{2n}, \ldots, X_{kn})$ are iid multivariate normal with a positive definite covariance matrix, HAK+ implemented with $F^I_B$ guarantees Pr{CS} ≥ 1 − α when

$$1 - (j + 1)s\beta_1 - (k - j - 1)\beta_2 \ge 1 - \alpha \quad \text{for all } j \in \{0, 1, \ldots, k-1\}. \tag{5}$$

THEOREM 4.9. Under Assumptions 1, 2, and 7 with correlated systems such that $(X_{1n}, X_{2n}, \ldots, X_{kn})$ are iid multivariate normal with a positive definite covariance matrix, MDR implemented with $F^I_B$ guarantees Pr{CS} ≥ 1 − α when (5) is satisfied.

Since $j = |S_U|$ may be any integer between 0 and k, we must ensure that Pr{CS} ≥ 1 − α for any j ∈ {0, 1, ..., k − 1}. Recall that e = sβ1/β2. We assume that e is given. Due to the linearity of 1 − [(j + 1)e + (k − j − 1)]β2 in j, one can see easily how the value, j* ∈ {0, 1, ..., k − 1}, that minimizes the RHS of (4) depends on e:

$$j^* = \begin{cases} k - 1, & \text{if } e \ge 1, \\ 0, & \text{if } e < 1. \end{cases}$$

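Note that for e = 1, the RHS of (4) does not depend on j ∈ {0, 1, ..., k − 1}. Thus, to achieve Pr{CS} ≥ 1 − α for all values of j, a simultaneous procedure would require the following:

$$\beta_1 = \begin{cases} \alpha/(sk), & \text{if } e \ge 1, \\ e\alpha/(se + s(k - 1)), & \text{if } e < 1, \end{cases} \qquad \text{and} \qquad \beta_2 = \begin{cases} \alpha/(ek), & \text{if } e \ge 1, \\ \alpha/(e + (k - 1)), & \text{if } e < 1. \end{cases}$$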

This is one approach to set values of β1 and β2 to satisfy Theorems 4.8 and 4.9.
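The closed-form choice for correlated systems translates directly into code. The sketch below is illustrative only; the function name is hypothetical, and it simply evaluates the two cases e ≥ 1 and e < 1 and recovers β1 from e = sβ1/β2.

```python
def correlated_error_parameters(alpha, k, s, e):
    """Return (beta1, beta2) satisfying (5) for k systems and s constraints.

    For e >= 1 the binding case is j = k - 1; for e < 1 it is j = 0.
    """
    if e >= 1:
        beta2 = alpha / (e * k)
    else:
        beta2 = alpha / (e + (k - 1))
    beta1 = e * beta2 / s
    return beta1, beta2


def bound5(beta1, beta2, j, k, s):
    """Left-hand side of (5), i.e., the RHS of (4)."""
    return 1 - (j + 1) * s * beta1 - (k - j - 1) * beta2
```

With e = 1 the bound equals 1 − α for every j, which matches the observation that the RHS of (4) does not depend on j in that case.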

5. EFFICIENT DESIGN OF PROCEDURES FOR CONSTRAINED R&S

In this section, we consider issues that directly affect the efficiency of HAK, HAK+, and MDR, namely the choice of error parameters and the use of CRN. These issues are addressed in Sections 5.1 and 5.2, respectively.

5.1. Error Allocation

The choice of parameters that govern the allowable error in the comparison and feasibility check phases can be critical to efficiency. For sequential procedures, the user chooses the total amount of error α1 and α2 for the feasibility check and comparison phases, respectively. For simultaneous procedures, β1 and β2 equal the error of individual feasibility checks and comparisons, respectively. In this section, we provide experimental results that suggest efficient choices for these parameters.

If the relative difficulties of feasibility check and comparison were known, some efficiency could be gained by tuning the error allocation correctly. However, since details about the means and variances of the primary and secondary performance measures are often not known, robust strategies for error allocation are useful.

For our analysis of error allocation, we consider two procedures, HAK and HAK+, as representatives of sequential and simultaneous constrained R&S procedures, respectively. MDR is an application of the dormancy framework to HAK+, so we expect these two procedures to produce similar results. Andradottir and Kim [2010] suggest an allocation for the procedures under one constraint, namely α1 = α2 = α/2 for sequentially running procedures and β1 = β2 for simultaneously running procedures. However, when s > 1, it is unclear how this strategy should be extended. In particular, two reasonable choices are equal error allocation between feasibility check and comparison and equal error allocation for each (primary or secondary) performance measure tested (giving more error to the feasibility check phase to handle multiple constraints).

We use the $F^I_B$ procedure for feasibility check, because it is a simple and valid approach. We discuss the advantages of $F^I_A$ in Section 6, but we do not want to add its complexity to our analysis. When representing the use of feasibility check procedure $F^I_B$ or $F^I_A$ within a constrained R&S procedure, for example HAK, we will denote the procedure as HAK(B) or HAK(A), respectively. The other combinations of procedures studied in this article (i.e., HAK+ and MDR with $F^I_B$ and $F^I_A$) are similarly denoted.

Section 5.1.1 details the setup featured in all of our numerical experiments. Section 5.1.2 provides the study of error allocation within sequential procedures. Section 5.1.3 investigates error allocation within simultaneous procedures. Section 5.1.4 features derivations of the asymptotic total number of observations required for HAK and HAK+ under different error allocations, validating the experimental results.

5.1.1. Experimental Setup. We tested the procedures under differing ratios of errors (α1/α2 or β1/β2) for various configurations of means and variances. These mean and variance configurations are consistent with the experimental setups of similar R&S studies (for example, Kim and Nelson [2001] and Andradottir and Kim [2010], among others). We test the procedures using 10,000 macroreplications.

We set $n_0 = 20$ and $\delta = \varepsilon_\ell = 1/\sqrt{20}$ (the sample standard deviation of the initial average when samples have a variance of 1) for all ℓ = 1, 2, ..., s. We set a nominal PCS of 1 − α = 0.95. We let $S_A = \emptyset$ because Andradottir and Kim [2010] show that the presence of acceptable systems does not significantly affect the experimental results. Finally, we set the constraint levels, $q_\ell$, to zero.

We introduce an additional consideration for multiple constraints, specifically the number of violated constraints v for an infeasible system. The value of v is crucial to how quickly a feasibility check completes. For our tests, we will feature a varying number of constraints s and v ∈ {1, s}, with v = 1 implying a hard feasibility check and v = s creating an easier feasibility check.

We now describe our mean configurations. The following monotone increasing means (MIM) configuration emulates a common situation when many systems are either infeasible or inferior: $x_i = E[X_{in}] = (i - 1)\delta$, i = 1, 2, ..., k, and

$$y_{i\ell} = E[Y_{i\ell n}] = \begin{cases} -(b - i + 1)\varepsilon_\ell, & i = 1, 2, \ldots, b, \\ (i - b)\varepsilon_\ell, & i = b + 1, \ldots, k, \text{ and } \ell = 1, 2, \ldots, v, \\ -(i - b)\varepsilon_\ell, & i = b + 1, \ldots, k, \text{ and } \ell = v + 1, v + 2, \ldots, s, \end{cases}$$

where b is the number of feasible systems. This means that system k is best with respect to the primary measurement, but system b is the best feasible system.

The difficult means (DM) configuration attempts to test the validity of the procedures in a challenging setup. In this configuration, there are b − 1 feasible systems that are only slightly inferior (by an IZ parameter) to the best system, and the remaining superior systems are only slightly infeasible (by a tolerance level). More specifically, in the DM configuration,

$$x_i = E[X_{in}] = \begin{cases} 0, & i = 1, 2, \ldots, b - 1, \\ \delta, & i = b, \\ (i - 1)\delta, & i = b + 1, \ldots, k; \end{cases}$$

$$y_{i\ell} = E[Y_{i\ell n}] = \begin{cases} -\varepsilon_\ell, & i = 1, 2, \ldots, b, \\ \varepsilon_\ell, & i = b + 1, \ldots, k \text{ and } \ell = 1, 2, \ldots, v, \\ -\varepsilon_\ell, & i = b + 1, \ldots, k \text{ and } \ell = v + 1, v + 2, \ldots, s. \end{cases}$$
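The MIM and DM mean configurations can be generated programmatically, which may help when reproducing the experimental setup. The sketch below is an illustration with hypothetical function names; it assumes a common tolerance `eps` for all constraints (as in the experiments, where every ε_ℓ equals δ) and constraint levels of zero, so a system is feasible when all of its secondary means are nonpositive.

```python
def mim_means(k, b, s, v, delta, eps):
    """Primary means x[i] and secondary means y[i][l] for the MIM configuration.

    Systems are 1-based in the text; list index i-1 holds system i.
    """
    x = [(i - 1) * delta for i in range(1, k + 1)]
    y = []
    for i in range(1, k + 1):
        if i <= b:                       # feasible on every constraint
            y.append([-(b - i + 1) * eps] * s)
        else:                            # first v constraints violated
            y.append([(i - b) * eps] * v + [-(i - b) * eps] * (s - v))
    return x, y


def dm_means(k, b, s, v, delta, eps):
    """Primary and secondary means for the DM configuration."""
    x = []
    for i in range(1, k + 1):
        if i < b:
            x.append(0.0)
        elif i == b:
            x.append(delta)
        else:
            x.append((i - 1) * delta)
    y = [[-eps] * s if i <= b else [eps] * v + [-eps] * (s - v)
         for i in range(1, k + 1)]
    return x, y


def best_feasible(x, y):
    """1-based index of the feasible system with the largest primary mean."""
    feasible = [i for i, row in enumerate(y) if all(m <= 0 for m in row)]
    return max(feasible, key=lambda i: x[i]) + 1
```

In both configurations, `best_feasible` should return b, while the system with the largest primary mean overall is system k whenever b < k.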

We also examine different variance configurations to test the robustness of the procedures when the relative difficulty of feasibility check and comparison varies. These configurations involve low (L) and high (H) variances $\sigma^2_{x_i}$ and $\sigma^2_{y_{i\ell}}$ of the primary and secondary performance measures. For simplicity, all secondary performance measures ℓ = 1, 2, ..., s are assigned identical variances. High variance results in either $\sigma^2_{x_i} = 5$ or $\sigma^2_{y_{i\ell}} = 5$, whereas low variance causes $\sigma^2_{x_i} = 1$ or $\sigma^2_{y_{i\ell}} = 1$. For our experiments, we consider three variance configurations: low $\sigma^2_{x_i}$ and low $\sigma^2_{y_{i\ell}}$ (L/L), high $\sigma^2_{x_i}$ and low $\sigma^2_{y_{i\ell}}$ (H/L), and low $\sigma^2_{x_i}$ and high $\sigma^2_{y_{i\ell}}$ (L/H). Low variances produce decisions quickly for both feasibility check and comparison. Experimental results with different variances across all performance measures are reported in the Online Appendix.

Practically, correlation across primary and secondary performance measures should be expected, but Andradottir and Kim [2010] show that such correlation does not significantly affect the results. Hence, we do not revisit the topic in this article, and we obtain primary and secondary performance measure samples independently for all systems. For the explicit effect of the correlation between primary and secondary performance measures, see Hunter et al. [2011].

Similarly, Batur and Kim [2010] show that correlation across only secondary performance measures does not largely affect the performance of the feasibility check procedure $F^I_B$. However, strong negative correlation across secondary performance measures can induce faster completion times in $F^I_A$, while strong positive correlation reduces the effectiveness of the aggregate constraint. We expect that similar conclusions would be found here. Therefore, we do not address this topic and assume that secondary performance measure samples are independent of one another.

Table I. Average Number of Required Observations under the MIM Configuration with k = 101 Systems, s = 2 Constraints, b Feasible Systems, and v = 1 Infeasible Constraints for the HAK(B) Procedure with the Given Ratio of α1 to α2

        L/L Variance Configuration    H/L Variance Configuration    L/H Variance Configuration
α1/α2   b = 10    b = 51    b = 90    b = 10    b = 51    b = 90    b = 10    b = 51    b = 90
10      3375      3835      3930      7881      12540     14050     8922      10263     9411
8       3336      3760      3869      7655      12157     13563     8956      10320     9434
6       3289      3690      3785      7379      11664     12998     9023      10366     9495
4       3243      3599      3687      7042      11085     12288     9125      10500     9620
2       3198      3498      3570      6493      10166     11280     9422      10852     9924
1       3197      3453      3506      6149      9523      10553     9894      11439     10441
1/2     3259      3483      3515      5939      9109      10037     10620     12287     11189
1/4     3371      3571      3593      5828      8857      9769      11550     13409     12206
1/6     3456      3651      3661      5815      8796      9692      12185     14192     12897
1/8     3528      3724      3719      5810      8788      9651      12696     14785     13433
1/10    3583      3783      3774      5847      8773      9656      13084     15258     13867

Note: The best allocation is shown in bold, and the recommended allocation is shown in a box.

Additionally, the effects of correlation across systems should be considered. Unless stated otherwise, we will consider independent systems, but CRN will be examined in Section 6.4.

5.1.2. Error Allocation for Sequential Procedures. We provide two tables addressing error allocation for HAK(B) in the MIM configuration with k = 101 and v = 1. Table I displays the average number of required observations for different error allocations as we change the number of feasible systems b while holding all other configuration settings steady. We consider a wide set of allocations, expressed by the ratio of α1 to α2. Table I shows that in the H/L case where comparison is relatively difficult, error should be shifted toward α2 for efficient selection. Similarly, in the L/H case where the feasibility check is relatively difficult, additional error should be shifted toward α1 for best performance. In all cases, the best error allocation is relatively insensitive to the number of feasible systems b. Although the best allocation changes from case to case, the number of required observations tends to be flat around the best allocation, and the 1:1 allocation always appears to have performance relatively close to the best.
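To make "relatively close to the best" concrete, the following sketch computes, for each column of Table I, how many more observations the 1:1 allocation requires than the best allocation in that column. The data are the Table I entries; the variable and function names are hypothetical and the calculation is only an illustration of the comparison, not part of the original study.

```python
# Rows follow the alpha1/alpha2 ratios of Table I; columns are
# (L/L, H/L, L/H) x (b = 10, 51, 90) in table order.
RATIOS = ["10", "8", "6", "4", "2", "1", "1/2", "1/4", "1/6", "1/8", "1/10"]
TABLE_I = [
    [3375, 3835, 3930, 7881, 12540, 14050, 8922, 10263, 9411],
    [3336, 3760, 3869, 7655, 12157, 13563, 8956, 10320, 9434],
    [3289, 3690, 3785, 7379, 11664, 12998, 9023, 10366, 9495],
    [3243, 3599, 3687, 7042, 11085, 12288, 9125, 10500, 9620],
    [3198, 3498, 3570, 6493, 10166, 11280, 9422, 10852, 9924],
    [3197, 3453, 3506, 6149, 9523, 10553, 9894, 11439, 10441],
    [3259, 3483, 3515, 5939, 9109, 10037, 10620, 12287, 11189],
    [3371, 3571, 3593, 5828, 8857, 9769, 11550, 13409, 12206],
    [3456, 3651, 3661, 5815, 8796, 9692, 12185, 14192, 12897],
    [3528, 3724, 3719, 5810, 8788, 9651, 12696, 14785, 13433],
    [3583, 3783, 3774, 5847, 8773, 9656, 13084, 15258, 13867],
]


def excess_over_best(table, ratios, chosen="1"):
    """Relative excess of the chosen ratio over the column-wise minimum."""
    row = table[ratios.index(chosen)]
    n_cols = len(row)
    best = [min(r[c] for r in table) for c in range(n_cols)]
    return [row[c] / best[c] - 1.0 for c in range(n_cols)]


excess = excess_over_best(TABLE_I, RATIOS)
worst_case = max(excess)  # largest penalty of the 1:1 rule within Table I
```

Within Table I, the 1:1 allocation is itself the best in the three L/L columns, and its largest penalty is about 11.5% (L/H, b = 51).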

We present Table II, where the number of feasible systems and parameters are fixed but the number of constraints, s, varies. In this table, the larger numbers of constraints s tend to require more error devoted to feasibility check (α1) for best performance under the L/L configuration. In the H/L and L/H configurations, it is again advisable to allow more error for comparison and feasibility check, respectively, but there is no strong effect of s. As in Table I, we see that a 1:1 ratio is efficient for all cases, especially in the L/L variance configuration. Table II also illustrates that the cost of additional constraints, measured by the number of required observations, grows sublinearly with respect to s within each variance configuration and α1/α2 ratio. This is a reasonable conclusion, as the time to complete a total feasibility check is at worst the maximum (a sublinear function) of the times to check each of the secondary performance measures.

Our experimental results suggest that for sequential procedures, an allocation rule that distributes error evenly between the feasibility check and comparison works well. Although it may make sense to focus on a particular configuration in specific settings, one may not know in advance the relative difficulty of feasibility check versus comparison. The 1:1 rule is fairly robust to differing numbers of constraints, numbers of feasible systems, and variance configurations. The displayed allocations can require as many as 63% more observations than the best, but our suggested error allocation requires at most 14% more observations among all configurations we tested. We observe similar results in the DM configuration (see Tables A.1 and A.2 in the Online Appendix), with the main difference being that the dependence of the best error allocation on b is stronger under the DM configuration (i.e., for large b, more systems will be found to be feasible, requiring a higher α2 to compare them efficiently).

Table II. Average Number of Required Observations under the MIM Configuration with k = 101 Systems, s Constraints, b = 51 Feasible Systems, and v = 1 Infeasible Constraints for the HAK(B) Procedure with the Given Ratio of α1 to α2

        L/L Variance Configuration      H/L Variance Configuration      L/H Variance Configuration
α1/α2   s = 1   s = 2   s = 4   s = 8   s = 1   s = 2   s = 4   s = 8   s = 1   s = 2   s = 4   s = 8
10      3760    3835    3941    4110    12496   12540   12617   12710   8303    10263   12567   15169
8       3688    3760    3877    4075    12116   12157   12251   12243   8329    10320   12613   15215
6       3611    3690    3819    4017    11609   11664   11777   11850   8368    10366   12715   15312
4       3519    3599    3744    3973    10995   11085   11126   11191   8461    10500   12860   15495
2       3397    3498    3677    3948    10090   10166   10215   10340   8745    10852   13280   15965
1       3326    3453    3670    4002    9471    9523    9572    9713    9217    11439   13967   16754
1/2     3321    3483    3748    4134    9029    9109    9193    9290    9929    12287   14945   17881
1/4     3365    3571    3900    4350    8793    8857    8957    9051    10879   13409   16248   19384
1/6     3414    3651    4022    4508    8738    8796    8905    9038    11533   14192   17137   20391
1/8     3463    3724    4115    4640    8690    8788    8901    9007    12029   14785   17847   21194
1/10    3502    3783    4202    4743    8702    8773    8910    9010    12440   15258   18394   21847

Table III. Average Number of Required Observations under the MIM Configuration with k = 101 Systems, s = 2 Constraints, b Feasible Systems, and v = 1 Infeasible Constraints for the HAK+(B) Procedure with the Given Ratio of β1 to β2

        L/L Variance Configuration    H/L Variance Configuration    L/H Variance Configuration
β1/β2   b = 10    b = 51    b = 90    b = 10    b = 51    b = 90    b = 10    b = 51    b = 90
10      4341      4836      4831      11973     15575     16184     9599      11444     10661
8       4274      4744      4732      11634     15077     15650     9533      11332     10545
6       4201      4638      4622      11188     14463     14974     9473      11240     10439
4       4095      4484      4487      10558     13647     14079     9386      11083     10265
2       3927      4251      4246      9636      12252     12612     9240      10857     10012
1       3762      4040      4033      8736      11014     11244     9101      10625     9771
1/2     3629      3857      3853      7920      9871      10035     9024      10445     9572
1/4     3771      4027      4014      8020      10012     10155     10031     11651     10588
1/6     3864      4142      4128      8091      10111     10266     10707     12401     11239
1/8     3934      4221      4201      8115      10181     10382     11159     12962     11709
1/10    3984      4296      4266      8166      10250     10407     11554     13426     12112

5.1.3. Error Allocation for Simultaneous Procedures. In this section, we consider the simultaneously running HAK+(B) procedure. Here we seek efficient and robust choices of β1 and β2. As in Section 5.1.2, we focus on performance, measured by the number of required observations, as the ratio of the two parameters changes.

Table III shows the average number of needed observations for a configuration with k = 101 systems, s = 2 constraints, v = 1 infeasible constraint for infeasible systems, and a varying number of feasible systems. We see that a ratio of β1/β2 = 1/2 is the best in all scenarios. This result is analogous to our findings for HAK(B), as β1 = β2/s corresponds to approximately equivalent error allocation for feasibility check and comparison. Unlike for sequential procedures, the optimal error allocation is quite robust to b and different variance configurations.

Table IV. Average Number of Required Observations under the MIM Configuration with k = 101 Systems, s Constraints, b = 51 Feasible Systems, and v = 1 Infeasible Constraints for the HAK+(B) Procedure with the Given Ratio of β1 to β2

        L/L Variance Configuration      H/L Variance Configuration      L/H Variance Configuration
β1/β2   s = 1   s = 2   s = 4   s = 8   s = 1   s = 2   s = 4   s = 8   s = 1   s = 2   s = 4   s = 8
10      4311    4836    5454    6128    13845   15575   17423   19421   9252    11444   13965   16624
8       4228    4744    5335    6013    13443   15077   16864   18780   9158    11332   13788   16473
6       4127    4638    5225    5898    12840   14463   16300   18171   9040    11240   13712   16416
4       3991    4484    5060    5721    12041   13647   15329   17132   8899    11083   13544   16232
2       3774    4251    4812    5450    10816   12252   13875   15607   8667    10857   13301   16012
1       3589    4040    4574    5198    9659    11014   12521   14191   8480    10625   13066   15782
1/2     3716    3857    4363    4967    9762    9871    11286   12851   9447    10445   12863   15566
1/4     3877    4027    4183    4758    9892    10012   10149   11616   10548   11651   12720   15356
1/6     3969    4142    4302    4646    9999    10111   10258   10932   11222   12401   13511   15264
1/8     4055    4221    4405    4578    10040   10181   10336   10488   11709   12962   14124   15211
1/10    4113    4296    4473    4655    10109   10250   10399   10543   12122   13426   14607   15723

To test the performance of β1 = β2/s as the number of constraints increases, again we use k = 101 systems, 51 feasible systems, and one infeasible constraint for each infeasible system. Table IV shows that for s constraints, the appropriate allocation is β1/β2 = 1/s. So, again, even allocation between feasibility check and comparison is preferable. As for HAK(B), we see sublinear growth in the number of required observations as the number of constraints increases.
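The β1/β2 = 1/s pattern can be checked directly against the tabulated values. The short sketch below uses only the L/L columns of Table IV (the H/L and L/H columns can be checked the same way); the names are hypothetical and the snippet is meant as an illustrative consistency check rather than part of the original analysis.

```python
from fractions import Fraction

RATIOS = [Fraction(10), Fraction(8), Fraction(6), Fraction(4), Fraction(2),
          Fraction(1), Fraction(1, 2), Fraction(1, 4), Fraction(1, 6),
          Fraction(1, 8), Fraction(1, 10)]
S_VALUES = [1, 2, 4, 8]

# L/L columns of Table IV: one row per ratio, one column per s.
TABLE_IV_LL = [
    [4311, 4836, 5454, 6128],
    [4228, 4744, 5335, 6013],
    [4127, 4638, 5225, 5898],
    [3991, 4484, 5060, 5721],
    [3774, 4251, 4812, 5450],
    [3589, 4040, 4574, 5198],
    [3716, 3857, 4363, 4967],
    [3877, 4027, 4183, 4758],
    [3969, 4142, 4302, 4646],
    [4055, 4221, 4405, 4578],
    [4113, 4296, 4473, 4655],
]


def best_ratio(table, col):
    """Ratio beta1/beta2 with the fewest observations in the given column."""
    costs = [row[col] for row in table]
    return RATIOS[costs.index(min(costs))]


best = {s: best_ratio(TABLE_IV_LL, c) for c, s in enumerate(S_VALUES)}
```

For each s in the table, `best[s]` equals `Fraction(1, s)`, in line with the recommended allocation β1 = β2/s.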

In the simultaneous procedure, HAK+(B), poor choices cost at most 88% more observations than the optimal in Tables III and IV. As in Section 5.1.2, we observe similar results in the DM configuration (see Tables A.3 and A.4 in the Online Appendix; however, Table A.3 shows that the number of feasible systems b does not affect optimal allocation much in HAK+).

Ultimately, without knowing any properties of the systems ahead of time, an allocation that splits error evenly between feasibility check and comparison is relatively robust to the various possible configurations (especially in the L/L variance configuration). This allocation takes slightly different forms in sequential and simultaneous procedures. When implementing the recommended ratio into our simultaneous procedures, we also note that β1 = β2/s corresponds to e = 1, a special case that leads to easily solvable valid values of β1 and β2 (see Section 4).

5.1.4. Approximate Asymptotic Analysis. To further study the optimal error allocation, we analyze the asymptotic total number of observations for HAK and HAK+ as α approaches zero. We employ heuristic arguments similar to those of Perng [1969]. Let $T^C_{ij}$ be the number of observations needed to compare systems i and j and $T^F_{i\ell}$ be the number of observations needed to decide on the feasibility of system i on constraint ℓ. Define $\bar{X}_i(r) = \frac{1}{r}\sum_{n=1}^{r} X_{in}$ and $\bar{Y}_{i\ell}(r) = \frac{1}{r}\sum_{n=1}^{r} Y_{i\ell n}$. We consider $F^I_B$ for the feasibility check and perform analysis under the assumption of known variances.

As α → 0, the following happens:

(1) A procedure in consideration stops making mistakes, both on feasibility and comparison.

(2) Large numbers of observations are needed to make a decision, and hence $\bar{X}_i(r)$ and $\bar{Y}_{i\ell}(r)$ behave like $x_i$ and $y_{i\ell}$, respectively. An infeasible constraint is found to be infeasible when

$$y_{i\ell} - q_\ell \ge \frac{1}{r} R(r; \varepsilon_\ell, h_1^2, \sigma^2_{y_{i\ell}}) = \frac{\eta_1 (n_0 - 1)\sigma^2_{y_{i\ell}}}{r\varepsilon_\ell} - \frac{\varepsilon_\ell}{2}.$$

A similar inequality can be derived for feasible constraints. Hence,

$$T^F_{i\ell} \approx \max\left\{ n_0, \frac{\eta_1 (n_0 - 1)\sigma^2_{y_{i\ell}}}{\varepsilon_\ell (|y_{i\ell} - q_\ell| + \varepsilon_\ell/2)} \right\}.$$

Moreover, the number of observations needed to declare a system feasible is $\max_\ell T^F_{i\ell}$ because all constraints need to be declared feasible. On the other hand, an infeasible system stops taking observations if one of the constraints is found infeasible and thus takes only $\min_{\{\ell : y_{i\ell} > q_\ell\}} T^F_{i\ell}$. Let $T^F_i$ be the number of observations needed to detect the feasibility of system i. Then

$$T^F_i = \begin{cases} \max_\ell T^F_{i\ell}, & \text{if system } i \text{ is feasible}, \\ \min_{\{\ell : y_{i\ell} > q_\ell\}} T^F_{i\ell}, & \text{otherwise}. \end{cases}$$

Similarly, for KN used in HAK+ and MDR, a comparison decision between systems i and j would be made when

$$|x_i - x_j| \ge \frac{1}{r} R(r; \delta, h_2^2, \sigma^2_{x_{ij}}) = \frac{\eta_2 (n_0 - 1)\sigma^2_{x_{ij}}}{r\delta} - \frac{\delta}{2}.$$

Hence

$$T^C_{ij} \approx \max\left\{ n_0, \frac{\eta_2 (n_0 - 1)\sigma^2_{x_{ij}}}{\delta (|x_i - x_j| + \delta/2)} \right\}.$$

For SSM used in HAK, $T^C_{ij}$ depends on the number of observations already available (i.e., $T^F_i$). We approximate $T^C_{ij}$ of SSM by that of KN.

(3) Consider HAK. In our configurations, infeasible systems b + 1, ..., k are eliminated at $T^F_i$ in the feasibility check phase. Feasible systems 1, ..., b are declared feasible at $T^F_i$ and inferior to system b at $T^C_{ib}$. Thus, systems 1, ..., b take additional observations in the comparison phase only if more observations are needed for comparison (i.e., $T^C_{ib} > T^F_i$). Then, the approximate total number of observations for HAK is

$$T_{HAK} \approx \sum_{i=b+1}^{k} T^F_i + \sum_{i=1}^{b-1} \max\left( T^F_i, \min_{j \in \{i+1, \ldots, b\}} T^C_{ij} \right) + \max\left( T^F_b, T^C_{1b}, \ldots, T^C_{b-1,b} \right).$$

(4) Consider HAK+. In our configurations, infeasible systems b + 1, ..., k are eliminated at $T^F_i$. Feasible systems 1, ..., b − 1 are eliminated when the system is found to be inferior to, say, system j and system j is found feasible. Thus, the approximate total number of observations for HAK+ is

$$T_{HAK+} \approx \sum_{i=b+1}^{k} T^F_i + \sum_{i=1}^{b-1} \min_{j \in \{i+1, \ldots, b\}} \max\left( T^C_{ij}, T^F_j \right) + \max\left( T^F_b, T^C_{1b}, \ldots, T^C_{b-1,b} \right).$$
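The approximations in items (2) through (4) can be evaluated numerically for a given configuration. The following sketch is an illustration under stated assumptions rather than the authors' code: function names are hypothetical, η1 and η2 are taken as given inputs, all constraints share one threshold `q`, one tolerance `eps`, and one variance `var_y`, every pairwise difference has variance `var_xdiff`, and systems are ordered as in the MIM and DM configurations (systems 1, ..., b feasible with system b best, and b ≥ 1).

```python
def t_feas_constraint(y, q, eps, var_y, eta1, n0):
    """Approximate time to decide feasibility of one constraint."""
    return max(n0, eta1 * (n0 - 1) * var_y / (eps * (abs(y - q) + eps / 2.0)))


def t_feas(y_row, q, eps, var_y, eta1, n0):
    """T^F_i: max over constraints if feasible, min over violated ones otherwise."""
    times = [t_feas_constraint(y, q, eps, var_y, eta1, n0) for y in y_row]
    violated = [t for t, y in zip(times, y_row) if y > q]
    return min(violated) if violated else max(times)


def t_comp(xi, xj, delta, var_xdiff, eta2, n0):
    """T^C_ij: approximate time to separate systems i and j."""
    return max(n0, eta2 * (n0 - 1) * var_xdiff
               / (delta * (abs(xi - xj) + delta / 2.0)))


def asymptotic_totals(x, y, b, q, delta, eps, var_xdiff, var_y, eta1, eta2, n0):
    """Return (T_HAK, T_HAK+) for systems stored at 0-based indices 0..k-1."""
    k = len(x)
    tf = [t_feas(y[i], q, eps, var_y, eta1, n0) for i in range(k)]

    def tc(i, j):
        return t_comp(x[i], x[j], delta, var_xdiff, eta2, n0)

    infeasible = sum(tf[b:])                       # systems b+1, ..., k
    best = max([tf[b - 1]] + [tc(i, b - 1) for i in range(b - 1)])
    hak = infeasible + best + sum(
        max(tf[i], min(tc(i, j) for j in range(i + 1, b)))
        for i in range(b - 1))
    hak_plus = infeasible + best + sum(
        min(max(tc(i, j), tf[j]) for j in range(i + 1, b))
        for i in range(b - 1))
    return hak, hak_plus
```

Evaluating `asymptotic_totals` over a grid of error allocations (each allocation implies its own η1 and η2) is one way to locate the allocation with the smallest approximate total.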

The optimal allocation is the one that produces the smallest $T_{HAK}$ or $T_{HAK+}$.

The asymptotic total number of observations under the mean and variance configurations defined in Section 5.1.1 are given in the Online Appendix and match well with the experimental results. In addition, we tested the performance of HAK and HAK+ when the primary and secondary performance measures all have different variances. More specifically, we considered s = 2 and three variances 1 (low), 3 (medium), and 5 (high), which yields six different variance configurations for $(\sigma^2_{x_i}, \sigma^2_{y_{i1}}, \sigma^2_{y_{i2}})$. Both the asymptotic analysis and additional experiments support that an allocation that splits error evenly between feasibility check and comparison is a robust choice.

5.2. Considering Common Random Numbers

In this section, we discuss the use of CRN in constrained R&S procedures to improve the efficiency of comparison. CRN could be useful with our HAK+ and MDR procedures, proven to be valid under correlation across systems. This section also suggests why the use of CRN within procedures such as HAK that compare systems with unequal sample sizes may not provide valid PCS.

Fig. 1. Sample paths of the difference of equal sums $\sum_{n=1}^{r}(X_{1n} - X_{2n})$ and unequal sums $\sum_{n=1}^{r} X_{1n} - \frac{r}{100}\sum_{n=1}^{100} X_{2n}$ under $\rho_x = 0.95$.

In Section 5.2.1, we take a closer look at a difficulty in comparing correlated systems with uneven sample sizes. Section 5.2.2 provides an analysis of the required correlation to make CRN advantageous in constrained R&S.

5.2.1. Decisions under Correlation. Although the independent simulation of systems is often suitable, just a small amount of positive correlation can significantly improve the efficiency of R&S procedures. This positive correlation will reduce the variance of the difference of samples from two systems, allowing the comparison of the systems to be completed sooner. Usually, this positive correlation is created through the use of CRN [Law and Kelton 2000]. The increase in efficiency comes at a cost, in that some comparison procedures may not make valid decisions for correlated systems, and Bonferroni bounds are used to ensure validity of the selection (see Theorems 4.8 and 4.9).

Two of our R&S procedures for multiple constraints (i.e., HAK+ and MDR) are valid with correlation across systems, as is shown in Section 4. The reason is that HAK+ and MDR always compare systems at equal sample sizes, and they do so with the KN procedure. The KN procedure makes statistically valid decisions for both independent and correlated systems [Kim and Nelson 2001], and thus satisfies Assumption 6.

However, procedures that compare systems at unequal sample sizes, like HAK and its underlying selection procedure SSM of Pichitlamken et al. [2006], may not provide adequate PCS results under correlation. The problem lies with obtaining good estimates of variability under correlation. Fully sequential procedures, like SSM, use the quantity

$$S^2_{X_{ij}} = \frac{1}{n_0 - 1} \sum_{n=1}^{n_0} \left( X_{in} - X_{jn} - [\bar{X}_i - \bar{X}_j] \right)^2$$

to estimate the variance of the difference between two systems, i and j, where $\bar{X}_i$ and $\bar{X}_j$ are the first-stage sample means for systems i and j, respectively. This variance estimate allows the procedure to utilize the benefits of positive correlation but only accurately represents the variability of the difference of sums under a common sample size r: $M(r) = \sum_{n=1}^{r}(X_{in} - X_{jn})$. Under unequal sample sizes $r_i < r_j$ with $r = \min(r_i, r_j)$ and high correlation, the statistic $U(r) = \frac{r}{r_i}\sum_{n=1}^{r_i} X_{in} - \frac{r}{r_j}\sum_{n=1}^{r_j} X_{jn}$ used by SSM can have a much higher variance than computed by $S^2_{X_{ij}}$, as the variability is driven by the lagging system's data points.

Let $\rho_x$ denote the correlation across primary performance measure samples. Figure 1 draws sample paths of the difference of sums with equal sample sizes M(r) and unequal sample sizes U(r) for two systems with equal variance when $E[X_{1n}] = 0.25$, $E[X_{2n}] = 0$, and the correlation is high, namely $\rho_x = 0.95$. It is clear that the unequal sums experience a much higher variability. The underestimation in $S^2_{X_{ij}}$ of the variability of the unequal sums U(r) could lead to incorrect decisions. Thus, without adjustments to the comparison algorithm within HAK, it is unclear whether one can use this procedure to compare correlated systems. Lesnevski et al. [2007] also raise a similar issue about the use of CRN when unequal sample sizes across systems are available in screening. Further study of this topic falls outside the scope of this article, but some methods are presented in Healey et al. [2014], who propose heuristic modifications that allow procedures with unequal sample sizes such as HAK to benefit from CRN while satisfying the desired level of PCS.
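The gap between the two statistics can be quantified with a short calculation. The formulas in the sketch below are not taken from the article; they are derived here under the added assumptions that the pairs $(X_{in}, X_{jn})$ are iid across n, both systems have a common variance `sigma2`, the correlation within a pair is `rho`, and the lagging system has $r_i = r$ observations while the other has $r_j \ge r$. The function names are hypothetical.

```python
def var_equal_sums(r, sigma2, rho):
    """Var of M(r) = sum_{n<=r} (X_in - X_jn): each difference has
    variance 2*sigma2*(1 - rho), and the r differences are independent."""
    return 2.0 * r * sigma2 * (1.0 - rho)


def var_unequal_sums(r, r_j, sigma2, rho):
    """Var of U(r) = sum_{n<=r} X_in - (r/r_j) * sum_{n<=r_j} X_jn.

    Terms: r*sigma2 from system i, (r**2/r_j)*sigma2 from system j, and a
    covariance term -2*(r/r_j)*r*rho*sigma2 from the r shared indices.
    """
    return r * sigma2 * (1.0 + r / r_j - 2.0 * rho * r / r_j)
```

With `rho = 0.95`, `r = 10`, and `r_j = 100` (the sample size used for the second system in Figure 1), the variance of the unequal sums is roughly nine times that of the equal sums, while the two formulas coincide when `r_j == r`.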

5.2.2. Required Correlation. Having shown that HAK+ and MDR make valid decisions under correlation, we now look at how much correlation is necessary to overcome the conservative Bonferroni bound required for proving the validity of these procedures. The main difference between the independent and correlated cases in HAK+ and MDR lies in the smaller β1 and β2 needed when CRN are used (see (2) and (5)). If the positive correlation is not strong enough, our valid procedures with correlated systems may require more observations than with independently simulated systems.

To analyze the difference between the independent and correlated systems, we let the means be in the DM configuration with b = k (so that $y_{i\ell} = -\varepsilon_\ell$ for all i = 1, ..., k and ℓ = 1, ..., s and $x_k = x_i + \delta$ for i = 1, ..., k − 1). We also set the variance of the difference $X_{in} - X_{jn}$ to the same value $\sigma^2_X$ for all pairs of systems i ≠ j and set an equal variance $\sigma^2_{Y_\ell}$ across systems for all secondary performance measures $Y_{i\ell n}$. Thus, comparison and feasibility check for different systems will be based on the same expected continuation regions in our procedures (i.e., a triangular region defined by +R(·) and −R(·)), and we can focus on two systems. One measure of the relative difficulty of constrained R&S would be a weighted sum of the expected area of the continuation regions for feasibility check and comparison, as a smaller sum would associate with quicker completion times than a larger sum. We understand that this measure is not perfect. However, it can be calculated without information on the feasibility of individual system constraints and provides useful insights on the magnitude of the required correlation. The required correlation $\rho_x$ is given in (6). The details of the derivation are provided in the Online Appendix.

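$$
\rho_x > 1 - \sqrt{\frac{\eta_2^{2}}{{\eta_2'}^{2}} + \frac{\dfrac{\eta_1^{2} - {\eta_1'}^{2}}{s}\displaystyle\sum_{\ell=1}^{s}\dfrac{\sigma_{Y_\ell}^{4}}{\epsilon_\ell^{3}}}{{\eta_2'}^{2}\,\dfrac{\sigma_X^{4}}{\delta^{3}}}}. \tag{6}
$$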

For example, if we let $n_0$ = 10, 1 − α = 0.90, k = 7, which correspond to worst-case parameter choices in Kim and Nelson [2001] for unconstrained selection, and s = 2, $\beta_1 = \beta_2/s$, and $\sigma_X^4/\delta^3 = \sigma_{Y_\ell}^4/\epsilon_\ell^3$ for $\ell$ = 1, 2, then the resulting $\eta_1$, $\eta_2$, $\eta_1'$, $\eta_2'$ can be found. Using these settings, (6) shows that the weighted sum of the expected areas of the continuation regions under correlation becomes smaller than the corresponding weighted sum under independent sampling with just $\rho_x$ > 0.0127. Thus, very little correlation may be needed to produce quicker overall completion times for our HAK+ and MDR procedures under CRN. This is in part because the difference between correlated η’s and independent η’s is smaller in the procedures for constrained selection than in simple unconstrained selection due to the use of Bonferroni bounds to account for correlated primary and secondary performance measures.
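As a minimal sketch of how the threshold in (6) could be evaluated numerically, the function below computes the right-hand side of the inequality from user-supplied constants. The function name, argument names, and the example values are hypothetical; the η constants themselves come from (2) and (5), which are not reproduced here, so they are treated as inputs rather than computed.

```python
import math


def required_correlation(eta1, eta2, eta1_crn, eta2_crn, feas_ratios, comp_ratio):
    """Right-hand side of (6): the value that rho_x must exceed.

    eta1, eta2         -- eta constants under independent sampling
    eta1_crn, eta2_crn -- eta' constants under CRN
    feas_ratios        -- list of sigma_{Y_l}^4 / eps_l^3 for l = 1, ..., s
    comp_ratio         -- sigma_X^4 / delta^3
    Returns None when the term under the square root is not positive,
    in which case no correlation below 1 satisfies the inequality.
    """
    s = len(feas_ratios)
    mean_feas = sum(feas_ratios) / s
    radicand = (eta2 ** 2) / (eta2_crn ** 2) + (
        (eta1 ** 2 - eta1_crn ** 2) * mean_feas / (eta2_crn ** 2 * comp_ratio)
    )
    if radicand <= 0:
        return None
    return 1.0 - math.sqrt(radicand)


# Illustrative (made-up) constants, not the values used in the article.
print(required_correlation(1.0, 1.0, 1.0, 2.0, [3.0, 3.0], 3.0))
```

When the feasibility and comparison ratios are equal, as in the example setting of the text, `mean_feas / comp_ratio` is 1 and the threshold depends only on the four η constants.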



Table V. Observed PCS for k = 5 Systems with s = 5 Constraints, b = 3 Feasible Systems, and v = 1 Violated Constraints

                    DM                     MIM
            L/L     H/L     L/H     L/L     H/L     L/H
HAK(B)      0.983   0.973   0.994   0.990   0.996   0.995
HAK(A)      0.981   0.973   0.993   0.990   0.996   0.994
HAK+(B)     0.975   0.973   0.972   0.983   0.995   0.980
HAK+(A)     0.973   0.971   0.970   0.982   0.995   0.978
MDR(B)      0.975   0.973   0.974   0.984   0.995   0.980
MDR(A)      0.974   0.971   0.971   0.982   0.995   0.979

6. EXPERIMENTAL EVALUATION AND COMPARISON OF PROCEDURES

We now present experimental results to illustrate the comparative performance of our constrained R&S procedures. The experimental setup is as described in Section 5.1.1. The choice of b = (k + 1)/2 was shown to minimize PCS of simultaneously running procedures in Andradottir and Kim [2010], so we feature b = (k + 1)/2 throughout. We choose $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2/s$ as our error allocations, as featured in the previous section, selecting valid values of $\alpha_2$ or $\beta_2$.

In Section 6.1, we demonstrate the validity of our procedures empirically. Section 6.2 compares the procedures in terms of the number of required observations. Section 6.3 provides additional results to show how the number of required observations can depend on the number of constraints. Finally, we study the effectiveness of CRN in Section 6.4.

6.1. PCS

We are interested in inspecting the PCS for both valid and heuristic procedures with both types of feasibility checks, $F_B^I$ and $F_A^I$. In Table V, we present PCS results for a small number of systems (k = 5) with five constraints. We choose v = 1, because the feasibility check is easier when v is large. In addition, as k increases, R&S procedures usually become more conservative, so a setup with a small number of systems k and violated constraints v promises to be challenging in terms of validating PCS. We cover both the DM and MIM configurations with L/L, H/L, and L/H variances.

We note that the observed PCS for all procedures, valid or heuristic, lies above the nominal 0.95. The heuristic HAK can be conservative for both DM and MIM configurations. The simultaneously running procedures, HAK+ and MDR, tend to be less conservative, except in the H/L case where it is unlikely that HAK takes too many samples because feasibility check is dominated by hard comparison. There is a small decrease in PCS between the valid feasibility check $F_B^I$ and the heuristic version of $F_A^I$, but not enough to discourage use. In fact, the observed PCS of all of our experiments will lie above 0.95, so we will not feature PCS any further.

6.2. Required Number of Observations

We wish to compare the effectiveness of our procedures and feasibility check options in terms of the required number of observations. In Tables VI and A.21 (provided in the Online Appendix), we display the average number of required observations for all combinations of our procedures considering a large number of systems, k = 101, with s = 5 constraints. The number of violated constraints for each infeasible system is v = 1 in Table VI and v = 5 in Table A.21.

Tables VI and A.21 show that MDR outperforms HAK and HAK+ in all cases, documenting the desirable effects of dormancy. Moreover, HAK+ performs better than HAK in all cases, except for the MIM configuration under the L/L and H/L variance configurations where feasibility check is relatively easy. The biggest difference in performance is seen when feasibility check is hard, where MDR outperforms HAK and HAK+ by at least 30%, sometimes more.

Table VI. Average Number of Required Observations for k = 101 Systems with s = 5 Constraints, b = 51 Feasible Systems, and v = 1 Violated Constraints

                      DM                         MIM
            L/L      H/L      L/H       L/L     H/L      L/H
HAK(B)      27560    68856    131026    3769    9713     14845
HAK(A)      27536    68835    130921    3768    9713     14843
HAK+(B)     23552    67236    92782     4298    10343    13515
HAK+(A)     23523    67208    92631     4298    10343    13514
MDR(B)      21208    67146    60506     3563    9667     8639
MDR(A)      21179    67118    60360     3562    9667     8637
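The relative savings of MDR can be recomputed from the Table VI entries. The short sketch below does this for the $F_B^I$ rows; the dictionary layout and the function name `mdr_savings` are illustrative choices, and the numbers are copied from Table VI.

```python
# Average required observations from Table VI (k = 101, s = 5, b = 51, v = 1),
# rows HAK(B), HAK+(B), and MDR(B). Column order: L/L, H/L, L/H.
VARIANCES = ["L/L", "H/L", "L/H"]
TABLE_VI = {
    "DM": {
        "HAK": [27560, 68856, 131026],
        "HAK+": [23552, 67236, 92782],
        "MDR": [21208, 67146, 60506],
    },
    "MIM": {
        "HAK": [3769, 9713, 14845],
        "HAK+": [4298, 10343, 13515],
        "MDR": [3563, 9667, 8639],
    },
}


def mdr_savings(config, variance):
    """Fractional reduction of MDR relative to the better of HAK and HAK+."""
    j = VARIANCES.index(variance)
    rows = TABLE_VI[config]
    best_other = min(rows["HAK"][j], rows["HAK+"][j])
    return 1.0 - rows["MDR"][j] / best_other


for config in TABLE_VI:
    for variance in VARIANCES:
        print(f"{config} {variance}: {100 * mdr_savings(config, variance):.1f}%")
```

Comparing against the better of the two competitors gives the most conservative estimate of the savings; the L/H columns, where feasibility check is hard, show the largest reductions.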

The relative performance of the feasibility check options does not depend heavily on our procedures, HAK, HAK+, and MDR, but highly depends on the number of violated constraints. Under v = 1, the performance of procedures with $F_B^I$ is similar to that of procedures with $F_A^I$. This is expected, as aggregation is not very helpful when only one or two constraints are violated. This changes in Table A.21. $F_A^I$ is significantly superior in all cases when v = 5, and the savings over $F_B^I$ ranges from 5% to 40%, depending on the relative difficulty of the feasibility check.

The performance of the procedures across the tables also indicates that a constrained R&S problem with v = 5 is easier than when v = 1, requiring at least 5% fewer observations. When the feasibility check is relatively more difficult than comparison, this effect is more pronounced, with v = 5 requiring at least 15% fewer observations for HAK, 14% fewer observations for HAK+, and 23% fewer observations for MDR. The reason is that when the infeasible systems violate v = 5 constraints, the feasibility check ends as soon as the first of these constraints is found infeasible (the minimum of the five screening completion times). By contrast, if v = 1, the feasibility check can be ended only by the one infeasible constrained performance measure.
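The minimum-of-completion-times argument can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not part of the procedures in this article: each violated constraint is detected after an independent exponential time with a common mean, and comparison is ignored. The function name and parameters are hypothetical.

```python
import random


def mean_check_time(v, mean_time=100.0, reps=20000, seed=1):
    """Average time until the first of v violated constraints is detected.

    Toy model (an assumption, not the screening rule of the procedures):
    each violated constraint is detected after an independent exponential
    time with mean `mean_time`, and the check stops at the earliest one.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += min(rng.expovariate(1.0 / mean_time) for _ in range(v))
    return total / reps


print(mean_check_time(1))  # one violated constraint
print(mean_check_time(5))  # five violated constraints, earliest detection wins
```

Because the toy model assumes independent detection times and ignores the comparison phase, it overstates the effect relative to the 15% to 23% reductions reported above; it only shows the direction of the effect.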

6.3. Cost of Additional Constraints

In this section, we investigate the cost of additional constraints, as users may be interested in learning how many more (or fewer) observations would be needed to consider extra performance measures. In Section 5.1, a sublinear increasing trend in observations was seen as the number of constraints increases. In addition, Section 6.2 indicates that a large number of violated constraints v usually yields a quicker completion time than small v.

Therefore, our experiments consider the two cases, v ∈ {1, s}, for each number of constraints, s ∈ {1, . . . , 5}, so that all infeasible systems violate either only one constrained performance measure or every constrained performance measure. It is reasonable that most results will fall between these two cases. To increase the difference between the cases, we will implement $F_B^I$ when v = 1 and $F_A^I$ when v = s, as Table A.21 shows $F_A^I$ to be particularly efficient when v is high.

Figures 2 and 3 plot the required number of samples for each of our three procedures in the DM and MIM configurations, respectively, and under the L/L variance configuration, where comparison and feasibility check have similar difficulties. The two lines plotted show the necessary observations under the favorable case where v = s and $F_A^I$ is implemented and the more difficult case where v = 1 and $F_B^I$ is implemented. In the case when v = 1, we observe an increase in the number of required observations as s increases, but this increase is sublinear, as in Section 5.1.



Fig. 2. Average required number of observations as a function of the number of constraints for the DM with L/L configuration considering k = 101 systems with b = 51. The top line corresponds to $F_B^I$ under v = 1, whereas the bottom line corresponds to $F_A^I$ under v = s.

Fig. 3. Average required number of observations as a function of the number of constraints for the MIM with L/L configuration considering k = 101 systems with b = 51. The top line corresponds to $F_B^I$ under v = 1, whereas the bottom line corresponds to $F_A^I$ under v = s.

When v = s, for all of the procedures and configurations (except HAK+ in MIM), we actually see an initial decrease in the number of observations. This is due to a much faster feasibility check, as screening stops once the first infeasible performance measure is identified. Figures 2 and 3 feature a growing difference between the cases (up to 15%) as the number of constraints grows. Thus, the introduction of additional constraints can influence the performance of the algorithms significantly, but more constraints may not necessarily mean more samples (this conclusion is based on systems with no more than five constraints; additional study of systems with larger numbers of constraints would be desirable but is outside the scope of the current article).

It is noted that fully sequential selection procedures are useful up to 1,000 competing systems (e.g., see Nelson [2010]). In the presence of constraints, the total number of performance measures is k(s + 1), as opposed to k for unconstrained problems. This suggests that it is reasonable to expect that our procedures will be useful when k(s + 1) ≤ 1,000 (and we consider examples with k(s + 1) = 909 in Tables II and IV). However, the sublinear property of additional constraints supports that the proposed procedures in this article may still be useful when k(s + 1) > 1,000.

6.4. Common Random Numbers

We investigate the savings experienced through the use of CRN. As the positive correlation induced by CRN reduces the variances involved in comparison but not the feasibility check, we expect to see significant savings in the number of required samples, albeit smaller than that observed in the pure comparison of Kim and Nelson [2001]. We inspect use of CRN in MDR, which is valid under CRN.

Table VII. Average Number of Required Observations for k = 101 Systems with s = 5 Constraints, b = 51 Feasible Systems, v = 1 Violated Constraints, and Differing Levels of Correlation, $\rho_x$, across Systems. All procedures utilize $F_B^I$ for feasibility check.

                            DM                        MIM
MDR(B)^{ρx}       L/L      H/L      L/H      L/L     H/L     L/H
MDR(B)            21208    67146    60506    3564    9667    8638
MDR(B)^{0.00}     21308    68423    61730    3571    9834    8649
MDR(B)^{0.01}     21246    65712    61116    3541    9498    8593
MDR(B)^{0.02}     21149    65032    60246    3517    9407    8568
MDR(B)^{0.05}     20884    64251    60007    3482    9240    8538
MDR(B)^{0.10}     20260    62255    59494    3445    9162    8509
MDR(B)^{0.25}     19018    52086    57496    3299    7645    8330
MDR(B)^{0.50}     15794    37080    53647    3022    5659    8147
MDR(B)^{0.75}     13110    24254    50141    2855    3873    7852
MDR(B)^{0.90}     10860    15985    47500    2776    3060    7772
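The variance reduction that drives these savings follows from the identity Var(X_i − X_j) = σ_i² + σ_j² − 2ρσ_iσ_j, which reduces to 2σ²(1 − ρ) for equal variances. The sketch below checks this numerically under assumptions that are ours, not the article's: outputs are normal with equal variance, and the correlation is imposed directly through a shared normal component rather than through actual common random number streams. The function name is hypothetical.

```python
import math
import random


def diff_variance(rho, sigma=1.0, n=50000, seed=7):
    """Sample variance of X_i - X_j when corr(X_i, X_j) = rho.

    Assumes normal outputs with common standard deviation sigma; the
    correlation is built from a shared standard normal component z1.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x_i = sigma * z1
        x_j = sigma * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        diffs.append(x_i - x_j)
    mean = sum(diffs) / n
    return sum((d - mean) ** 2 for d in diffs) / (n - 1)


for rho in (0.0, 0.5, 0.9):
    # Theoretical value under these assumptions is 2 * sigma^2 * (1 - rho).
    print(rho, diff_variance(rho), 2.0 * (1.0 - rho))
```

Only the comparison step works with differences across systems, which is why the feasibility check does not benefit from this reduction.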

Recall that $\rho_x$ denotes the correlation across systems’ primary performance measure samples, which we will test at varying levels to measure the effects of CRN. We let the secondary performance measure samples be independent across systems. Practically, correlation will occur, and the values of $\beta_1$ and $\beta_2$ will be more conservative when considering correlated systems (rather than independent systems), and hence the use of CRN may not always be beneficial. However, as feasibility check procedures run separately for each system, the magnitude of correlation in secondary performance measure samples across systems should not have a major impact.

We present Tables VII and A.22 (in the Online Appendix) as examples of the experimental performance of the valid procedure MDR under varying levels of correlation across systems. The configuration tested in Table VII features s = 5 constraints with v = 1 violated constraints for infeasible systems, corresponding to a difficult feasibility check. Table A.22 displays results for systems with s = 5, but v = 5 violated constraints for infeasible systems. The first line for each procedure indicates the number of required observations when systems are independent under valid parameters chosen according to (2), whereas all other procedures experience some level of correlation $\rho_x$ as represented by a superscript, MDR(·)^{ρx}, and operate under valid parameters chosen as in (5) to account for possible correlation. In all cases, the results that improve on the independent case are shown in italics.

Table VII clearly shows that as $\rho_x$ increases, the number of required observations for MDR decreases. All correlations $\rho_x$ ≥ 0.02 show improvement, and utilizing CRN can provide substantial savings under high correlation, reducing the number of required samples by as much as 75% under hard comparison.
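These two observations can be verified directly from the Table VII entries. The sketch below copies the table values into plain Python lists; the variable and function names are illustrative.

```python
# Table VII, MDR with the F_B feasibility check.
# Column order: DM L/L, DM H/L, DM L/H, MIM L/L, MIM H/L, MIM L/H.
INDEPENDENT = [21208, 67146, 60506, 3564, 9667, 8638]
CRN = {
    0.00: [21308, 68423, 61730, 3571, 9834, 8649],
    0.01: [21246, 65712, 61116, 3541, 9498, 8593],
    0.02: [21149, 65032, 60246, 3517, 9407, 8568],
    0.05: [20884, 64251, 60007, 3482, 9240, 8538],
    0.10: [20260, 62255, 59494, 3445, 9162, 8509],
    0.25: [19018, 52086, 57496, 3299, 7645, 8330],
    0.50: [15794, 37080, 53647, 3022, 5659, 8147],
    0.75: [13110, 24254, 50141, 2855, 3873, 7852],
    0.90: [10860, 15985, 47500, 2776, 3060, 7772],
}


def savings(rho):
    """Fractional reduction in observations relative to independent sampling."""
    return [1.0 - c / i for c, i in zip(CRN[rho], INDEPENDENT)]


def smallest_beneficial_rho():
    """Smallest tested rho_x at which every configuration improves."""
    for rho in sorted(CRN):
        if all(x > 0 for x in savings(rho)):
            return rho
    return None


print(smallest_beneficial_rho())
print([round(x, 3) for x in savings(0.90)])
```

The largest reduction at $\rho_x$ = 0.90 occurs in the DM H/L column, the hard comparison case.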

In the v = s configurations, Table A.22 shows similar results. As the feasibility check for inferior systems requires fewer observations, we observe substantial savings over the independent case, as long as $\rho_x$ ≥ 0.1 in most cases. We note that the level of correlation needed to observe savings is similar when v = 1 and v = 5, except for the DM-H/L and MIM-L/L configurations. In the MIM-L/L configuration, many of the systems make feasibility and comparison decisions in the first stage of sampling (especially when v = 5), limiting the need and usefulness of CRN.

In summary, although MDR(B)^{0.00} performs worse than MDR in Tables VII and A.22 due to the use of valid parameter values for correlated systems being applied to independent systems, the observed positive correlation required to provide savings in the number of required samples is low, less than $\rho_x$ = 0.1 in most cases. This is consistent with our results in Section 5.2.2. Thus, CRN is an effective approach to improve the efficiency of simultaneously running constrained R&S procedures.

7. CONCLUSION

In this article, we present and analyze three fully sequential R&S procedures for finding the best simulated system that also satisfies constraints on multiple secondary performance measures. These procedures are combined with two valid feasibility check approaches, leading to six different methods for solving the general constrained R&S problem. We show that two of the procedures, HAK+ and MDR, are statistically valid for independent or correlated systems, whereas the third procedure, HAK, may be a good heuristic option for independent systems.

With regard to experimental design and implementation, we identify two major issues, namely error allocation and use of CRN. In our experimental results, we find that allocating error evenly between feasibility check and comparison performs well. We also show that CRN can effectively reduce the number of observations required for comparison for procedures that compare systems with equal sample sizes, even under a small amount of correlation.

Our experimental results also show that the number of required observations grows at most sublinearly as the number of constraints increases, but in some cases, the number of observations decreases due to an easier feasibility check. All procedures have their advantages; however, our results suggest that MDR implemented with the $F_A^I$ feasibility check may overall be the best choice.

ELECTRONIC APPENDIX

The electronic appendix for this article can be accessed in the ACM Digital Library.

REFERENCES

Sigrun Andradottir and Seong-Hee Kim. 2010. Fully sequential procedures for comparing constrained systems via simulation. Naval Research Logistics 57, 5, 403–421.

Demet Batur and Seong-Hee Kim. 2010. Finding feasible systems in the presence of constraints on multiple performance measures. ACM Transactions on Modeling and Computer Simulation 20, 3, Article 13.

Chun-Hung Chen. 1996. A lower bound for the correct subset-selection probability and its application to discrete-event system simulations. IEEE Transactions on Automatic Control 41, 8, 1227–1231.

Hsiao-Chang Chen, Chun-Hung Chen, and Enver Yucesan. 2000. Computing efforts allocation for ordinal optimization and discrete event simulation. IEEE Transactions on Automatic Control 45, 5, 960–964.

Stephen E. Chick. 2006. Subjective probability and Bayesian methodology. In Shane G. Henderson and Barry L. Nelson (Eds.), Handbooks in Operations Research and Management Science (Vol. 13). Elsevier, 225–257.

Stephen E. Chick and Koichiro Inoue. 2001. New procedures to select the best simulated system using common random numbers. Management Science 47, 8, 1133–1149.

Edward J. Dudewicz and Siddhartha R. Dalal. 1975. Allocation of observations in ranking and selection with unequal variances. Sankhya 37, 1, 28–78.

Peter I. Frazier and Aleksandr M. Kazachkov. 2011. Guessing preferences: A new approach to multi-attribute ranking and selection. In Proceedings of the Winter Simulation Conference (WSC’11). 4324–4336.

Peter I. Frazier, Jing Xie, and Stephen E. Chick. 2011. Value of information methods for pairwise sampling with correlations. In Proceedings of the Winter Simulation Conference (WSC’11). 3979–3991.

Mark Hartmann. 1991. An improvement on Paulson’s procedure for selecting the population with the largest mean from k normal populations with a common unknown variance. Sequential Analysis 10, 1–2 (1991), 1–16.

Christopher M. Healey, Sigrun Andradottir, and Seong-Hee Kim. 2013. Efficient comparison of constrained systems using dormancy. European Journal of Operational Research 224, 2, 340–352.

Christopher M. Healey, Sigrun Andradottir, and Seong-Hee Kim. 2014. A Minimal Switching Procedure for Constrained Ranking and Selection under Independent or Common Random Numbers. Technical Report.



Susan R. Hunter, Chun-Hung Chen, Raghu Pasupathy, Nugroho Artadi Pujowidianto, Loo Hay Lee, and Chee Meng Yap. 2011. Optimal sampling laws for constrained simulation optimization on finite sets: The bivariate normal case. In Proceedings of the Winter Simulation Conference (WSC’11). 4294–4302.

Susan R. Hunter and Raghu Pasupathy. 2013. Optimal sampling laws for stochastically constrained simulation optimization on finite sets. INFORMS Journal on Computing 25, 3, 527–542.

Alireza Kabirian and Sigurdur Olafsson. 2009. Selection of the best with stochastic constraints. In Proceedings of the Winter Simulation Conference (WSC’09). 574–583.

Seong-Hee Kim and Barry L. Nelson. 2001. A fully sequential procedure for indifference-zone selection in simulation. ACM Transactions on Modeling and Computer Simulation 11, 3 (July 2001), 251–273.

Seong-Hee Kim and Barry L. Nelson. 2006. Selecting the best system. In Shane G. Henderson and Barry L. Nelson (Eds.), Handbooks in Operations Research and Management Science (Vol. 13). Elsevier, 501–534.

Averill M. Law and David M. Kelton. 2000. Simulation Modeling and Analysis (3rd ed.). McGraw-Hill Higher Education.

Loo Hay Lee, Ek Peng Chew, Suyan Teng, and David Goldsman. 2010. Finding the non-dominated Pareto set for multi-objective simulation models. IIE Transactions 42, 9, 656–674.

Vadim Lesnevski, Barry L. Nelson, and Jeremy Staum. 2007. Simulation of coherent risk measures based on generalized scenarios. Management Science 53, 11, 1756–1769.

Douglas J. Morrice and John C. Butler. 2006. Ranking and selection with multiple “targets.” In Proceedings of the Winter Simulation Conference (WSC’06). 222–230.

Barry L. Nelson. 2010. Optimization via simulation over discrete decision variables. TutORials in Operations Research (Vol. 7). INFORMS, 965, 193–207.

Barry L. Nelson and Frank J. Matejcik. 1995. Using common random numbers for indifference-zone selection and multiple comparisons in simulation. Management Science 41, 12, 1935–1945.

Edward Paulson. 1964. A sequential procedure for selecting the population with the largest mean from k normal populations. Annals of Mathematical Statistics 35, 1, 174–180.

S. K. Perng. 1969. A comparison of the asymptotic expected sample sizes of two sequential procedures for ranking problem. Annals of Mathematical Statistics 40, 6, 2198–2202.

Juta Pichitlamken, Barry L. Nelson, and L. Jeff Hong. 2006. A sequential procedure for neighborhood selection-of-the-best in optimization via simulation. European Journal of Operational Research 173, 1, 283–298.

Nugroho Artadi Pujowidianto, Loo Hay Lee, Chun-Hung Chen, and Chee Meng Yap. 2009. Optimal computing budget allocation for constrained optimization. In Proceedings of the Winter Simulation Conference (WSC’09). 584–589.

Yosef Rinott. 1978. On two-stage selection procedures and related probability-inequalities. Communications in Statistics: Theory and Methods 7, 8, 799–811.

Thomas J. Santner and Ajit C. Tamhane. 1984. Designing experiments for selecting a normal population with a large mean and a small variance. In Thomas J. Santner and Ajit C. Tamhane (Eds.), Design of Experiments—Ranking and Selection: Essays in Honor of Robert E. Bechhofer. Marcel-Dekker, 178–198.

Roberto Szechtman and Enver Yucesan. 2008. A new perspective on feasibility determination. In Proceedings of the Winter Simulation Conference (WSC’08). 273–280.

Received February 2012; revised December 2013; accepted December 2013

