
Journal of Statistical Planning and Inference 23 (1989) 371-379

North-Holland


THE MODEL-ROBUSTNESS AND OPTIMALITY OF RANDOMIZED DESIGNS

Thomas MATHEW and Dulal Kumar BHAUMIK

University of Maryland Baltimore County, Catonsville, MD 21228, U.S.A.

Received 23 October 1987; revised manuscript received 29 August 1988

Recommended by C.S. Cheng

Abstract: The concept of model-robustness introduced by Wu (1981) is investigated for a class

of designs relative to a general optimality criterion, in the setup of Li (1983). A minimaxity result

is established, which demonstrates that, for certain types of model inadequacies, randomization

validates model based optimality results for many designs, including block designs and row-

column designs. The class of optimality criteria considered includes the A- and E-criteria, but not

the D-criterion. A counter-example shows that randomization does not guarantee model-

robustness of the D-optimality property.

AMS Subject Classification: Primary 62K05; secondary 62C20, 62F35

Key words and phrases: Y-optimality; uniform randomization; model-robustness; minimaxity.

1. Introduction and summary

In the context of comparative experiments, a concept of model-robustness has

been introduced by Wu (1981). The results of Wu (1981) and their generalizations

by Li (1983) demonstrate that the use of randomization in the design of such ex-

periments can guarantee robustness against certain types of model inadequacies.

Even though the role of randomization in the design of experiments is well recognized

in the literature, a rigorous theoretical investigation of the model-robustness aspect of

randomization was undertaken for the first time by the above authors. In the con-

text of optimal designs, their results show that the A-optimality property of many

designs is robust against certain model violations, provided randomization was

employed while designing the experiment. For survey articles dealing with ran-

domization, we refer to Kempthorne (1975, 1977) and Folks (1984).

In the theory of optimal designs, various optimality criteria have been introduced

to judge the ‘goodness’ of a design for a specific inference problem. Some authors

(Kempthorne (1977), Srivastava (1984)) have questioned such optimality considera-

tions, since simple additive models (model (2.1) in the next section) may not always

adequately explain the variability among the treatments and the experimental units

in practical situations. Furthermore, the experimental errors may behave differently

0378-3758/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)


from one experimental unit to the other. Thus, they may have different variances

and may be correlated. Model (2.2) in the next section takes into account the above

inadequacies in (2.1). Since the exact nature of model violations is usually unknown

to us, (2.2) actually specifies a class of models, one of them being the correct model.

Thus, in Section 2, we investigate the performance of an optimal design (computed

under (2.1)), when (2.2) is, in fact, the correct model, for a class of optimality

criteria. Our main result in this section (Theorem 2) is that a design that is optimal

under an inadequate model is minimax or model-robust (as defined later) under the

correct model, provided randomization was employed while designing the experi-

ment. The result is established in the set up of Li (1983) and consequently applies

to single factor designs, block designs and row-column designs.

The class of optimality criteria that we have considered includes the familiar A-

and E-criteria, but not the D-criterion. It turns out that randomization does not en-

sure minimaxity (or model robustness) as far as D-optimality is concerned. We show

this by considering a BIBD, in the set up of Theorem 2. It is shown that, for D-

optimality, the given non-randomized BIBD is superior to a randomization of the

same. This counter-example shows that randomization may or may not guarantee

robustness against model inadequacies depending on the criterion employed. Thus,

a rigorous theoretical justification of randomization, as done by Wu (1981) and Li

(1983), is indeed necessary.

2. Minimax randomized designs under a general optimality criterion

Our set up is the same as that of Theorem 3.2 in Li (1983). Thus, for comparing $v$ treatments, the design under consideration has $b$ blocks and the arrangement of the units in the blocks is based on $n$ factors. We assume that the number of levels of the $i$-th factor is $l_i$ in any block and the number of units in any block is $k=\prod_{j=1}^{n} l_j$. The arrangement of units is such that each block forms a hyper-rectangle of size $l_1 \times l_2 \times \cdots \times l_n$, when $n \ge 2$. When $n=1$, the $k=l_1$ units are assumed to be of the same level, in each block. Thus the total number of experimental units in the design is $N=bk$. We will use the symbol $d$ to denote such a design.

If $y_u$ is the yield from the $u$-th unit (assumed to be in the $s$-th block), the usual linear model for analyzing such a design is

$$y_u = g_0 + \tau_{d(u)} + \beta_s + \sum_{j=1}^{n} \gamma^{sj}_{i_1,\ldots,i_{j-1},i_{j+1},\ldots,i_n} + \varepsilon_u .$$

Here $g_0$ is an overall mean (independent of $u$), $\tau_{d(u)}$ is the effect of the treatment which appears in unit $u$ (for the design $d$), $\beta_s$ is the $s$-th block effect and $\gamma^{sj}_{i_1,\ldots,i_{j-1},i_{j+1},\ldots,i_n}$ is the interaction effect of all but the $j$-th factor in block $s$ at levels $i_1,\ldots,i_{j-1},i_{j+1},\ldots,i_n$. Here, all the effects are assumed to be fixed. The $\varepsilon_u$'s are assumed to be zero mean uncorrelated random variables having a common variance $\sigma^2$. As pointed out by Li (1983), the models for single factor designs, block designs and row-column designs are special cases of the above model. In matrix notation, the model can be written as

$$y = g_0 1_{bk} + X_\tau \tau + X_\beta \beta + \sum_{j=1}^{n} (I_b \otimes X_j)\gamma_j + \varepsilon, \tag{2.1}$$

where $y$ is the vector of observations, $1_{bk}$ is a $bk \times 1$ vector of ones, $X_\tau$ and $X_\beta$ are the respective design matrices for the treatment effects $\tau$ and block effects $\beta$ and $I_b \otimes X_j$ is the design matrix for the vector of interaction effects of all but the $j$-th factor in the various blocks (denoted by $\gamma_j$). We note that

$$X_\beta = I_b \otimes 1_k \quad\text{and}\quad X_j = I_{l_1} \otimes \cdots \otimes I_{l_{j-1}} \otimes 1_{l_j} \otimes I_{l_{j+1}} \otimes \cdots \otimes I_{l_n}.$$
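As a minimal sketch of how the Kronecker-product design matrices in (2.1) can be assembled, the following NumPy fragment builds the block-effect matrix and the matrices $X_j$ for a hypothetical layout with $b=4$ blocks, each a $2 \times 3$ rectangle. The function names, the example sizes and the lexicographic ordering of units within a block are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np
from functools import reduce

def block_effect_matrix(b, k):
    """X_beta = I_b (x) 1_k: unit u belongs to block u // k."""
    return np.kron(np.eye(b), np.ones((k, 1)))

def interaction_matrix(levels, j):
    """X_j = I (x) ... (x) 1_{l_j} (x) ... (x) I, with a ones vector in slot j (0-based)."""
    factors = [np.ones((l, 1)) if i == j else np.eye(l)
               for i, l in enumerate(levels)]
    return reduce(np.kron, factors)

levels = (2, 3)               # hypothetical: 2 rows and 3 columns per block
b = 4                         # hypothetical number of blocks
k = int(np.prod(levels))      # units per block

X_beta = block_effect_matrix(b, k)
X_1 = interaction_matrix(levels, 0)   # sums out factor 1, leaving column effects
X_2 = interaction_matrix(levels, 1)   # sums out factor 2, leaving row effects
full_1 = np.kron(np.eye(b), X_1)      # I_b (x) X_1, as it appears in (2.1)
```

Each row of these matrices contains a single 1, so every unit is assigned to exactly one block and to exactly one level combination of the remaining factors.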

If the assumptions about the $\varepsilon_u$'s are suspect, then the $\varepsilon_u$'s may have nonzero means and different variances depending on $u$ and may be correlated. The model in this case is

$$y = g + X_\tau \tau + X_\beta \beta + \sum_{j=1}^{n} (I_b \otimes X_j)\gamma_j + \varepsilon. \tag{2.2}$$

Here $g$ is an $N \times 1$ vector ($N=bk$) of unknown parameters. We now assume that $\varepsilon$ has zero mean and covariance matrix $\sigma^2 V$. It is further assumed that $g \in \mathcal{G}$ and $V \in \mathcal{V}$, where $\mathcal{G}$ and $\mathcal{V}$ are respectively compact subsets of $R^N$ and the set of real symmetric matrices. It is also assumed that $\mathcal{G}$ contains a vector with all components equal and $\mathcal{V}$ contains the identity matrix. In other words, (2.1) is a possible model. In what follows (2.1) and (2.2) will refer to the above models along with the corresponding assumptions.

We impose a certain invariance condition on $\mathcal{G}$ and $\mathcal{V}$ which essentially reflects our ignorance about the exact nature of model violations. Let $H_i^s$ denote the permutation group on $l_i$ symbols which permutes the $l_i$ levels of the $i$-th factor, when they occur in block $s$, and let $H_b$ denote the permutation group on $\{1,2,\ldots,b\}$. A permutation in $H_b$ permutes the blocks. The group generated by all the $H_i^s$ and $H_b$ will be denoted by $H_0$. An element in $H_0$ is a permutation on $N$ symbols where $N=bk$, the total number of experimental units. The invariance condition on $\mathcal{G}$ and $\mathcal{V}$, as described in Li (1983), is the following:

(A) If $\pi_0 \in H_0$, $g \in \mathcal{G}$ and $V=((v_{ij})) \in \mathcal{V}$, then $\pi_0 g \in \mathcal{G}$ and $\pi_0 V \in \mathcal{V}$, where $(\pi_0 V)_{ij} = v_{\pi_0^{-1} i,\,\pi_0^{-1} j}$ and $(\pi_0 g)_i = g_{\pi_0^{-1} i}$.

Throughout this paper we will be referring to the above condition as assumption (A). We note that (2.2) in conjunction with assumption (A) specifies a class of models, only one member in this class being the correct model.
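To make the group $H_0$ and its action in assumption (A) concrete, here is a small sketch for the block design case ($n=1$), where $H_0$ consists of a permutation of the blocks combined with a permutation of the units inside each block. The helper names and the example sizes ($b=3$, $k=2$) are hypothetical; the action follows the convention $(\pi_0 g)_i = g_{\pi_0^{-1} i}$ stated above.

```python
import numpy as np
from itertools import permutations, product

def block_design_group(b, k):
    """Yield every element of H_0 (case n = 1) as an index array: unit u -> perm[u]."""
    for block_perm in permutations(range(b)):
        for within in product(permutations(range(k)), repeat=b):
            perm = np.empty(b * k, dtype=int)
            for s in range(b):
                for i in range(k):
                    # unit i of block s moves to unit within[s][i] of block block_perm[s]
                    perm[s * k + i] = block_perm[s] * k + within[s][i]
            yield perm

def act_on_g(perm, g):
    """(pi g)_{pi(i)} = g_i, equivalently (pi g)_i = g_{pi^{-1} i}."""
    out = np.empty_like(g)
    out[perm] = g
    return out

def act_on_V(perm, V):
    """(pi V)_{pi(i), pi(j)} = v_{ij}."""
    out = np.empty_like(V)
    out[np.ix_(perm, perm)] = V
    return out

b, k = 3, 2                                   # hypothetical small example
group = list(block_design_group(b, k))        # b! * (k!)^b elements
e0 = np.zeros(b * k)
e0[0] = 2.5                                   # a mean shift on a single unit
images = [act_on_g(p, e0) for p in group]     # each image is again a single-unit shift
```

In this sketch a set of single-unit mean shifts of fixed size is closed under the action, which is the kind of invariance that assumption (A) asks for.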

Given a design $d$, let $\mathcal{D}$ denote the set of all possible permutations of the units in $d$ according to the permutations in $H_0$. Any design in $\mathcal{D}$ will be referred to as a permutation of $d$. A randomization of $d$ is a design selected from $\mathcal{D}$ according to a probability distribution. Thus, we can denote a randomized design by the pair $(\eta,d)$, where $\eta$ is a probability distribution on $\mathcal{D}$. According to this notation, if $d_1 \in \mathcal{D}$, then $(\eta,d)$ and $(\eta,d_1)$ denote the same design. Also, $\bar\eta$ will denote the uniform distribution on $\mathcal{D}$. The randomization of $d$ according to $\bar\eta$ will be referred to as the uniform randomization of $d$.
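For a block design ($n=1$), one way to carry out the uniform randomization is to shuffle the order of the blocks and, independently, the order of the units inside every block; each element of $H_0$ is then equally likely. The sketch below assumes a design is stored as a list of tuples of treatment labels, which is a representation chosen here for illustration only.

```python
import random

def uniform_randomization(blocks, rng):
    """Draw one permutation of a block design uniformly from H_0 (case n = 1)."""
    shuffled = [list(block) for block in blocks]
    for block in shuffled:
        rng.shuffle(block)        # uniform permutation of the units within a block
    rng.shuffle(shuffled)         # uniform permutation of the blocks
    return [tuple(block) for block in shuffled]

design = [("A", "B"), ("B", "C"), ("A", "C")]          # hypothetical systematic design
randomized = uniform_randomization(design, random.Random(1989))
```

The randomized design contains the same blocks as the systematic one; only the assignment of treatments to physical units changes.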

The inference problem under consideration is the estimation of $L\tau$, where $\tau$ is the $v$-component vector of treatment effects and $L$ is a known matrix. $L\tau$ is assumed to be estimable under the model (2.1). Let $F_d y$ denote the least squares estimator of $L\tau$ under (2.1), when design $d$ is used. If the model (2.1) is correct, then the dispersion matrix of $F_d y$ is $\sigma^2 F_d F_d'$ and according to optimal design theory, we should choose a design $d$ that minimizes the scalar valued function $Y(F_d F_d')$. The function $Y$, defined on the set of nonnegative-definite matrices, is assumed to satisfy the following conditions:

(i) $Y$ is convex;

(ii) $Y$ is nondecreasing, i.e., if $E_1$, $E_2$ and $E_1 - E_2$ are nonnegative definite, then $Y(E_1) \ge Y(E_2)$;   (2.3)

(iii) if $Y(E_1) \ge Y(E_2)$ for $E_1$, $E_2$ nonnegative definite, then $Y(aE_1) \ge Y(aE_2)$ for $a > 0$.

The above optimality criterion is similar to the weak universal optimality criterion

mentioned in Kiefer and Wynn (1981, p. 740). As observed by these authors, the

criteria covered by this definition include the familiar A- and E-optimality criteria

(and some other criteria), but not the D-optimality criterion. A design which

minimizes $Y(F_d F_d')$, for any such $Y$, will be referred to as a Y-optimal design.

Clearly, randomization has no role in such optimality considerations if (2.1) is the

correct model.
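As an informal illustration of conditions (2.3), the following sketch spot-checks them numerically for the A-criterion (taken here as the trace) and the E-criterion (taken here as the largest eigenvalue) on randomly generated nonnegative definite matrices. This is a numerical sanity check under these assumed definitions, not a proof, and the helper names are hypothetical.

```python
import numpy as np

def a_criterion(E):
    """A-criterion: trace of the dispersion matrix."""
    return float(np.trace(E))

def e_criterion(E):
    """E-criterion: largest eigenvalue of the dispersion matrix."""
    return float(np.linalg.eigvalsh(E)[-1])   # eigvalsh returns ascending eigenvalues

def satisfies_2_3(crit, trials=200, p=4, seed=0, tol=1e-9):
    """Spot-check (i) convexity, (ii) monotonicity and (iii) scale invariance of the ordering."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        R1, R2 = rng.standard_normal((2, p, p))
        E1, E2 = R1 @ R1.T, R2 @ R2.T             # random nonnegative definite matrices
        t, a = rng.uniform(), rng.uniform(0.1, 10.0)
        convex = crit(t * E1 + (1 - t) * E2) <= t * crit(E1) + (1 - t) * crit(E2) + tol
        monotone = crit(E1 + E2) >= crit(E1) - tol   # (E1 + E2) - E1 = E2 is nonnegative definite
        hi, lo = (E1, E2) if crit(E1) >= crit(E2) else (E2, E1)
        scale_ok = crit(a * hi) >= crit(a * lo) - tol
        if not (convex and monotone and scale_ok):
            return False
    return True

a_ok = satisfies_2_3(a_criterion)
e_ok = satisfies_2_3(e_criterion)
```

Both criteria are positively homogeneous, which is why the ordering in (iii) survives multiplication by $a > 0$.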

We shall now try to assess the performance of the estimator $F_d y$ (from the point of view of Y-optimality) when model violations are present (more specifically, when the model (2.2) holds). Since $F_d y$ is no longer an unbiased estimator of $L\tau$, we compute the matrix of mean squared errors $W(d,s) = E(F_d y - L\tau)(F_d y - L\tau)'$, where we write $s=(g,V) \in \mathcal{G} \times \mathcal{V} = \mathcal{S}$. We note that the matrix $W(d,s)$ depends only on the parameters in $s$ and not on the other parameters in (2.2), since $F_d y$ is unbiased for $L\tau$ under (2.1). The performance of a randomized strategy $(\eta,d)$ is evaluated using $M(\eta,d,s) = E_\eta W(d,s)$. We note that $M(\eta,d,s)$ also involves $\sigma^2$, since, under (2.2), $\operatorname{cov}(\varepsilon) = \sigma^2 V$, $V \in \mathcal{V}$. We do not specify this in $M(\eta,d,s)$, since $\sigma^2$ does not play any role in our results.
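The mean squared error matrix can be computed directly for a small block design. The sketch below assumes the estimator matrix $F_d = C_d^+ B_d$ (the form used in the appendix for block designs) and the closed form $F_d(\sigma^2 V + gg')F_d'$ that follows from $E(F_d y - L\tau) = F_d g$; the three-block design and the numerical values are hypothetical.

```python
import numpy as np

def estimator_matrix(blocks, v):
    """F_d = C_d^+ B_d for a block design with equal block sizes (treatments labelled 0..v-1)."""
    b, k = len(blocks), len(blocks[0])
    N = b * k
    X_tau = np.zeros((N, v))
    for s, block in enumerate(blocks):
        for i, t in enumerate(block):
            X_tau[s * k + i, t] = 1.0
    P_beta = np.kron(np.eye(b), np.full((k, k), 1.0 / k))   # projector onto block effects
    B = X_tau.T @ (np.eye(N) - P_beta)                      # B_d y = adjusted treatment totals
    C = B @ B.T                                             # C-matrix of the design
    return np.linalg.pinv(C) @ B

def mse_matrix(F, g, V, sigma2):
    """W(d, s) = F_d (sigma^2 V + g g') F_d' for s = (g, V)."""
    return F @ (sigma2 * V + np.outer(g, g)) @ F.T

blocks = [(0, 1), (1, 2), (0, 2)]        # hypothetical design: 3 treatments, 3 blocks of size 2
F = estimator_matrix(blocks, v=3)
N = F.shape[1]

# Model (2.1): g has all components equal and V = I, so the bias term vanishes.
W_ok = mse_matrix(F, g=np.full(N, 5.0), V=np.eye(N), sigma2=2.0)

# A violation: the mean of the first unit is shifted by 3.
g_bad = np.zeros(N)
g_bad[0] = 3.0
W_bad = mse_matrix(F, g=g_bad, V=np.eye(N), sigma2=2.0)
```

When $g$ is constant the result reduces to $\sigma^2 F_d F_d'$, while a shift on a single unit adds a rank-one bias term.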

Definition. The randomized design $(\eta^*, d^*)$ is said to be minimax or model-robust if

$$\min_{(\eta,d)} \max_{s \in \mathcal{S}} Y(M(\eta,d,s)) = \max_{s \in \mathcal{S}} Y(M(\eta^*,d^*,s))$$

for any $Y$ satisfying (2.3) and for every $\sigma^2 > 0$.

It should be mentioned that the criterion used by Li (1983) is similar to $E_\eta[Y(W(d,s))]$, which is different from our criterion in the above definition. Since, in general, there does not exist a randomized design which 'minimizes' the matrix $M(\eta,d,s)$ for any $s$ (in the usual Loewner ordering of nonnegative definite matrices), we are trying to minimize a scalar valued function of the above matrix, as is usually done in optimal design theory. However, with our criterion, it does not appear possible to prove a result as general as Theorem 2.1 of Li (1983). But the model-robustness problem can be settled for a general optimality criterion satisfying (2.3). The theorem proved below is similar to (but not a special case of) Theorem 2.1 in Li (1983) or Proposition 1 in Wu (1981). We also mention that the theorem is not true for the D-optimality criterion.

For $s=(g,V) \in \mathcal{S}$ and $\pi_0 \in H_0$, we write $\pi_0 s = (\pi_0 g, \pi_0 V)$. Then assumption (A) implies $\pi_0 s \in \mathcal{S}$ whenever $s \in \mathcal{S}$.

Theorem 1. Suppose assumption (A) holds and $Y$ satisfies (2.3). Then, for any design $d$ and any probability distribution $\eta$ on $\mathcal{D}$,

$$\max_{s \in \mathcal{S}} Y(M(\bar\eta,d,s)) \le \max_{s \in \mathcal{S}} Y(M(\eta,d,s)),$$

where $\bar\eta$ is the uniform distribution on $\mathcal{D}$.

Proof. Using the convexity of $Y$, the proof is essentially an imitation of the proof of Proposition 1 in Wu (1981). For $\pi_0 \in H_0$ and for any probability distribution $\eta$ on $\mathcal{D}$, define the probability distribution $\eta_{\pi_0}$ as $\eta_{\pi_0}(d_1) = \eta(\pi_0 d_1)$, for $d_1 \in \mathcal{D}$. Then $\bar\eta = h_0^{-1}\sum_{\pi_0 \in H_0} \eta_{\pi_0}$, where $h_0$ is the order of $H_0$. By using the linearity of $M(\eta,d,s)$ in $\eta$ and the convexity of $Y$, we get

$$\max_{s \in \mathcal{S}} Y(M(\bar\eta,d,s)) \le h_0^{-1}\sum_{\pi_0 \in H_0}\max_{s \in \mathcal{S}} Y(M(\eta_{\pi_0},d,s)) = h_0^{-1}\sum_{\pi_0 \in H_0}\max_{s \in \mathcal{S}} Y(M(\eta,d,\pi_0 s)) = \max_{s \in \mathcal{S}} Y(M(\eta,d,s)).$$

In order to conclude the last two equalities, we employed the facts

$$M(\eta_{\pi_0},d,s) = M(\eta_{\pi_0},\pi_0 d,\pi_0 s) = M(\eta,d,\pi_0 s)$$

and

$$\max_{s \in \mathcal{S}} Y(M(\eta,d,\pi_0 s)) = \max_{s \in \mathcal{S}} Y(M(\eta,d,s)),$$

using assumption (A). □

The main theorem in this section is the following.

Theorem 2. Under the model (2.1), let $L\tau$ be an estimable parametric function and let $d_0$ be a Y-optimal design for estimating $L\tau$, for any $Y$ satisfying (2.3). Then, under the model (2.2), if assumption (A) holds, the uniformly randomized design $(\bar\eta, d_0)$ is minimax.

For the A-optimality criterion, Li (1983) has proved Theorem 2 in a very general


set up. In order to prove Theorem 2, we shall use the following lemma, which is

proved in the appendix.

Lemma 1. Let $P$ be a permutation matrix representing a permutation in $H_0$. Then for any real symmetric matrix $A$, $F_d\left(\sum_P PAP'\right)F_d' = \lambda(A)F_d F_d'$, where $\lambda(A)$ is a scalar not depending on $d$.

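Proof of Theorem 2. The proof is based on arguments similar to those in the proof of Theorem 3.2 in Li (1983, p. 238). The design $d_0$ in the theorem minimizes $Y(F_d F_d')$, where $F_d y$ is the least squares estimator of $L\tau$. Under (2.2),

$$W(d,s) = E(F_d y - L\tau)(F_d y - L\tau)' = F_d(\sigma^2 V + gg')F_d'.$$

From Lemma 1, we get $M(\bar\eta,d,s) = E_{\bar\eta} W(d,s) = \lambda_0(s)F_d F_d'$, where $\lambda_0(s)$ is a scalar not depending on $d$ ($\lambda_0(s)$ depends on $\sigma^2$).

From the proof of Lemma 1 in the appendix, it follows that for every $\sigma^2$, there exists $s_0 \in \mathcal{S}$, satisfying $\max_{s \in \mathcal{S}} \lambda_0(s) = \lambda_0(s_0)$. Applying Theorem 1 and using the fact that $Y$ satisfies (2.3), we get, for any randomized design $(\eta,d)$,

$$\begin{aligned}
\max_{s \in \mathcal{S}} Y(M(\eta,d,s)) &\ge \max_{s \in \mathcal{S}} Y(M(\bar\eta,d,s)) \\
&= \max_{s \in \mathcal{S}} Y(\lambda_0(s)F_d F_d') = Y(\lambda_0(s_0)F_d F_d'), && \text{using 2.3(ii)} \\
&\ge Y(\lambda_0(s_0)F_{d_0} F_{d_0}'), && \text{using 2.3(iii) and the Y-optimality of } d_0 \\
&= \max_{s \in \mathcal{S}} Y(M(\bar\eta,d_0,s)),
\end{aligned}$$

which completes the proof. □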

Theorem 2 immediately gives the model-robustness of many well-known optimal designs. Thus the A- and E-optimal block designs and Youden squares, for comparing treatments, given in Kiefer (1975) and in Cheng (1978) are model-robust. So are the A- and E-optimal designs, for comparing treatments with a control, given in Majumdar and Notz (1983) and Hedayat and Majumdar (1985).

We now give a counter-example to show that uniform randomization does not guarantee the model-robustness of the D-optimality property. For this, we consider a BIBD for comparing 3 treatments A, B, C in 6 blocks as follows: blocks 4, 5 and 6 are the same as blocks 1, 2 and 3 respectively, given below.

Block 1: A B

Block 2: B C

Block 3: A C


If all treatments are of equal interest and if there are no model violations, the D-optimality criterion (for comparing the treatments) is the product of the nonzero eigenvalues of $C_d^+$ (the Moore-Penrose inverse of the C-matrix). Consider model violations of the following type: the set $\mathcal{G}$ consists of vectors of the form $\lambda e_i$ ($\lambda \ne 0$), where the $e_i$'s are the unit vectors in $R^{12}$ (the design has 12 plots) and $\mathcal{V} = \{\sigma^2 I : \sigma^2 > 0\}$. Thus the experimental errors are uncorrelated random variables having common variance $\sigma^2$, but the assumption that they have zero means is violated. Clearly, with $\mathcal{G}$ and $\mathcal{V}$ as defined above, assumption (A) is satisfied. For a systematic design (i.e. non-randomized), the D-optimality criterion is the product of the nonzero eigenvalues of $C_d^+ B_d(\sigma^2 I + gg')B_d' C_d^+$ (see appendix for notations and further details). If we use the above systematic BIBD with $g = \lambda e_i$ (for any $i$), then using the expressions for $B_d$ and $C_d^+$ for a BIBD, it follows that the D-criterion has value $\frac{1}{9}\sigma^2(\sigma^2 + \frac{1}{6}\lambda^2)$. But, if we employ uniform randomization, then by using (3.8) in the appendix, it follows that the criterion has value $\frac{1}{9}(\sigma^2 + \frac{1}{12}\lambda^2)^2$, which is strictly greater than the previous value for any $\lambda \ne 0$ and $\sigma^2 > 0$. Consequently, the use of a systematic design is certainly superior to uniform randomization. Thus, if D-optimality is the criterion of interest and if we suspect model violations, then the randomization strategy to be adopted is far from clear.
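The comparison in this counter-example can be checked numerically. The sketch below is an illustration under stated assumptions: treatments A, B, C are coded 0, 1, 2, the matrices $B_d$ and $C_d^+$ are built as in the appendix, and the average of $PAP'$ over $H_0$ is obtained by replacing each entry of $A$ with the mean over its orbit (diagonal entries, within-block off-diagonal entries, between-block entries). The function names are hypothetical. Evaluating the two criteria in this way reproduces $\frac{1}{9}\sigma^2(\sigma^2 + \lambda^2/6)$ for the systematic design and $\frac{1}{9}(\sigma^2 + \lambda^2/12)^2$ for the uniformly randomized one.

```python
import numpy as np

def bibd_matrices():
    """B_d and C_d^+ for the BIBD with blocks AB, BC, AC, AB, BC, AC."""
    blocks = [(0, 1), (1, 2), (0, 2)] * 2
    b, k, v = len(blocks), 2, 3
    N = b * k
    X_tau = np.zeros((N, v))
    for s, block in enumerate(blocks):
        for i, t in enumerate(block):
            X_tau[s * k + i, t] = 1.0
    P_beta = np.kron(np.eye(b), np.full((k, k), 1.0 / k))
    B = X_tau.T @ (np.eye(N) - P_beta)
    return B, np.linalg.pinv(B @ B.T), b, k

def orbit_average(A, b, k):
    """Average of P A P' over H_0 (case n = 1): each entry becomes its orbit mean."""
    N = b * k
    same_block = np.kron(np.eye(b), np.ones((k, k))).astype(bool)
    diag = np.eye(N, dtype=bool)
    within = same_block & ~diag
    between = ~same_block
    out = np.empty_like(A)
    out[diag] = A[diag].mean()
    out[within] = A[within].mean()
    out[between] = A[between].mean()
    return out

def d_value(M, tol=1e-10):
    """Product of the nonzero eigenvalues of a symmetric nonnegative definite matrix."""
    ev = np.linalg.eigvalsh(M)
    return float(np.prod(ev[ev > tol]))

def d_values(sigma2, lam, unit=0):
    """D-criterion for the systematic BIBD and for its uniform randomization."""
    B, C_plus, b, k = bibd_matrices()
    g = np.zeros(b * k)
    g[unit] = lam                                   # g = lambda * e_unit
    A = sigma2 * np.eye(b * k) + np.outer(g, g)
    F = C_plus @ B
    systematic = d_value(F @ A @ F.T)
    randomized = d_value(F @ orbit_average(A, b, k) @ F.T)
    return systematic, randomized

systematic, randomized = d_values(sigma2=1.5, lam=2.0)
```

The two values differ by $\lambda^4/1296$, so the gap is small for mild violations but never vanishes when $\lambda \ne 0$.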

3. Appendix

Proof of Lemma 1. In the proof, $I_c$, $1_c$ will denote the $c \times c$ identity matrix and the $c \times 1$ vector of ones. We shall also write $J_c = 1_c 1_c'$. When the dimensions are obvious, we shall denote these quantities by $I$, $1$ and $J$.

Since $F_d y$ is an unbiased estimate of $L\tau$ under (2.1), we get

$$F_d X_\beta = 0 \quad\text{and}\quad F_d(I_b \otimes X_j) = 0. \tag{3.1}$$

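To prove the lemma, note that permuting the blocks is equivalent to applying the permutation matrix $P_0 \otimes I$ (where $P_0$ is a $b \times b$ permutation matrix). Permuting the plots within the $i$-th block amounts to applying the permutation matrix $P_{i1} \otimes P_{i2} \otimes \cdots \otimes P_{in}$ (where $P_{ij}$ is an $l_j \times l_j$ permutation matrix). Thus, permutation of the plots within the blocks can be achieved by using the permutation matrix $\operatorname{diag}(P_{11} \otimes \cdots \otimes P_{1n}, \ldots, P_{b1} \otimes \cdots \otimes P_{bn})$. It is easy to verify that, for a $bk \times bk$ real symmetric matrix $A$,

$$\sum_{P_0} (P_0 \otimes I)A(P_0 \otimes I)' = \begin{pmatrix} A_1 & A_2 & \cdots & A_2 \\ A_2 & A_1 & \cdots & A_2 \\ \vdots & \vdots & \ddots & \vdots \\ A_2 & A_2 & \cdots & A_1 \end{pmatrix}, \tag{3.2}$$

where $A_1$ and $A_2$ are $k \times k$ real symmetric matrices. We now have to evaluate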


(a)

(b)

To evaluate (a), write the matrix in (a) as

It is easily verified that

$$\sum_{P_{1n}} (I \otimes P_{1n})A_1(I \otimes P_{1n}') = B_1 \otimes (a_n I + b_n J)$$

for some matrix $B_1$. Evaluation of (a) now reduces to evaluating

$$\sum_{P_{1,n-1}} (I \otimes P_{1,n-1})B_1(I \otimes P_{1,n-1}') \otimes (a_n I + b_n J)$$

and continuing this process. Thus the matrix in (a) simplifies to $\lambda(A)I + \sum_i C_i$, where $\lambda(A)$ is a scalar depending only on $A$ and each $C_i$ is a scalar multiple of the Kronecker product of $I$ and $J$ matrices, with at least one $J$ matrix. Similarly (b) simplifies to $\lambda_1(A)J$.

Using these observations and (3.2), it follows that for a permutation matrix $P$ in $H_0$, $\sum_P PAP'$ has diagonal blocks $\lambda(A)I + \sum_i C_i$ and off-diagonal blocks $\lambda_1(A)J$. Hence

$$\sum_P PAP' = I_b \otimes \Big[\lambda(A)I_k + \sum_i C_i - \lambda_1(A)J_k\Big] + J_b \otimes \lambda_1(A)J_k.$$

Using (3.1) we get $F_d(I_b \otimes C_i) = 0$, $F_d(I_b \otimes J_k) = 0$ and $F_d(J_b \otimes J_k) = 0$. Hence $F_d\left(\sum_P PAP'\right)F_d' = \lambda(A)F_d F_d'$, where $P$ is a permutation as mentioned in the lemma. This concludes the proof. □
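Lemma 1 can be checked by brute force on small block designs. The sketch below enumerates all $b!\,(k!)^b$ permutation matrices in $H_0$ for $b=3$, $k=2$, sums $PAP'$ for a random symmetric $A$, and confirms that $F_d(\sum_P PAP')F_d'$ is a scalar multiple of $F_dF_d'$ with the same scalar for two different designs. The designs, helper names and the use of $F_d = C_d^+B_d$ are assumptions made for this illustration.

```python
import numpy as np
from itertools import permutations, product

def estimator_matrix(blocks, v):
    """F_d = C_d^+ B_d for a block design with equal block sizes."""
    b, k = len(blocks), len(blocks[0])
    N = b * k
    X_tau = np.zeros((N, v))
    for s, block in enumerate(blocks):
        for i, t in enumerate(block):
            X_tau[s * k + i, t] = 1.0
    P_beta = np.kron(np.eye(b), np.full((k, k), 1.0 / k))
    B = X_tau.T @ (np.eye(N) - P_beta)
    return np.linalg.pinv(B @ B.T) @ B

def group_matrices(b, k):
    """Yield the permutation matrices of H_0 for a block design (case n = 1)."""
    N = b * k
    for block_perm in permutations(range(b)):
        for within in product(permutations(range(k)), repeat=b):
            P = np.zeros((N, N))
            for s in range(b):
                for i in range(k):
                    P[block_perm[s] * k + within[s][i], s * k + i] = 1.0
            yield P

def lemma1_scalar(F, A, b, k):
    """Return (lambda, flag): flag is True when F (sum_P P A P') F' equals lambda * F F'."""
    S = sum(P @ A @ P.T for P in group_matrices(b, k))
    lhs = F @ S @ F.T
    FF = F @ F.T
    lam = np.trace(lhs) / np.trace(FF)
    return lam, bool(np.allclose(lhs, lam * FF))

rng = np.random.default_rng(0)
R = rng.standard_normal((6, 6))
A = R + R.T                                            # arbitrary real symmetric matrix

F1 = estimator_matrix([(0, 1), (1, 2), (0, 2)], v=3)   # hypothetical design d
F2 = estimator_matrix([(0, 1), (0, 1), (0, 1)], v=2)   # a different design, same b and k
lam1, ok1 = lemma1_scalar(F1, A, b=3, k=2)
lam2, ok2 = lemma1_scalar(F2, A, b=3, k=2)
```

Both designs return the same scalar, in line with the statement that $\lambda(A)$ does not depend on $d$.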

For a block design with $b$ blocks of $k$ plots each, we now give an expression for $\lambda(A)$. If $C_d$ and $Q_d$ denote the C-matrix and the vector of adjusted treatment totals for a block design $d$, then there exists a matrix $B_d$ satisfying $B_d B_d' = C_d$ and $B_d y = Q_d$. Since $\hat\tau = C_d^+ Q_d = F_d y$ (where $F_d = C_d^+ B_d$), $F_d A F_d' = C_d^+ B_d A B_d' C_d^+$. Then

$$W(d,s) = C_d^+ B_d(\sigma^2 V + gg')B_d' C_d^+. \tag{3.3}$$

From Lemma 1, it follows that

$$\sum_P C_d^+ B_d PAP' B_d' C_d^+ = \lambda(A)C_d^+ B_d B_d' C_d^+ = \lambda(A)C_d^+. \tag{3.4}$$

Writing $A=(a_{ij})$ and by doing the necessary algebraic computations, it can be shown that

$$\lambda(A) = (b-1)!\,(k!)^{b-1}(k-2)!\sum_{u=0}^{b-1}\Big[(k-1)\sum_{i=1}^{k} a_{uk+i,uk+i} - \sum_{i \ne j} a_{uk+i,uk+j}\Big]. \tag{3.5}$$

If $A = \sigma^2 I + gg'$, (3.5) simplifies to

$$\lambda(A) = (b-1)!\,(k!)^{b-1}(k-2)!\,[bk(k-1)\sigma^2 + \lambda(g)], \tag{3.6}$$

where

$$\lambda(g) = \sum_{u=0}^{b-1}\sum_{i<j}(g_{uk+i} - g_{uk+j})^2. \tag{3.7}$$

From (3.4), (3.6) and (3.7) it follows that, for a block design, under the model (2.2) with $\operatorname{cov}(\varepsilon) = \sigma^2 I$,

$$M(\bar\eta,d,s) = \Big[\sigma^2 + \frac{1}{bk(k-1)}\lambda(g)\Big]C_d^+. \tag{3.8}$$
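The closed forms (3.5) to (3.8) are easy to evaluate numerically. The following sketch codes $\lambda(A)$ from (3.5) and $\lambda(g)$ from (3.7), then checks the simplification (3.6) and the scalar in (3.8), which is $\lambda(A)$ divided by the group order $b!\,(k!)^b$. The function names and the test values ($b=4$, $k=3$) are hypothetical, and units are assumed to be numbered block by block.

```python
import numpy as np
from math import factorial

def lambda_A(A, b, k):
    """lambda(A) from (3.5) for a block design with b blocks of k plots."""
    total = 0.0
    for u in range(b):
        blk = A[u * k:(u + 1) * k, u * k:(u + 1) * k]   # within-block part of A
        diag_sum = np.trace(blk)
        off_sum = blk.sum() - diag_sum                  # sum over i != j
        total += (k - 1) * diag_sum - off_sum
    return factorial(b - 1) * factorial(k) ** (b - 1) * factorial(k - 2) * total

def lambda_g(g, b, k):
    """lambda(g) from (3.7): squared within-block differences of the mean shifts."""
    total = 0.0
    for u in range(b):
        gb = g[u * k:(u + 1) * k]
        for i in range(k):
            for j in range(i + 1, k):
                total += (gb[i] - gb[j]) ** 2
    return total

def randomized_scalar(sigma2, g, b, k):
    """Scalar multiplying C_d^+ in (3.8)."""
    return sigma2 + lambda_g(g, b, k) / (b * k * (k - 1))

b, k, sigma2 = 4, 3, 1.7                                 # hypothetical values
g = np.random.default_rng(1).standard_normal(b * k)
A = sigma2 * np.eye(b * k) + np.outer(g, g)

const = factorial(b - 1) * factorial(k) ** (b - 1) * factorial(k - 2)
lhs_36 = lambda_A(A, b, k)
rhs_36 = const * (b * k * (k - 1) * sigma2 + lambda_g(g, b, k))
group_order = factorial(b) * factorial(k) ** b
```

Dividing $\lambda(A)$ by the group order turns the sum over permutations in (3.4) into the expectation under the uniform distribution, which is how (3.8) is obtained from (3.6).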

Acknowledgement

The authors are grateful to a reviewer for pointing out an error in an original

draft of the paper. Detailed comments by a referee resulted in the elimination of

some errors and improved presentation.

References

Cheng, C.S. (1978). Optimality of certain asymmetrical experimental designs. Ann. Statist. 6, 1239-1261.

Folks, J.L. (1984). Use of randomization in experimental research. In: K. Hinkelmann, Ed., Experimental Design, Statistical Models and Genetic Statistics. Marcel Dekker, New York, 17-32.

Hedayat, A.S. and D. Majumdar (1985). Families of A-optimal block designs for comparing test treatments with a control. Ann. Statist. 13, 757-767.

Kempthorne, O. (1975). Inference from experiments and randomization. In: J.N. Srivastava, Ed., A Survey of Statistical Design and Linear Models. North-Holland, Amsterdam, 303-331.

Kempthorne, O. (1977). Why randomize? J. Statist. Plann. Inference 1, 1-25.

Kiefer, J. (1975). On the construction and optimality of generalized Youden designs. In: J.N. Srivastava, Ed., A Survey of Statistical Design and Linear Models. North-Holland, Amsterdam, 333-353.

Kiefer, J. and H.P. Wynn (1981). Optimum balanced block and Latin square designs for correlated observations. Ann. Statist. 9, 737-757.

Li, K.C. (1983). Minimaxity for randomized designs: some general results. Ann. Statist. 11, 225-239.

Majumdar, D. and W.I. Notz (1983). Optimal incomplete block designs for comparing treatments with a control. Ann. Statist. 11, 258-266.

Srivastava, J. (1984). Sensitivity and revealing power: two fundamental statistical criteria other than optimality arising in discrete experimentation. In: K. Hinkelmann, Ed., Experimental Design, Statistical Models and Genetic Statistics. Marcel Dekker, New York, 95-117.

Wu, C.F. (1981). On the robustness and efficiency of some randomized designs. Ann. Statist. 9, 1168-1177.
