Journal of Statistical Planning and Inference 23 (1989) 371-379
North-Holland
THE MODEL-ROBUSTNESS AND OPTIMALITY OF RANDOMIZED DESIGNS
Thomas MATHEW and Dulal Kumar BHAUMIK
University of Maryland Baltimore County, Catonsville, MD 21228, U.S.A
Received 23 October 1987; revised manuscript received 29 August 1988
Recommended by C.S. Cheng
Abstract: The concept of model-robustness introduced by Wu (1981) is investigated for a class
of designs relative to a general optimality criterion, in the setup of Li (1983). A minimaxity result
is established, which demonstrates that, for certain types of model inadequacies, randomization
validates model based optimality results for many designs, including block designs and row-
column designs. The class of optimality criteria considered includes the A- and E-criteria, but not
the D-criterion. A counter-example shows that randomization does not guarantee model-
robustness of the D-optimality property.
AMS Subject Classification: Primary 62K05; secondary 62C20, 62F35
Key words and phrases: Y-optimality; uniform randomization; model-robustness; minimaxity.
1. Introduction and summary
In the context of comparative experiments, a concept of model-robustness has
been introduced by Wu (1981). The results of Wu (1981) and their generalizations
by Li (1983) demonstrate that the use of randomization in the design of such ex-
periments can guarantee robustness against certain types of model inadequacies.
Even though the role of randomization in design of experiments is well recognized
in literature, a rigorous theoretical investigation of the model-robustness aspect of
randomization was undertaken for the first time by the above authors. In the con-
text of optimal designs, their results show that the A-optimality property of many
designs is robust against certain model violations, provided randomization was
employed while designing the experiment. For survey articles dealing with ran-
domization, we refer to Kempthorne (1975, 1977) and Folks (1984).
In the theory of optimal designs, various optimality criteria have been introduced
to judge the ‘goodness’ of a design for a specific inference problem. Some authors
(Kempthorne (1977), Srivastava (1984)) have questioned such optimality considera-
tions, since simple additive models (model (2.1) in the next section) may not always
adequately explain the variability among the treatments and the experimental units
in practical situations. Furthermore, the experimental errors may behave differently
0378-3758/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)
from one experimental unit to the other. Thus, they may have different variances
and may be correlated. Model (2.2) in the next section takes into account the above
inadequacies in (2.1). Since the exact nature of model violations is usually unknown
to us, (2.2) actually specifies a class of models, one of them being the correct model.
Thus, in Section 2, we investigate the performance of an optimal design (computed
under (2.1)), when (2.2) is, in fact, the correct model, for a class of optimality
criteria. Our main result in this section (Theorem 2) is that a design that is optimal
under an inadequate model is minimax or model-robust (as defined later) under the
correct model, provided randomization was employed while designing the experi-
ment. The result is established in the set up of Li (1983) and consequently applies
to single factor designs, block designs and row-column designs.
The class of optimality criteria that we have considered includes the familiar A-
and E-criteria, but not the D-criterion. It turns out that randomization does not en-
sure minimaxity (or model robustness) as far as D-optimality is concerned. We show
this by considering a BIBD, in the set up of Theorem 2. It is shown that, for D-
optimality, the given non-randomized BIBD is superior to a randomization of the
same. This counter-example shows that randomization may or may not guarantee
robustness against model inadequacies depending on the criterion employed. Thus,
a rigorous theoretical justification of randomization, as done by Wu (1981) and Li
(1983), is indeed necessary.
2. Minimax randomized designs under a general optimality criterion
Our set up is the same as that of Theorem 3.2 in Li (1983). Thus, for comparing
$v$ treatments, the design under consideration has $b$ blocks and the arrangement of
the units in the blocks is based on $n$ factors. We assume that the number of levels
of the $i$-th factor is $l_i$ in any block and the number of units in any block is
$k = \prod_{j=1}^{n} l_j$. The arrangement of units is such that each block forms a
hyper-rectangle of size $l_1 \times l_2 \times \cdots \times l_n$, when $n \ge 2$. When $n = 1$, the $k = l_1$ units are
assumed to be of the same level, in each block. Thus the total number of experimental
units in the design is $N = bk$. We will use the symbol $d$ to denote such a design.
If $y_u$ is the yield from the $u$-th unit (assumed to be in the $s$-th block), the usual
linear model for analyzing such a design is
\[ y_u = g_0 + \tau_{d(u)} + \beta_s + \sum_{j=1}^{n} \gamma^{js}_{i_1,\ldots,i_{j-1},i_{j+1},\ldots,i_n} + \varepsilon_u . \]
Here $g_0$ is an overall mean (independent of $u$), $\tau_{d(u)}$ is the effect of the treatment
which appears in unit $u$ (for the design $d$), $\beta_s$ is the $s$-th block effect and
$\gamma^{js}_{i_1,\ldots,i_{j-1},i_{j+1},\ldots,i_n}$ is the interaction effect of all but the $j$-th factor in block $s$ at levels
$i_1,\ldots,i_{j-1},i_{j+1},\ldots,i_n$. Here, all the effects are assumed to be fixed. The $\varepsilon_u$'s are
assumed to be zero mean uncorrelated random variables having a common variance
$\sigma^2$. As pointed out by Li (1983), the models for single factor designs, block designs
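and row-column designs are special cases of the above model. In matrix notation,
the model can be written as
\[ y = g_0 1_{bk} + X_\tau \tau + X_\beta \beta + \sum_{j=1}^{n} (I_b \otimes X_j)\gamma_j + \varepsilon, \tag{2.1} \]
where $y$ is the vector of observations, $1_{bk}$ is a $bk \times 1$ vector of ones, $X_\tau$ and $X_\beta$ are
the respective design matrices for the treatment effects $\tau$ and block effects $\beta$ and
$I_b \otimes X_j$ is the design matrix for the vector of interaction effects of all but the $j$-th
factor in the various blocks (denoted by $\gamma_j$). We note that
\[ X_\beta = I_b \otimes 1_k \quad\text{and}\quad X_j = I_{l_1} \otimes \cdots \otimes I_{l_{j-1}} \otimes 1_{l_j} \otimes I_{l_{j+1}} \otimes \cdots \otimes I_{l_n}. \]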
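The Kronecker structure of these design matrices can be sketched numerically. The following is a minimal illustration, not part of the original development: the helper names `block_matrix` and `factor_matrix` and the layout (two blocks, each a 2 x 3 rectangle, units ordered lexicographically within a block) are assumptions chosen for the example.

```python
from functools import reduce

import numpy as np


def block_matrix(b, k):
    """X_beta = I_b (x) 1_k: every unit of block s loads on column s."""
    return np.kron(np.eye(b), np.ones((k, 1)))


def factor_matrix(levels, j):
    """X_j: Kronecker product of identities with a ones-vector in slot j (1-based).

    Assumes units inside a block are ordered lexicographically by factor levels.
    """
    parts = [np.ones((l, 1)) if idx == j - 1 else np.eye(l)
             for idx, l in enumerate(levels)]
    return reduce(np.kron, parts)


# Hypothetical layout: b = 2 blocks, n = 2 factors with l_1 = 2 and l_2 = 3 levels.
b, levels = 2, (2, 3)
k = int(np.prod(levels))          # units per block
X_beta = block_matrix(b, k)       # shape (b*k, b)
X_1 = factor_matrix(levels, 1)    # shape (k, k / l_1): indexed by the level of factor 2
X_2 = factor_matrix(levels, 2)    # shape (k, k / l_2): indexed by the level of factor 1
```

Each row of `X_1` and `X_2` contains exactly one 1, marking the level combination of the remaining factors for that unit.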
If the assumptions about the $\varepsilon_u$'s are suspect, then the $\varepsilon_u$'s may have nonzero
means and different variances depending on $u$ and may be correlated. The model
in this case is
\[ y = g + X_\tau \tau + X_\beta \beta + \sum_{j=1}^{n} (I_b \otimes X_j)\gamma_j + \varepsilon. \tag{2.2} \]
Here $g$ is an $N \times 1$ vector ($N = bk$) of unknown parameters. We now assume that
$\varepsilon$ has zero mean and covariance matrix $\sigma^2 V$. It is further assumed that $g \in \mathcal{G}$ and
$V \in \mathcal{V}$, where $\mathcal{G}$ and $\mathcal{V}$ are respectively compact subsets of $\mathbb{R}^N$ and the set of real
symmetric matrices. It is also assumed that $\mathcal{G}$ contains a vector with all components
equal and $\mathcal{V}$ contains the identity matrix. In other words, (2.1) is a possible model.
In what follows (2.1) and (2.2) will refer to the above models along with the
corresponding assumptions.
We impose a certain invariance condition on $\mathcal{G}$ and $\mathcal{V}$ which essentially reflects
our ignorance about the exact nature of model violations. Let $H_i^s$ denote the
permutation group on $l_i$ symbols which permute the $l_i$ levels of the $i$-th factor, when
they occur in block $s$ and let $H_b$ denote the permutation group on $\{1, 2, \ldots, b\}$. A
permutation in $H_b$ permutes the blocks. The group generated by all the $H_i^s$ and $H_b$
will be denoted by $H_0$. An element in $H_0$ is a permutation on $N$ symbols where
$N = bk$, the total number of experimental units. The invariance condition on $\mathcal{G}$ and
$\mathcal{V}$, as described in Li (1983), is the following:
(A) If $\pi_0 \in H_0$, $g \in \mathcal{G}$ and $V = ((v_{ij})) \in \mathcal{V}$, then $\pi_0 g \in \mathcal{G}$ and $\pi_0 V \in \mathcal{V}$, where
$(\pi_0 V)_{ij} = v_{\pi_0^{-1} i,\, \pi_0^{-1} j}$ and $(\pi_0 g)_i = g_{\pi_0^{-1} i}$.
Throughout this paper we will be referring to the above condition as assumption
(A). We note that (2.2) in conjunction with assumption (A) specifies a class of
models, only one member in this class being the correct model.
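The action of a permutation on $g$ and $V$ in assumption (A) can be made concrete with a small sketch. The function names `act_on_g` and `act_on_V`, the 0-based indexing, and the toy layout (two blocks of two units) are assumptions for illustration only.

```python
import numpy as np


def act_on_g(p, g):
    """Return pi*g with (pi g)_i = g_{pi^{-1} i}; p[i] = pi(i), 0-based."""
    out = np.empty_like(g)
    out[p] = g                      # out[pi(i)] = g[i]
    return out


def act_on_V(p, V):
    """Return pi*V with (pi V)_{ij} = v_{pi^{-1} i, pi^{-1} j}."""
    out = np.empty_like(V)
    out[np.ix_(p, p)] = V           # out[pi(i), pi(j)] = V[i, j]
    return out


# Hypothetical example: b = 2 blocks of k = 2 units; pi swaps the two blocks.
p = np.array([2, 3, 0, 1])
g = np.array([5.0, 0.0, 0.0, 0.0])  # a mean shift on unit 0 only
pg = act_on_g(p, g)                 # the shift moves to unit 2
pV = act_on_V(p, np.eye(4))         # the identity matrix is left unchanged
```

A vector with all components equal and the identity matrix are fixed by every such permutation, which is consistent with the requirement that model (2.1) belongs to the class.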
Given a design $d$, let $\mathcal{D}$ denote the set of all possible permutations of the units
in $d$ according to the permutations in $H_0$. Any design in $\mathcal{D}$ will be referred to as a
permutation of $d$. A randomization of $d$ is a design selected from $\mathcal{D}$ according to
a probability distribution. Thus, we can denote a randomized design by the pair
$(\eta, d)$, where $\eta$ is a probability distribution on $\mathcal{D}$. According to this notation, if
$d_1 \in \mathcal{D}$, then $(\eta, d)$ and $(\eta, d_1)$ denote the same design. Also, $\bar{\eta}$ will denote the
uniform distribution on $\mathcal{D}$. The randomization of $d$ according to $\bar{\eta}$ will be referred
to as the uniform randomization of $d$.
The inference problem under consideration is the estimation of $L\tau$, where $\tau$ is the
$v$-component vector of treatment effects and $L$ is a known matrix. $L\tau$ is assumed
to be estimable under the model (2.1). Let $F_d y$ denote the least squares estimator
of $L\tau$ under (2.1), when design $d$ is used. If the model (2.1) is correct, then the
dispersion matrix of $F_d y$ is $\sigma^2 F_d F_d'$ and according to optimal design theory, we
should choose a design $d$ that minimizes the scalar valued function $Y(F_d F_d')$. The
function $Y$, defined on the set of nonnegative-definite matrices, is assumed to
satisfy the following conditions:
(i) $Y$ is convex;
(ii) $Y$ is nondecreasing, i.e., if $E_1$, $E_2$ and $E_1 - E_2$ are nonnegative
definite, then $Y(E_1) \ge Y(E_2)$;
(iii) if $Y(E_1) \ge Y(E_2)$ for $E_1$, $E_2$ nonnegative definite, then
$Y(aE_1) \ge Y(aE_2)$ for $a > 0$.    (2.3)
The above optimality criterion is similar to the weak universal optimality criterion
mentioned in Kiefer and Wynn (1981, p. 740). As observed by these authors, the
criteria covered by this definition include the familiar A- and E-optimality criteria
(and some other criteria), but not the D-optimality criterion. A design which
minimizes $Y(F_d F_d')$, for any such $Y$, will be referred to as a Y-optimal design.
Clearly, randomization has no role in such optimality considerations if (2.1) is the
correct model.
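As a rough numerical illustration of why the A- and E-criteria fit conditions (2.3) while the D-criterion does not, the sketch below tests convexity at a single midpoint. It assumes the A-, E- and D-criteria are taken as the trace, the largest eigenvalue and the determinant of a positive definite dispersion matrix; the function names and the two test matrices are hypothetical choices, and one failing pair is enough to rule out convexity but a passing pair proves nothing in general.

```python
import numpy as np


def a_criterion(E):
    """A-criterion: trace of the dispersion matrix."""
    return float(np.trace(E))


def e_criterion(E):
    """E-criterion: largest eigenvalue of the (symmetric) dispersion matrix."""
    return float(np.linalg.eigvalsh(E)[-1])


def d_criterion(E):
    """D-criterion, taken here as the determinant (assumption for this sketch)."""
    return float(np.linalg.det(E))


def convex_at_midpoint(crit, E1, E2, tol=1e-12):
    """Check crit((E1 + E2)/2) <= (crit(E1) + crit(E2))/2 for one pair."""
    mid = (E1 + E2) / 2.0
    return crit(mid) <= (crit(E1) + crit(E2)) / 2.0 + tol


E1 = np.diag([1.0, 4.0])
E2 = np.diag([4.0, 1.0])
checks = {name: convex_at_midpoint(f, E1, E2)
          for name, f in [("A", a_criterion), ("E", e_criterion), ("D", d_criterion)]}
```

For this pair the midpoint is $2.5\,I$, whose determinant 6.25 exceeds the common value 4 at the endpoints, so the determinant violates the convexity condition (i).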
We shall now try to assess the performance of the estimator $F_d y$ (from the point
of view of Y-optimality) when model violations are present (more specifically, when
the model (2.2) holds). Since $F_d y$ is no longer an unbiased estimator of $L\tau$, we
compute the matrix of mean squared errors $W(d, s) = E(F_d y - L\tau)(F_d y - L\tau)'$,
where we write $s = (g, V) \in \mathcal{G} \times \mathcal{V} = \mathcal{S}$. We note that the matrix $W(d, s)$ depends only
on the parameters in $s$ and not on the other parameters in (2.2), since $F_d y$ is
unbiased for $L\tau$ under (2.1). The performance of a randomized strategy $(\eta, d)$ is
evaluated using $M(\eta, d, s) = E_\eta W(d, s)$. We note that $M(\eta, d, s)$ also involves $\sigma^2$,
since, under (2.2), $\operatorname{cov}(\varepsilon) = \sigma^2 V$, $V \in \mathcal{V}$. We do not specify this in $M(\eta, d, s)$, since
$\sigma^2$ does not play any role in our results.
Definition. The randomized design $(\eta^*, d^*)$ is said to be minimax or model-robust if
\[ \min_{(\eta, d)} \max_{s \in \mathcal{S}} Y(M(\eta, d, s)) = \max_{s \in \mathcal{S}} Y(M(\eta^*, d^*, s)) \]
for any $Y$ satisfying (2.3) and for every $\sigma^2 > 0$.
It should be mentioned that the criterion used by Li (1983) is similar to
$E_\eta[Y(W(d, s))]$, which is different from our criterion in the above definition. Since,
in general, there does not exist a randomized design which 'minimizes' the matrix
$M(\eta, d, s)$ for any $s$ (in the usual Loewner ordering of nonnegative definite matrices),
we are trying to minimize a scalar valued function of the above matrix, as is usually
done in optimal design theory. However, with our criterion, it does not appear
possible to prove a result as general as Theorem 2.1 of Li (1983). But the model-
robustness problem can be settled for a general optimality criterion satisfying (2.3).
The theorem proved below is similar to (but not a special case of) Theorem 2.1 in
Li (1983) or Proposition 1 in Wu (1981). We also mention that the theorem is not
true for the D-optimality criterion.
For $s = (g, V) \in \mathcal{S}$ and $\pi_0 \in H_0$, we write $\pi_0 s = (\pi_0 g, \pi_0 V)$. Then assumption (A)
implies $\pi_0 s \in \mathcal{S}$ whenever $s \in \mathcal{S}$.
Theorem 1. Suppose assumption (A) holds and $Y$ satisfies (2.3). Then, for any design $d$ and any probability distribution $\eta$ on $\mathcal{D}$,
\[ \max_{s \in \mathcal{S}} Y(M(\bar{\eta}, d, s)) \le \max_{s \in \mathcal{S}} Y(M(\eta, d, s)), \]
where $\bar{\eta}$ is the uniform distribution on $\mathcal{D}$.
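Proof. Using the convexity of $Y$, the proof is essentially an imitation of the proof
of Proposition 1 in Wu (1981). For $\pi_0 \in H_0$ and for any probability distribution $\eta$
on $\mathcal{D}$, define the probability distribution $\eta_{\pi_0}$ as $\eta_{\pi_0}(d_1) = \eta(\pi_0 d_1)$, for $d_1 \in \mathcal{D}$. Then
$\bar{\eta} = h_0^{-1} \sum_{\pi_0 \in H_0} \eta_{\pi_0}$, where $h_0$ is the order of $H_0$. By using the linearity of $M(\eta, d, s)$ in $\eta$ and the convexity of $Y$, we get
\[ \max_{s \in \mathcal{S}} Y(M(\bar{\eta}, d, s)) \le h_0^{-1} \sum_{\pi_0 \in H_0} \max_{s \in \mathcal{S}} Y(M(\eta_{\pi_0}, d, s)) = h_0^{-1} \sum_{\pi_0 \in H_0} \max_{s \in \mathcal{S}} Y(M(\eta, d, \pi_0 s)) = \max_{s \in \mathcal{S}} Y(M(\eta, d, s)). \]
In order to conclude the last two equalities, we employed the facts
\[ M(\eta_{\pi_0}, d, s) = M(\eta_{\pi_0}, \pi_0 d, \pi_0 s) = M(\eta, d, \pi_0 s) \]
and
\[ \max_{s \in \mathcal{S}} Y(M(\eta, d, \pi_0 s)) = \max_{s \in \mathcal{S}} Y(M(\eta, d, s)), \]
using assumption (A). □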
The main theorem in this section is the following.
Theorem 2. Under the model (2.1), let $L\tau$ be an estimable parametric function and let $d_0$ be a Y-optimal design for estimating $L\tau$, for any $Y$ satisfying (2.3). Then, under the model (2.2), if assumption (A) holds, the uniformly randomized design $(\bar{\eta}, d_0)$ is minimax.
For the A-optimality criterion, Li (1983) has proved Theorem 2 in a very general
set up. In order to prove Theorem 2, we shall use the following lemma, which is
proved in the appendix.
Lemma 1. Let $P$ be a permutation matrix representing a permutation in $H_0$. Then for any real symmetric matrix $A$, $F_d\left(\sum_P PAP'\right)F_d' = \lambda(A) F_d F_d'$, where $\lambda(A)$ is a scalar not depending on $d$.
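Proof of Theorem 2. The proof is based on arguments similar to those in the proof
of Theorem 3.2 in Li (1983, p. 238). The design $d_0$ in the theorem minimizes
$Y(F_d F_d')$, where $F_d y$ is the least squares estimator of $L\tau$. Under (2.2),
\[ W(d, s) = E(F_d y - L\tau)(F_d y - L\tau)' = F_d(\sigma^2 V + gg')F_d'. \]
From Lemma 1, we get $M(\bar{\eta}, d, s) = E_{\bar{\eta}} W(d, s) = \lambda_0(s) F_d F_d'$, where $\lambda_0(s)$ is a scalar
not depending on $d$ ($\lambda_0(s)$ depends on $\sigma^2$).
From the proof of Lemma 1 in the appendix, it follows that for every $\sigma^2$, there
exists $s_0 \in \mathcal{S}$ satisfying $\max_{s \in \mathcal{S}} \lambda_0(s) = \lambda_0(s_0)$. Applying Theorem 1 and using the
fact that $Y$ satisfies (2.3), we get, for any randomized design $(\eta, d)$,
\[ \max_{s \in \mathcal{S}} Y(M(\eta, d, s)) \ge \max_{s \in \mathcal{S}} Y(M(\bar{\eta}, d, s)) \]
\[ = \max_{s \in \mathcal{S}} Y(\lambda_0(s) F_d F_d') = Y(\lambda_0(s_0) F_d F_d'), \quad \text{using 2.3(ii)} \]
\[ \ge Y(\lambda_0(s_0) F_{d_0} F_{d_0}'), \quad \text{using 2.3(iii) and the Y-optimality of } d_0 \]
\[ = \max_{s \in \mathcal{S}} Y(M(\bar{\eta}, d_0, s)), \]
which completes the proof. □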
Theorem 2 immediately gives the model-robustness of many well-known optimal
designs. Thus the A- and E-optimal block designs and Youden squares, for comparing
treatments, given in Kiefer (1975) and in Cheng (1978) are model-robust. So are
the A- and E-optimal designs, for comparing treatments with a control, given in
Majumdar and Notz (1983) and Hedayat and Majumdar (1985).
We now give a counter-example to show that uniform randomization does not
guarantee the model-robustness of the D-optimality property. For this, we consider
a BIBD for comparing 3 treatments A, B, C in 6 blocks as follows: blocks 4, 5 and
6 are the same as blocks 1, 2, and 3 respectively, given below.
Block 1: A B
Block 2: B C
Block 3: A C
If all treatments are of equal interest and if there are no model violations, the
D-optimality criterion (for comparing the treatments) is the product of the nonzero
eigenvalues of $C^+$ (the Moore-Penrose inverse of the C-matrix). Consider model
violations of the following type: the set $\mathcal{G}$ consists of vectors of the form $\lambda e_i$
($\lambda \ne 0$), where the $e_i$'s are the unit vectors in $\mathbb{R}^{12}$ (the design has 12 plots) and $\mathcal{V} =
\{\sigma^2 I : \sigma^2 > 0\}$. Thus the experimental errors are uncorrelated random variables
having common variance $\sigma^2$, but the assumption that they have zero means is
violated. Clearly, with $\mathcal{G}$ and $\mathcal{V}$ as defined above, assumption (A) is satisfied. For
a systematic design (i.e. non-randomized), the D-optimality criterion is the product
of the nonzero eigenvalues of $C_d^+ B_d(\sigma^2 I + gg')B_d' C_d^+$ (see appendix for notations
and further details). If we use the above systematic BIBD with $g = \lambda e_i$ (for any $i$),
then using the expressions for $B_d$ and $C_d^+$ for a BIBD, it follows that the D-criterion
has value $\frac{1}{9}\sigma^2(\sigma^2 + \frac{1}{6}\lambda^2)$. But, if we employ uniform randomization,
then by using (3.8) in the appendix, it follows that the criterion has value
$\frac{1}{9}(\sigma^2 + \frac{1}{12}\lambda^2)^2$, which is strictly greater than the previous value for any $\lambda \ne 0$ and $\sigma^2 > 0$.
Consequently, the use of a systematic design is certainly superior to uniform
randomization. Thus, if D-optimality is the criterion of interest and if we suspect model
violations, then the randomization strategy to be adopted is far from clear.
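The two D-criterion values in this counter-example can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $B_d = X_\tau'(I - I_b \otimes J_k/k)$ as the matrix producing adjusted treatment totals, places the mean shift $\lambda$ on the first plot, and obtains the randomized mean squared error matrix by brute-force averaging over all $6!\,2^6 = 46080$ block and within-block permutations. All function names are hypothetical.

```python
import itertools

import numpy as np


def bibd_matrices():
    """B_d and C_d^+ for the 6-block BIBD on treatments A=0, B=1, C=2."""
    blocks = [(0, 1), (1, 2), (0, 2)] * 2          # blocks 4-6 repeat blocks 1-3
    b, k, v = len(blocks), 2, 3
    X_tau = np.zeros((b * k, v))
    for s, blk in enumerate(blocks):
        for i, t in enumerate(blk):
            X_tau[s * k + i, t] = 1.0
    block_proj = np.kron(np.eye(b), np.ones((k, k)) / k)
    B = X_tau.T @ (np.eye(b * k) - block_proj)     # B y = adjusted treatment totals
    return B, np.linalg.pinv(B @ B.T), b, k


def d_value(W, tol=1e-10):
    """Product of the nonzero eigenvalues of a symmetric nonnegative definite W."""
    ev = np.linalg.eigvalsh((W + W.T) / 2.0)
    return float(np.prod(ev[ev > tol]))


def group_average(A, b, k):
    """Average of P A P' over all block and within-block permutations."""
    total = np.zeros_like(A)
    count = 0
    for sigma in itertools.permutations(range(b)):
        for rhos in itertools.product(itertools.permutations(range(k)), repeat=b):
            p = [sigma[s] * k + rhos[s][i] for s in range(b) for i in range(k)]
            total += A[np.ix_(p, p)]
            count += 1
    return total / count


def compare(sigma2=1.0, lam=2.0):
    """Return (systematic, uniformly randomized) D-criterion values."""
    B, C_plus, b, k = bibd_matrices()
    g = np.zeros(b * k)
    g[0] = lam                                     # mean shift on the first plot
    A = sigma2 * np.eye(b * k) + np.outer(g, g)
    d_sys = d_value(C_plus @ B @ A @ B.T @ C_plus)
    d_rand = d_value(C_plus @ B @ group_average(A, b, k) @ B.T @ C_plus)
    return d_sys, d_rand
```

With $\sigma^2 = 1$ and $\lambda = 2$ the closed-form expressions evaluate to $5/27$ for the systematic design and $16/81$ for the uniform randomization, and `compare()` can be used to confirm that the numerical values agree with them.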
3. Appendix
Proof of Lemma 1. In the proof, $I_c$, $1_c$ will denote the $c \times c$ identity matrix and the
$c \times 1$ vector of ones. We shall also write $J_c = 1_c 1_c'$. When the dimensions are
obvious, we shall denote these quantities by $I$, $1$ and $J$.
Since $F_d y$ is an unbiased estimate of $L\tau$ under (2.1), we get
\[ F_d X_\beta = 0 \quad\text{and}\quad F_d(I_b \otimes X_j) = 0. \tag{3.1} \]
To prove the lemma, note that permuting the blocks is equivalent to applying the
permutation matrix $P_0 \otimes I$ (where $P_0$ is a $b \times b$ permutation matrix). Permuting the
plots within the $i$-th block amounts to applying the permutation matrix
$P_{i1} \otimes P_{i2} \otimes \cdots \otimes P_{in}$ (where $P_{ij}$ is an $l_j \times l_j$ permutation matrix). Thus, permutation of the
plots within the blocks can be achieved by using the permutation matrix
$\operatorname{diag}(P_{11} \otimes \cdots \otimes P_{1n}, \ldots, P_{b1} \otimes \cdots \otimes P_{bn})$. It is easy to verify that, for a $bk \times bk$ real symmetric
matrix $A$,
\[ \sum_{P_0} (P_0 \otimes I) A (P_0 \otimes I)' = \begin{pmatrix} A_1 & A_2 & \cdots & A_2 \\ A_2 & A_1 & \cdots & A_2 \\ \vdots & \vdots & \ddots & \vdots \\ A_2 & A_2 & \cdots & A_1 \end{pmatrix}, \tag{3.2} \]
where $A_1$ and $A_2$ are $k \times k$ real symmetric matrices. We now have to evaluate
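\[ \text{(a)} \quad \sum (P_{11} \otimes \cdots \otimes P_{1n})\, A_1\, (P_{11} \otimes \cdots \otimes P_{1n})' \]
\[ \text{(b)} \quad \sum (P_{11} \otimes \cdots \otimes P_{1n})\, A_2\, (P_{21} \otimes \cdots \otimes P_{2n})' \]
To evaluate (a), write the matrix in (a) as
\[ \sum_{P_{11}, \ldots, P_{1,n-1}} (P_{11} \otimes \cdots \otimes P_{1,n-1} \otimes I) \Big[ \sum_{P_{1n}} (I \otimes P_{1n}) A_1 (I \otimes P_{1n})' \Big] (P_{11} \otimes \cdots \otimes P_{1,n-1} \otimes I)'. \]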
It is easily verified that
\[ \sum_{P_{1n}} (I \otimes P_{1n}) A_1 (I \otimes P_{1n})' = B_1 \otimes (a_{1n} I + b_{1n} J) \]
for some matrix $B_1$. Evaluation of (a) now reduces to evaluating
\[ \sum_{P_{1,n-1}} (I \otimes P_{1,n-1}) B_1 (I \otimes P_{1,n-1})' \otimes (a_{1n} I + b_{1n} J) \]
and continuing this process. Thus the matrix in (a) simplifies to $\lambda(A) I + \sum_i C_i$,
where $\lambda(A)$ is a scalar depending only on $A$ and each $C_i$ is a scalar multiple of the
Kronecker product of $I$ and $J$ matrices, with at least one $J$ matrix. Similarly (b)
simplifies to $\lambda_1(A) J$.
Using these observations and (3.2), it follows that for a permutation matrix $P$ in
$H_0$, $\sum_P PAP'$ has diagonal blocks $\lambda(A) I + \sum_i C_i$ and off-diagonal blocks $\lambda_1(A) J$.
Hence
\[ \sum_P PAP' = I_b \otimes \Big[ \lambda(A) I_k + \sum_i C_i - \lambda_1(A) J_k \Big] + J_b \otimes \lambda_1(A) J_k. \]
Using (3.1) we get $F_d(I_b \otimes C_i) = 0$, $F_d(I_b \otimes J_k) = 0$ and $F_d(J_b \otimes J_k) = 0$. Hence
$F_d\left(\sum_P PAP'\right)F_d' = \lambda(A) F_d F_d'$, where $P$ is a permutation as mentioned in the lemma.
This concludes the proof. □
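Lemma 1 can be checked by brute force on small block designs. The following sketch assumes the block-design case ($n = 1$) with $F_d = C_d^+ B_d$ and $B_d = X_\tau'(I - I_b \otimes J_k/k)$; the two designs, the random symmetric matrix $A$ and all function names are hypothetical choices made for the illustration.

```python
import itertools

import numpy as np


def group_perms(b, k):
    """Yield every block/within-block permutation as an index list of length b*k."""
    for sigma in itertools.permutations(range(b)):
        for rhos in itertools.product(itertools.permutations(range(k)), repeat=b):
            yield [sigma[s] * k + rhos[s][i] for s in range(b) for i in range(k)]


def estimator_matrix(blocks, v):
    """F_d = C_d^+ B_d for a block design given as a list of treatment tuples."""
    b, k = len(blocks), len(blocks[0])
    X_tau = np.zeros((b * k, v))
    for s, blk in enumerate(blocks):
        for i, t in enumerate(blk):
            X_tau[s * k + i, t] = 1.0
    B = X_tau.T @ (np.eye(b * k) - np.kron(np.eye(b), np.ones((k, k)) / k))
    return np.linalg.pinv(B @ B.T) @ B


def lemma1_scalar(blocks, v, A):
    """Return (lambda, ok): the proportionality constant and whether it holds."""
    b, k = len(blocks), len(blocks[0])
    F = estimator_matrix(blocks, v)
    S = sum(A[np.ix_(p, p)] for p in group_perms(b, k))   # sum of P A P' over H_0
    lhs, FFt = F @ S @ F.T, F @ F.T
    lam = np.trace(lhs) / np.trace(FFt)
    return float(lam), bool(np.allclose(lhs, lam * FFt))


rng = np.random.default_rng(0)
M = rng.normal(size=(6, 6))
A = M + M.T                                   # arbitrary real symmetric matrix
lam1, ok1 = lemma1_scalar([(0, 1), (1, 2), (0, 2)], 3, A)
lam2, ok2 = lemma1_scalar([(0, 1), (0, 1), (1, 2)], 3, A)
```

Both designs should return the same scalar, which illustrates the statement that $\lambda(A)$ does not depend on $d$.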
For a block design with $b$ blocks of $k$ plots each, we now give an expression for
$\lambda(A)$. If $C_d$ and $Q_d$ denote the C matrix and the vector of adjusted treatment totals
for a block design $d$, then there exists a matrix $B_d$ satisfying $B_d B_d' = C_d$ and
$B_d y = Q_d$. Since $\hat{\tau} = C_d^+ Q_d = F_d y$ (where $F_d = C_d^+ B_d$), $F_d A F_d' = C_d^+ B_d A B_d' C_d^+$. Then
\[ W(d, s) = C_d^+ B_d(\sigma^2 V + gg')B_d' C_d^+ . \tag{3.3} \]
From Lemma 1, it follows that
\[ \sum_P C_d^+ B_d PAP' B_d' C_d^+ = \lambda(A) C_d^+ B_d B_d' C_d^+ = \lambda(A) C_d^+ . \tag{3.4} \]
Writing $A = (a_{ij})$ and by doing the necessary algebraic computations, it can be
shown that
\[ \lambda(A) = (b-1)!\,(k!)^{b-1}(k-2)! \sum_{u=0}^{b-1} \Big[ (k-1) \sum_{i=1}^{k} a_{uk+i,\,uk+i} - \sum_{i \ne j} a_{uk+i,\,uk+j} \Big]. \tag{3.5} \]
If $A = \sigma^2 I + gg'$, (3.5) simplifies to
\[ \lambda(A) = (b-1)!\,(k!)^{b-1}(k-2)!\,[bk(k-1)\sigma^2 + \delta(g)] \tag{3.6} \]
where
\[ \delta(g) = \sum_{u=0}^{b-1} \sum_{i<j} (g_{uk+i} - g_{uk+j})^2 . \tag{3.7} \]
From (3.4), (3.6) and (3.7) it follows that, for a block design, under the model
(2.2) with $\operatorname{cov}(\varepsilon) = \sigma^2 I$,
\[ M(\bar{\eta}, d, s) = \Big[ \sigma^2 + \frac{1}{bk(k-1)}\,\delta(g) \Big] C_d^+ . \tag{3.8} \]
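The closed forms (3.6) to (3.8) can be compared with a direct enumeration of the permutation group on a small case. This is a hedged sketch: the function names are hypothetical, plots are indexed from 0, and $\lambda(A)$ is read off as the difference between a diagonal entry and a within-block off-diagonal entry of $\sum_P PAP'$, which follows from the block structure derived in the proof of Lemma 1 when $n = 1$.

```python
import itertools
from math import factorial

import numpy as np


def delta(g, b, k):
    """(3.7): sum over blocks of squared differences between plots in the block."""
    G = np.asarray(g, dtype=float).reshape(b, k)
    return float(sum((G[u, i] - G[u, j]) ** 2
                     for u in range(b) for i in range(k) for j in range(i + 1, k)))


def lambda_closed_form(g, sigma2, b, k):
    """(3.6) for A = sigma^2 I + g g'."""
    const = factorial(b - 1) * factorial(k) ** (b - 1) * factorial(k - 2)
    return const * (b * k * (k - 1) * sigma2 + delta(g, b, k))


def lambda_brute_force(g, sigma2, b, k):
    """lambda(A) from the explicit sum of P A P' over all of H_0 (block designs)."""
    g = np.asarray(g, dtype=float)
    A = sigma2 * np.eye(b * k) + np.outer(g, g)
    S = np.zeros_like(A)
    for sigma in itertools.permutations(range(b)):
        for rhos in itertools.product(itertools.permutations(range(k)), repeat=b):
            p = [sigma[s] * k + rhos[s][i] for s in range(b) for i in range(k)]
            S += A[np.ix_(p, p)]
    return float(S[0, 0] - S[0, 1])     # plots 0 and 1 lie in the same block


# Hypothetical small case: b = 2 blocks of k = 3 plots, arbitrary mean shifts g.
b, k, sigma2 = 2, 3, 1.5
g = [0.3, -1.0, 2.0, 0.0, 0.7, -0.4]
brute = lambda_brute_force(g, sigma2, b, k)
closed = lambda_closed_form(g, sigma2, b, k)
h0 = factorial(b) * factorial(k) ** b   # order of H_0
```

Dividing either value by `h0` gives the scalar multiplying $C_d^+$ in (3.8), namely $\sigma^2 + \delta(g)/(bk(k-1))$.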
Acknowledgement
The authors are grateful to a reviewer for pointing out an error in an original
draft of the paper. Detailed comments by a referee resulted in the elimination of
some errors and improved presentation.
References
Cheng, C.S. (1978). Optimality of certain asymmetrical experimental designs. Ann. Statist. 6,
1239-1261.
Folks, J.L. (1984). Use of randomization in experimental research. In: K. Hinkelmann, Ed.,
Experimental Design, Statistical Models and Genetic Statistics. Marcel Dekker, New York, 17-32.
Hedayat, A.S. and D. Majumdar (1985). Families of A-optimal block designs for comparing test
treatments with a control. Ann. Statist. 13, 757-767.
Kempthorne, O. (1975). Inference from experiments and randomization. In: J.N. Srivastava, Ed., A
Survey of Statistical Design and Linear Models. North-Holland, Amsterdam, 303-331.
Kempthorne, O. (1977). Why randomize? J. Statist. Plann. Inference 1, 1-25.
Kiefer, J. (1975). On the construction and optimality of generalized Youden designs. In: J.N. Srivastava,
Ed., A Survey of Statistical Design and Linear Models. North-Holland, Amsterdam, 333-353.
Kiefer, J. and H.P. Wynn (1981). Optimum balanced block and Latin square designs for correlated
observations. Ann. Statist. 9, 737-757.
Li, K.C. (1983). Minimaxity for randomized designs: some general results. Ann. Statist. 11, 225-239.
Majumdar, D. and W.I. Notz (1983). Optimal incomplete block designs for comparing treatments with
a control. Ann. Statist. 11, 258-266.
Srivastava, J. (1984). Sensitivity and revealing power: two fundamental statistical criteria other than
optimality arising in discrete experimentation. In: K. Hinkelmann, Ed., Experimental Design, Statistical
Models and Genetic Statistics. Marcel Dekker, New York, 95-117.
Wu, C.F. (1981). On the robustness and efficiency of some randomized designs. Ann. Statist. 9,
1168-1177.