Tabu search for covering arrays using permutation vectors



Journal of Statistical Planning and Inference 139 (2009) 69 -- 80

Contents lists available at ScienceDirect

Journal of Statistical Planning and Inference

journal homepage: www.elsevier.com/locate/jspi

Tabu search for covering arrays using permutation vectors

Robert A. Walker II, Charles J. Colbourn∗

Computer Science and Engineering, Arizona State University, P.O. Box 878809, Tempe, AZ 85287, USA

ARTICLE INFO

Available online 27 May 2008

Keywords: Covering array; Orthogonal array; Permutation vector; Tabu search; Heuristic search

ABSTRACT

A covering array CA(N; t, k, v) is an N × k array, in which in every N × t subarray, each of the $v^t$ possible t-tuples over v symbols occurs at least once. The parameter t is the strength of the array. Covering arrays have a wide range of applications for experimental screening designs, particularly for software interaction testing. A compact representation of certain covering arrays employs “permutation vectors” to encode $v^t \times 1$ subarrays of the covering array so that a covering perfect hash family whose entries correspond to permutation vectors yields a covering array. We introduce a method for effective search for covering arrays of this type using tabu search. Using this technique, improved covering arrays of strength 3, 4 and 5 have been found, as well as the first arrays of strength 6 and 7 found by computational search.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

A covering array CA(N; t, k, v) is an N × k array in which every subarray induced by a selection of t columns contains all possible t-tuples over v symbols. Fig. 1 shows a CA(13; 3, 10, 2). A CA($v^t$; t, k, v) is an orthogonal array, denoted OA(t, k, v); in this case every t-tuple occurs exactly once. The smallest N for which a CA(N; t, k, v) exists is the covering array number, denoted CAN(t, k, v).

Screening experiments are often used to indicate factors and levels that impact response; once such factors are identified, more detailed models can then be constructed to measure main effects and interactions. A particular case arises in testing a complex system for unexpected interactions; in experimental design, covering arrays arise primarily in this setting. Covering arrays have been the focus of much research, primarily due to their applications in software and hardware interaction testing. These applications are discussed in Cohen et al. (1997) and Colbourn (2004). Applications in biological sciences also arise (Shasha et al., 2001).

Our focus is on construction techniques, rather than on the specific application to experimental design. Techniques used to construct covering arrays include recursive methods (for examples see Hartman and Raskin, 2004; Martirosyan and Van Trung, 2004; Sloane, 1993), algebraic methods (Chateauneuf and Kreher, 2002; Hedayat et al., 1999), and computational search such as in Cohen (2004, 2005) and Nurmela (2004). Recently, Sherwood et al. (2006) exploited a compact representation of covering arrays based on permutation vectors. When v is prime or a prime power, a covering perfect hash family CPHF(n; k, $v^{t-1}$, t) is an n × k array on $v^{t-1}$ symbols such that every n × t subarray contains at least one row which is “covering” in the following sense.

∗ Corresponding author. Tel.: +1 480 727 6631; fax: +1 480 965 2751. E-mail addresses: [email protected] (R.A. Walker), [email protected] (C.J. Colbourn).

0378-3758/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2008.05.020

Fig. 1. A CA(13; 3, 10, 2).

The $v^{t-1}$ symbols in a CPHF can be viewed as $(t-1)$-tuples on $v$ symbols. Such a $(t-1)$-tuple represents a permutation vector of length $v^t$ over the elements of the finite field $\mathbb{F}_v$. Given a $(t-1)$-tuple $(h_1, h_2, \ldots, h_{t-1})$ with $h_j \in \{0, 1, \ldots, v-1\}$ for $1 \le j \le t-1$, a permutation vector $(\overrightarrow{h_1, h_2, \ldots, h_{t-1}})$ of length $v^t$ has the symbol

$$(h_{t-1} \cdot \beta^{(i)}_{t-1}) + \cdots + (h_2 \cdot \beta^{(i)}_2) + (h_1 \cdot \beta^{(i)}_1) + \beta^{(i)}_0$$

in position $i$, where $i$ is represented in base $v$ as $i = \sum_{k=0}^{t-1} v^k \cdot \beta^{(i)}_k$. A row is covering if the expansion of the permutation vectors into columns results in an OA(t, t, v). When every symbol in a CPHF is expanded in this manner, the result is a covering array.

When $i < v$, $\beta^{(i)}_k = 0$ for $k > 0$. Hence, every permutation vector starts with the sequence $0, 1, \ldots, v-1$. Eliminating these duplicate rows leads to the key theorem of Sherwood et al. (2006):

Theorem 1.1. If v is a prime or a prime power, and a CPHF(n; k, $v^{t-1}$, t) exists, then a CA($n \cdot (v^t - v) + v$; t, k, v) exists.

We typically omit the exponent, and refer to a CPHF(n; k, v, t) instead of a CPHF(n; k, $v^{t-1}$, t).

Using backtracking, Sherwood et al. (2006) found covering arrays for strengths 3 and 4 that improve upon other known constructions. In this paper, we employ the permutation vector representation as the basis of a tabu search method. In this way, we find a number of improved covering arrays for strengths 3–5; more surprisingly, we find the first covering arrays of strength 6 and 7 from computer search. We conclude by presenting the first existence tables for covering arrays of strength 5, partly to demonstrate the utility of the arrays found by the heuristic search method.

2. Forming CAs from CPHFs

In order to understand the construction underlying Theorem 1.1, we show the expansion of the following CPHF(2; 10, 3, 3) into a CA:

11 00 22 21 01 02 10 11 02 12
10 01 11 11 00 22 01 02 20 12

Write each of the $v^{t-1}$ symbols as a $(t-1)$-tuple on $v$ symbols (in this case, the $3^2$ symbols as 2-tuples on 3 symbols). To convert the symbol 11 ($h_1 = 1$, $h_2 = 1$) into a vector of length $3^3$, each row number $i$ is written as a $t$-tuple on $v$ symbols, namely its base-$v$ representation. Hence for example $i = 0$ is written as $i = 000$ and $i = 17$ is written as $i = 122$. For row $i = 000$, the vector is assigned the value $0 \cdot 1 + 0 \cdot 1 + 0 = 0$, with arithmetic in $\mathbb{Z}_3$. Continuing in this manner,

i = 001: 0 · 1 + 0 · 1 + 1 = 1
i = 002: 0 · 1 + 0 · 1 + 2 = 2
i = 010: 0 · 1 + 1 · 1 + 0 = 1
i = 011: 0 · 1 + 1 · 1 + 1 = 2
i = 012: 0 · 1 + 1 · 1 + 2 = 0
i = 020: 0 · 1 + 2 · 1 + 0 = 2
i = 021: 0 · 1 + 2 · 1 + 1 = 0
i = 022: 0 · 1 + 2 · 1 + 2 = 1
...
i = 212: 2 · 1 + 1 · 1 + 2 = 2
i = 220: 2 · 1 + 2 · 1 + 0 = 1
i = 221: 2 · 1 + 2 · 1 + 1 = 2
i = 222: 2 · 1 + 2 · 1 + 2 = 0


Fig. 2. A CA(54; 3, 10, 3)$^T$.

Expanding each symbol of the CPHF in this manner, the CA(54; 3, 10, 3) shown transposed in Fig. 2 is obtained. The first v rows (columns as shown) of every permutation vector are the same, in this case 0, 1, 2. We only need to use one copy of each. So we reduce the CA(54; 3, 10, 3) to a CA(51; 3, 10, 3). This process is described in more detail in Sherwood et al. (2006).
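The expansion described in this section can be sketched in a few lines of Python. This is an illustrative sketch only, not the program used for the paper: it assumes v is prime, so that field arithmetic is arithmetic modulo v, and the helper names `permutation_vector`, `expand_cphf`, `is_covering_array` and `parse` are hypothetical names introduced here. It expands the CPHF(2; 10, 3, 3) above, keeps the v constant rows once, and checks the strength-3 covering property by brute force.

```python
from itertools import combinations

def permutation_vector(h, v):
    """Expand h = (h_1, ..., h_{t-1}) into a vector of length v**t (v assumed prime)."""
    t = len(h) + 1
    vec = []
    for i in range(v ** t):
        # Base-v digits of i, least significant first: digits[k] plays the role of beta_k.
        digits = [(i // v ** k) % v for k in range(t)]
        vec.append((digits[0] + sum(h[k - 1] * digits[k] for k in range(1, t))) % v)
    return vec

def expand_cphf(cphf, v):
    """Turn a CPHF (rows of tuples) into covering array rows, as in Theorem 1.1."""
    t = len(cphf[0][0]) + 1
    k = len(cphf[0])
    rows = [[s] * k for s in range(v)]  # the v constant rows, kept only once
    for cphf_row in cphf:
        columns = [permutation_vector(h, v) for h in cphf_row]
        rows.extend([col[i] for col in columns] for i in range(v, v ** t))
    return rows

def is_covering_array(rows, t, v):
    """Brute-force check that every choice of t columns sees all v**t tuples."""
    k = len(rows[0])
    return all(
        len({tuple(r[c] for c in cols) for r in rows}) == v ** t
        for cols in combinations(range(k), t)
    )

def parse(text):
    """Read symbols such as '21' as the tuple (h_1, h_2) = (2, 1)."""
    return [[tuple(int(d) for d in sym) for sym in line.split()]
            for line in text.strip().splitlines()]

CPHF_2_10_3_3 = parse("""
11 00 22 21 01 02 10 11 02 12
10 01 11 11 00 22 01 02 20 12
""")

ca = expand_cphf(CPHF_2_10_3_3, 3)
print(len(ca), len(ca[0]), is_covering_array(ca, 3, 3))
```

The row count follows Theorem 1.1 with n = 2, v = 3 and t = 3, namely 2 · (27 − 3) + 3 = 51.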

3. Mathematical preliminaries

We first discuss how to determine whether a set of permutation vectors is covering. Consider t permutation vectors:

$$(\overrightarrow{h^{(1)}_1, h^{(1)}_2, \ldots, h^{(1)}_{t-1}}),\ (\overrightarrow{h^{(2)}_1, h^{(2)}_2, \ldots, h^{(2)}_{t-1}}),\ \ldots,\ (\overrightarrow{h^{(t)}_1, h^{(t)}_2, \ldots, h^{(t)}_{t-1}})$$

This set of permutation vectors is covering if its expansion into a $v^t \times t$ array is an orthogonal array. To check if this condition is not met, we check to see if the array contains some t-tuple twice as a row. Hence, a set of permutation vectors is non-covering if and only if we can find distinct $i, j \in \{0, 1, \ldots, v^t - 1\}$ so that

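$$
\begin{aligned}
\beta^{(i)}_0 + (h^{(1)}_1 \cdot \beta^{(i)}_1) + (h^{(1)}_2 \cdot \beta^{(i)}_2) + \cdots + (h^{(1)}_{t-1} \cdot \beta^{(i)}_{t-1}) &= \beta^{(j)}_0 + (h^{(1)}_1 \cdot \beta^{(j)}_1) + (h^{(1)}_2 \cdot \beta^{(j)}_2) + \cdots + (h^{(1)}_{t-1} \cdot \beta^{(j)}_{t-1}) \\
\beta^{(i)}_0 + (h^{(2)}_1 \cdot \beta^{(i)}_1) + (h^{(2)}_2 \cdot \beta^{(i)}_2) + \cdots + (h^{(2)}_{t-1} \cdot \beta^{(i)}_{t-1}) &= \beta^{(j)}_0 + (h^{(2)}_1 \cdot \beta^{(j)}_1) + (h^{(2)}_2 \cdot \beta^{(j)}_2) + \cdots + (h^{(2)}_{t-1} \cdot \beta^{(j)}_{t-1}) \\
&\ \,\vdots \\
\beta^{(i)}_0 + (h^{(t)}_1 \cdot \beta^{(i)}_1) + (h^{(t)}_2 \cdot \beta^{(i)}_2) + \cdots + (h^{(t)}_{t-1} \cdot \beta^{(i)}_{t-1}) &= \beta^{(j)}_0 + (h^{(t)}_1 \cdot \beta^{(j)}_1) + (h^{(t)}_2 \cdot \beta^{(j)}_2) + \cdots + (h^{(t)}_{t-1} \cdot \beta^{(j)}_{t-1})
\end{aligned}
\tag{1}
$$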

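Write $\delta_r = \beta^{(i)}_r - \beta^{(j)}_r$ for $0 \le r \le t-1$ with fixed $i$ and $j$. Then rewrite (1) as

$$
\begin{aligned}
\delta_0 + (h^{(1)}_1 \cdot \delta_1) + (h^{(1)}_2 \cdot \delta_2) + \cdots + (h^{(1)}_{t-1} \cdot \delta_{t-1}) &= 0 \\
\delta_0 + (h^{(2)}_1 \cdot \delta_1) + (h^{(2)}_2 \cdot \delta_2) + \cdots + (h^{(2)}_{t-1} \cdot \delta_{t-1}) &= 0 \\
&\ \,\vdots \\
\delta_0 + (h^{(t)}_1 \cdot \delta_1) + (h^{(t)}_2 \cdot \delta_2) + \cdots + (h^{(t)}_{t-1} \cdot \delta_{t-1}) &= 0
\end{aligned}
\tag{2}
$$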

The set of permutation vectors is non-covering if and only if there exist $\{\delta_r : 0 \le r \le t-1\}$, with $\delta_r \neq 0$ for at least one $r$, that solve the system of linear equations (2). As an example, consider the array:

CPHF(3; 22, 3, 3)
11 11 21 10 21 22 02 20 12 01 00 01 11 10 00 02 00 20 12 10 21 22
20 21 22 21 11 01 01 11 01 00 20 22 10 12 12 02 02 20 10 00 10 11
01 22 21 00 01 02 12 10 10 22 21 12 02 12 22 20 00 11 20 10 11 20

Each symbol represents a permutation vector. For instance, the symbol 01 represents the vector with $h_1 = 0$, $h_2 = 1$. For the example, the field $\mathbb{F}_3$ is simply $\mathbb{Z}_3$.


Let us consider the first three columns of this array. The first row contains 11 11 21. This row is non-covering due to the duplication of the vector represented by 11. The second row contains 20 21 22. For this set of vectors, a solution to (2) is given by $(\delta_0, \delta_1, \delta_2) = (1, 1, 0)$ (arithmetic in $\mathbb{Z}_3$):

1 + 1 · 2 + 0 · 0 = 1 + 2 = 0

1 + 1 · 2 + 0 · 1 = 1 + 2 = 0

1 + 1 · 2 + 0 · 2 = 1 + 2 = 0

Hence the second row yields a distinct non-covering tuple. However, the third row, 01 22 21, is covering because there is no nonzero choice of $(\delta_0, \delta_1, \delta_2)$ to solve (2). Because this row is covering, it expands into an orthogonal array, shown transposed:

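$$
\begin{bmatrix}
0 & 1 & 2 & 0 & 1 & 2 & 0 & 1 & 2 & 1 & 2 & 0 & 1 & 2 & 0 & 1 & 2 & 0 & 2 & 0 & 1 & 2 & 0 & 1 & 2 & 0 & 1 \\
0 & 1 & 2 & 2 & 0 & 1 & 1 & 2 & 0 & 2 & 0 & 1 & 1 & 2 & 0 & 0 & 1 & 2 & 1 & 2 & 0 & 0 & 1 & 2 & 2 & 0 & 1 \\
0 & 1 & 2 & 2 & 0 & 1 & 1 & 2 & 0 & 1 & 2 & 0 & 0 & 1 & 2 & 2 & 0 & 1 & 2 & 0 & 1 & 1 & 2 & 0 & 0 & 1 & 2
\end{bmatrix}
$$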

We include the three constant rows (columns as shown) only once each. Hence the CPHF(3; 22, 3, 3) yields a CA($3 \cdot (3^3 - 3) + 3 = 75$; 3, 22, 3) and establishes the bound CAN(3, 22, 3) $\le$ 75.
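The test based on system (2) can be illustrated with a brute-force search over all nonzero assignments of the differences. This is a sketch for small prime v only (so that arithmetic is modulo v); the function name `noncovering_witness` is a hypothetical helper, and exhaustive enumeration is used here purely for clarity, not as the method of the paper.

```python
from itertools import product

def noncovering_witness(vectors, v):
    """Return a nonzero (delta_0, ..., delta_{t-1}) solving system (2), or None.

    `vectors` holds t tuples (h_1, ..., h_{t-1}); v is assumed prime.
    """
    t = len(vectors)
    for delta in product(range(v), repeat=t):
        if not any(delta):
            continue  # the all-zero assignment corresponds to i = j and is excluded
        if all((delta[0] + sum(h[r - 1] * delta[r] for r in range(1, t))) % v == 0
               for h in vectors):
            return delta
    return None

# First three columns of the CPHF(3; 22, 3, 3) example, one call per row.
print(noncovering_witness([(1, 1), (1, 1), (2, 1)], 3))  # a witness exists: non-covering
print(noncovering_witness([(2, 0), (2, 1), (2, 2)], 3))  # a witness exists: non-covering
print(noncovering_witness([(0, 1), (2, 2), (2, 1)], 3))  # no witness: covering
```

For the second row the search returns the same solution (1, 1, 0) that is worked out above.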

4. Tabu search

Nurmela's tabu search results for covering arrays (Nurmela, 2004) treat the actual covering array. Our tabu search method searches instead for the covering perfect hash family in order to reduce computation time significantly. We maintain a current candidate array with an associated score. We also maintain a list of recent states as a tabu list. We then generate a neighborhood of moves. We choose from these the move with the best score that does not take us into a state in the tabu list. More information on the tabu search method can be found in Glover and Laguna (1997).

The score S of a given candidate array is the number of sets of t columns that have no covering row; such a set of columns is uncovered, and each column in an uncovered set is deficient. By definition, $0 \le S \le \binom{k}{t}$ and an array with score S = 0 is a CPHF.

A move changes one element of the array to a new value. Not all potential moves need to be considered. Changing an element within a column that is not deficient can have no positive effect. Therefore, we limit the neighborhood of moves to deficient columns. We denote the number of deficient columns as D, so that $D \le k$ and $D \le S \cdot t$.

There are $nD(v^{t-1} - 1)$ moves to consider. For each, we compute the score the new array would have. We cache information about which set(s) of columns are covered by which row(s) and thus consider only the $\binom{k-1}{t-1}$ sets of columns that contain the element being changed. This is slightly less efficient than the “cost change table” discussed in Fleurant and Ferland (1996), but employs similar ideas and uses less memory. In the event that two or more moves result in the same best score, we choose among them randomly.
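The score, the deficient columns and the neighborhood size can be computed directly from these definitions. The sketch below is a naive illustration under stated assumptions: v is prime, the covering test is a brute-force search over the differences rather than one of the faster tests, no caching is done, and the function names are hypothetical.

```python
from itertools import combinations, product

def is_covering(vectors, v):
    """True if no nonzero delta solves the linear system for these t tuples (v prime)."""
    t = len(vectors)
    for delta in product(range(v), repeat=t):
        if any(delta) and all(
            (delta[0] + sum(h[r - 1] * delta[r] for r in range(1, t))) % v == 0
            for h in vectors
        ):
            return False
    return True

def uncovered_sets(array, v, t):
    """All sets of t columns for which no row of the array is covering."""
    k = len(array[0])
    return [cols for cols in combinations(range(k), t)
            if not any(is_covering([row[c] for c in cols], v) for row in array)]

def score_and_deficient(array, v, t):
    """Return (S, sorted list of deficient columns)."""
    uncovered = uncovered_sets(array, v, t)
    return len(uncovered), sorted({c for cols in uncovered for c in cols})

def neighborhood_size(array, v, t):
    """Number of candidate moves n * D * (v**(t-1) - 1)."""
    _, deficient = score_and_deficient(array, v, t)
    return len(array) * len(deficient) * (v ** (t - 1) - 1)

def parse(text):
    return [[tuple(int(d) for d in sym) for sym in line.split()]
            for line in text.strip().splitlines()]

full = parse("""
11 00 22 21 01 02 10 11 02 12
10 01 11 11 00 22 01 02 20 12
""")
first_row_only = full[:1]

print(score_and_deficient(full, 3, 3))
print(score_and_deficient(first_row_only, 3, 3), neighborhood_size(first_row_only, 3, 3))
```

Dropping the second row of the CPHF(2; 10, 3, 3) from Section 2 leaves uncovered column sets, which is the situation the search starts from.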

We maintain a tabu list of the last 50,000 moves made. This number was chosen after extensive experimentation; larger tabu lists did not improve the results in our cases. Using this list, we are able to generate a list of moves that take the current array back to a tabu array. This is discussed in more detail in Section 4.1.

We start the search with a randomly generated array. At the beginning of the search, we usually have D = k, which leads to a large number of moves to consider. It is therefore helpful to restrict the neighborhood examined. To do this, we select one column in a weighted random fashion, where the weight of each column is the number of non-covering sets to which it belongs. By restricting the neighborhood to changes within this column, we can increase the speed by a factor of D with only a minor decrease in search effectiveness. Once D becomes small (ideally $D < k$), we remove this neighborhood restriction. When D does not reach the threshold to be considered small, we turn the neighborhood restriction off when the search appears to “stall”.
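The weighted column selection can be sketched as follows. The function name `pick_column` and the representation of uncovered sets as tuples of column indices are assumptions made for this illustration; the sketch presumes at least one uncovered set exists.

```python
import random
from collections import Counter

def pick_column(uncovered, rng=random):
    """Pick one column with probability proportional to the number of
    uncovered (non-covering) column sets that contain it."""
    weights = Counter(c for cols in uncovered for c in cols)
    columns = sorted(weights)
    return rng.choices(columns, weights=[weights[c] for c in columns], k=1)[0]

# Hypothetical uncovered sets for a strength-3 search on six columns.
uncovered = [(0, 1, 2), (0, 1, 3), (0, 2, 4)]
rng = random.Random(0)
draws = Counter(pick_column(uncovered, rng) for _ in range(9000))
print(draws.most_common())
```

Column 0 belongs to all three uncovered sets and is therefore drawn most often, while column 5, which is not deficient, is never selected.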

4.1. The tabu list

Tabu search relies on keeping a fixed-length list of recent states known as a tabu list. The search program is then prohibited from revisiting these states. This technique helps to prevent cycles.

For large arrays, storing a long list of arrays is prohibitive both for memory and computation time, when one must compare each target array with every array in this list. Nurmela (2004) employs a general strategy discussed in Glover and Laguna (1997) in order to simplify the tabu condition. Instead of storing a long list of recent arrays, Nurmela stores a short list of recent positions that were modified. Changes to these positions are prohibited until enough moves have occurred that they are no longer considered tabu. This technique prevents revisiting any recent states; however, it can disallow moves that should not be tabu. The size of the tabu list with this technique must be shorter than the size of exact tabu lists; otherwise it is possible to make every move tabu.

In the search, we keep an undo list of recent moves on which a move is specified by a row, a column, and the value that was replaced. At each iteration of the search, we can convert this to a list of moves taking the current array back to a tabu state. To accomplish this, we use Algorithm 1. It works by keeping a list of differences between the old array and the current array.


Whenever there is exactly one position that differs, we declare changing that position to the value it held previously to be a tabu move.

Algorithm 1. Convert undo list into tabu list.

StateCount ← 0
Changes ← ∅
TabuMoves ← ∅
for Undo ∈ UndoHistory from most recent to oldest do
    if CurrentArray[Undo.Row][Undo.Col] = Undo.OldValue then
        Remove [Undo.Row][Undo.Col] from Changes
        StateCount ← StateCount − 1
    else
        if [Undo.Row, Undo.Col] in Changes then
            Changes[Undo.Row][Undo.Col] = Undo.OldValue
        else
            Insert [Undo.Row][Undo.Col] = Undo.OldValue into Changes
            StateCount ← StateCount + 1
        end if
    end if
    if StateCount = 1 then
        Insert Changes[1] into TabuMoves
    end if
end for
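Algorithm 1 translates almost line for line into Python. The sketch below is one possible rendering, not the authors' code: the `Undo` record and the function name `tabu_moves` are hypothetical, and the counter StateCount is replaced by the size of the dictionary of differences, which holds the same value as long as every recorded move really changed its position.

```python
from typing import NamedTuple

class Undo(NamedTuple):
    row: int
    col: int
    old_value: int  # value the position held before the move

def tabu_moves(current, undo_history):
    """Return the set of (row, col, value) moves that would recreate a recent state.

    `undo_history` is ordered from oldest to most recent move.
    """
    changes = {}   # (row, col) -> value held in the older state being reconstructed
    tabu = set()
    for undo in reversed(undo_history):  # most recent first
        pos = (undo.row, undo.col)
        if current[undo.row][undo.col] == undo.old_value:
            changes.pop(pos, None)       # this position now agrees with the current array
        else:
            changes[pos] = undo.old_value
        if len(changes) == 1:            # exactly one position differs
            (r, c), value = next(iter(changes.items()))
            tabu.add((r, c, value))
    return tabu

# Three moves on a 2 x 2 array that started as all zeros:
# (0,0): 0 -> 1, then (0,1): 0 -> 2, then (0,0): 1 -> 0.
current = [[0, 2], [0, 0]]
history = [Undo(0, 0, 0), Undo(0, 1, 0), Undo(0, 0, 1)]
print(sorted(tabu_moves(current, history)))
```

In this example two earlier states differ from the current array in a single position, so two moves are declared tabu: setting position (0, 0) back to 1 and setting position (0, 1) back to 0.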

Using this more exact tabu list resulted in much faster searches and better results than using the technique given by Nurmela. This algorithm can be used for any tabu search that operates on arrays. Similar techniques are discussed in much more detail in Glover and Laguna (1997); in particular, the method of “reverse elimination” discussed therein employs logical structure to infer tabu moves rather than storing them explicitly, as we do here.

4.2. The non-covering test

The search program very frequently needs to test whether a given set of permutation vectors is covering or non-covering. Since this test comprises the most executed portion of the program, it is important for it to run as fast as possible.

Recall that a given set of permutation vectors is non-covering if and only if there exist non-zero $\delta_r$ that solve the equation

$$\delta_0 + (h^{(i)}_1 \times \delta_1) + (h^{(i)}_2 \times \delta_2) + \cdots + (h^{(i)}_{t-1} \times \delta_{t-1}) = 0 \tag{3}$$

for all $h^{(i)}$ in the set. It is possible to perform a lot of this work before the search begins. We think of it in reverse, as follows. Given non-zero values $\delta_r$ there are $v^{t-2}$ permutation vectors $h^{(i)}$ so that (3) holds (Sherwood et al., 2006). We can pre-compute the set of vectors that is “solved” by each assignment of the $\delta_r$'s. Call this set $V(\{\delta_0, \delta_1, \ldots, \delta_{t-1}\})$. Then, a given set T of permutation vectors is non-covering if and only if there exist non-zero $\delta_r$'s so that $T \subseteq V(\{\delta_0, \delta_1, \ldots, \delta_{t-1}\})$. Testing this condition is much faster than solving the linear system, and uses very little memory. In the worst case, this method runs in $O(t \cdot v^{t-1} \cdot (t-2) \log v)$ time since there are t elements to test for inclusion in $O(v^{t-1})$ sets using binary search on $v^{t-2}$ elements. Because it uses so little memory, we use this method for large t and v, specifically when $v^{2(t-1)} > 5000000$.
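The precomputation of the sets V can be sketched as below. Several choices here are assumptions of this illustration and not taken from the paper: v is prime, each assignment of the differences is normalised so that its first nonzero entry is 1 (scalar multiples solve the same equations), Python `frozenset` membership replaces the binary search on sorted lists, and the names `build_solution_sets` and `is_noncovering` are hypothetical.

```python
from itertools import product

def build_solution_sets(v, t):
    """Map each normalised nonzero delta to V(delta), the tuples h satisfying (3)."""
    table = {}
    for delta in product(range(v), repeat=t):
        lead = next((d for d in delta if d), 0)
        if lead != 1:
            continue  # skip the zero vector and non-normalised scalar multiples
        solved = frozenset(
            h for h in product(range(v), repeat=t - 1)
            if (delta[0] + sum(h[r - 1] * delta[r] for r in range(1, t))) % v == 0
        )
        if solved:  # delta = (1, 0, ..., 0) solves nothing and is dropped
            table[delta] = solved
    return table

def is_noncovering(vectors, table):
    """T is non-covering iff T is contained in V(delta) for some delta."""
    members = set(vectors)
    return any(members <= solved for solved in table.values())

table = build_solution_sets(3, 3)
print(len(table), {len(s) for s in table.values()})
print(is_noncovering([(2, 0), (2, 1), (2, 2)], table))
print(is_noncovering([(0, 1), (2, 2), (2, 1)], table))
```

For v = 3 and t = 3 every stored set has $v^{t-2} = 3$ elements, in line with the count quoted from Sherwood et al. (2006).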

Another method is to compute the set $V^{-1}((h_1, h_2, \ldots, h_{t-1}))$ of $\delta_r$'s that solves (2) for permutation vector $(h_1, h_2, \ldots, h_{t-1})$. Then a given set T of permutation vectors is non-covering if and only if

$$\bigcap_{h \in T} V^{-1}(h) \neq \emptyset \tag{4}$$

We accelerate this by storing not only $V^{-1}(h)$ but also the values of $V^{-1}(h) \cap V^{-1}(k)$ for any two permutation vectors h and k. This uses significantly more space than the previous method. However, storing the $V^{-1}$ sets in a sorted list lets us check the non-covering condition in $O((\lceil t/2 \rceil - 1) v^{t-1})$ time. When t = 4 this is just $O(v^3)$ time. This method is used for $t > 4$ and for $t = 4$, $v \ge 5$.
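Condition (4) can be sketched with one set per permutation vector. As before this is only an illustration under assumptions: v is prime, the differences are normalised so that the first nonzero entry is 1, the cache of pairwise intersections described above is omitted for brevity, and the names `inverse_sets` and `is_noncovering` are hypothetical.

```python
from functools import reduce
from itertools import product

def inverse_sets(v, t):
    """Map each tuple h to V^{-1}(h): the normalised nonzero deltas solving (3) for h."""
    deltas = [d for d in product(range(v), repeat=t)
              if next((x for x in d if x), 0) == 1]
    return {
        h: frozenset(d for d in deltas
                     if (d[0] + sum(h[r - 1] * d[r] for r in range(1, t))) % v == 0)
        for h in product(range(v), repeat=t - 1)
    }

def is_noncovering(vectors, inv):
    """Condition (4): the sets V^{-1}(h), h in T, have a common element."""
    common = reduce(frozenset.intersection, (inv[h] for h in vectors))
    return bool(common)

inv = inverse_sets(3, 3)
print(len(inv[(0, 0)]))
print(is_noncovering([(2, 0), (2, 1), (2, 2)], inv))
print(is_noncovering([(0, 1), (2, 2), (2, 1)], inv))
```

Caching the intersections for pairs of vectors, as the text describes, would roughly halve the number of set intersections per test at the cost of extra memory.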

Fig. 3. A WCA(4; 8, 4).


For the fastest check, we simply store a single bit for every possible set of permutation vectors: 1 if it is covering, and 0 if it is not. This permits a constant time check, but requires $O((v^{t-1})^t)$ space, which is prohibitive for large t or v. We use this check when t = 3 and when t = 4 and $v \le 4$.
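A bit table of this kind can be sketched with a `bytearray`. The indexing scheme (each tuple of t symbols read as a number in base $v^{t-1}$), the brute-force fill and the names `build_covering_table` and `lookup` are assumptions of this illustration; v is again assumed prime.

```python
from itertools import product

def build_covering_table(v, t):
    """One bit per ordered t-tuple of symbols: 1 if covering, 0 if not."""
    symbols = list(product(range(v), repeat=t - 1))
    index = {h: n for n, h in enumerate(symbols)}
    base = len(symbols)                       # v**(t-1) symbols
    bits = bytearray((base ** t + 7) // 8)    # (v**(t-1))**t bits in total
    deltas = [d for d in product(range(v), repeat=t) if any(d)]
    for combo in product(symbols, repeat=t):
        noncovering = any(
            all((d[0] + sum(h[r - 1] * d[r] for r in range(1, t))) % v == 0
                for h in combo)
            for d in deltas
        )
        if not noncovering:
            pos = 0
            for h in combo:
                pos = pos * base + index[h]
            bits[pos >> 3] |= 1 << (pos & 7)
    return bits, index, base

def lookup(vectors, table):
    """Constant-time covering check against the precomputed bits."""
    bits, index, base = table
    pos = 0
    for h in vectors:
        pos = pos * base + index[h]
    return bool((bits[pos >> 3] >> (pos & 7)) & 1)

table = build_covering_table(3, 3)
print(len(table[0]), lookup([(0, 1), (2, 2), (2, 1)], table))
```

For v = 3 and t = 3 the table needs only $9^3 = 729$ bits, which illustrates why this check is attractive for small parameters and infeasible for large ones.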

4.3. Related search problems

The method developed for CPHF search can be applied to many similar search problems. Let $\mathcal{C} = \{C_i : i = 1, \ldots, \ell\}$ be a set of subsets of tuples of the same length over an alphabet of size v. Let the length of the tuples in set $C_i$ be denoted $t_i$. A $\mathcal{C}$-(N, k)-array is an N × k array with entries from the same alphabet of size v, in which every N × t subarray has the property that for every i with $1 \le i \le \ell$, there exists a row of the subarray equal to a t-tuple in $C_i$. No assumption is made that $C_i$ and $C_j$ are disjoint, nor that $t_i = t_j$, nor that a given tuple appears in any of the sets. We also define a $\mathcal{C}^*$-array to have the same property when we only consider N × t subarrays where the columns maintain their original ordering. Any $\mathcal{C}$-array can be formulated as a $\mathcal{C}^*$-array; however, the converse is not necessarily true.

Taking $\mathcal{C}$ to be the set of $v^t$ singleton sets each containing a distinct t-tuple, a $\mathcal{C}^*$-array is a covering array. If we consider all groups of $C_i$ whose elements are merely permutations of each other, and keep only one of each in $\mathcal{C}$, a $\mathcal{C}$-array is a covering array. Taking $\mathcal{C}$ to have a single set in which t-tuples with distinct entries appear makes a $\mathcal{C}^*$-array a perfect hash family, and restricting the set to contain only the covering t-tuples makes a $\mathcal{C}^*$-array a CPHF. In general, the description of $\mathcal{C}$ is not an explicit listing of tuples; rather an oracle to test membership of a tuple in $C_i$ is assumed.

The generic method of search has been used to find perfect hash families in Walker and Colbourn (2007). The search method can find a $\mathcal{C}^*$-array for any set $\mathcal{C}$, but we optimized it for the case when $|\mathcal{C}| = 1$. To demonstrate the generic nature of this method, set $\mathcal{C}$ to be a single set $C_1$ consisting of common 4-letter English words. We refer to this as a WCA: word covering array of strength 4, and Fig. 3 gives an example.

While the practical implications of such an array are likely non-existent, it is an effective demonstration of the generic nature of the algorithm. It is also interesting to see the kind of patterns “utilized” by the resulting array, since many of these same patterns probably appear less visibly in CPHFs.
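The oracle-based formulation for a single set $C_1$ can be sketched as a score function that takes the membership test as a parameter. Everything concrete below is hypothetical and chosen only to keep the example small: the function name `generic_score`, the three-letter word list, and the tiny arrays are not taken from the paper or from Fig. 3.

```python
from itertools import combinations

def generic_score(array, t, oracle):
    """Count the selections of t columns (in their original order) for which
    no row of the array yields a tuple accepted by `oracle`."""
    k = len(array[0])
    return sum(
        1 for cols in combinations(range(k), t)
        if not any(oracle(tuple(row[c] for c in cols)) for row in array)
    )

WORDS = {"car", "cat", "art"}  # hypothetical word list, strength 3 for brevity

def is_word(tup):
    return "".join(tup) in WORDS

def all_distinct(tup):
    """Oracle for a perfect hash family: the t entries are pairwise distinct."""
    return len(set(tup)) == len(tup)

print(generic_score(["cart"], 3, is_word))          # columns (0, 2, 3) give "crt"
print(generic_score(["cart", "coat"], 3, is_word))  # "coat" supplies "cat" there
print(generic_score(["aabc"], 3, all_distinct))
```

Swapping the oracle is the only change needed to move between problems, which is the sense in which the search is generic.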

5. Results

Given n, v and t, the maximum value of k for which a CPHF was found is shown in Tables 1–5. Items marked with an asterisk appeared in Sherwood et al. (2006). All others were created with the method in this paper.

Most of the arrays in these tables were found by a 2.66 GHz Pentium 4 in less than 10 min. Some arrays with k columns required more than a day of search; however, finding an array with k − 1 columns in these cases always took less than an hour. The most difficult arrays to find generally are those listed at the borders of the table, i.e. those with the highest k for a given v and t. This difficulty is based on the added size of the search space and the additional memory needed to process the non-covering condition.

Explicit solutions appear in Walker (2005). We give one array of each strength in Table 6.

5.1. Analysis and reduction

In order to assess the quality of the covering arrays generated, we implemented an analysis program that tests for flexibility in the array. One form of flexibility is a “don't-care” position, a position whose value can be changed without affecting the covering property. Of the 24 strength 3 arrays generated, only 3 have don't-care positions, and they occur in fewer than 1% of the positions. For strength 4, 5 arrays of 25 have don't-care positions, making up roughly 4.4% of the positions. For strength 5, 5 of 12 arrays have don't-care positions, making up 4.5% of the positions. For strength 6, 7 of 9 arrays have flexible positions making up 8.7% of the positions. Finally, for strength 7, all 6 arrays have flexible positions making up 4.7% of the positions. The larger proportion of don't-care positions as strength increases appears to be due to the additional complexity of searching for higher-strength arrays, in turn suggesting that better arrays are more likely to exist in these cases.

Table 1
Table for k, given n and v, where t = 3

v \ n     2     3     4     5     6     7
3        10    22    37    57    89   142
4        16∗   34    64   118   222
5        24∗   48    95   160
7        32    81   150
8        40∗   91   200
9        41   113   225
11       50   146
13       59   200

Table 2
Table for k, given n and v, where t = 4

v \ n     2     3     4     5     6     7
2         8∗    9    12
3        10∗   16∗   23    30    39    50
4        13    20    31    42
5        15    24    35    62
7        18    30    54
8        20    36
9        21    39
11       23
13       24

Table 3
Table for k, given n and v, where t = 5

v \ n     2     3     4     5     6
2        10    12    14    16    17
3        10    13    16    19
4        11    15

Table 4
Table for k, given n and v, where t = 6

v \ n     2     3     4     5     6
2         7     9    11    12    14
3        10    12    14
4        11

Table 5
Table for k, given n and v, where t = 7

v \ n     2     3     4     5     6
2         9    11    12    12    13
3        12    12    13

A higher degree of flexibility is indicated by the presence of a redundant row. This does not happen for any of the strength 3 arrays. However, it happens once for strength 4, twice for strength 5, six times for strength 6, and five times for strength 7. This again shows a strong trend of higher quality for lower strength. Removing these extra rows produces the following improved arrays:

CA(54; 4, 12, 2) CA(110; 5, 14, 2) CA(176; 5, 17, 2) CA(96; 6, 7, 2)

CA(160; 6, 9, 2) CA(232; 6, 11, 2) CA(272; 6, 12, 2) CA(322; 6, 14, 2)

CA(1449; 6, 10, 3) CA(224; 7, 9, 2) CA(364; 7, 11, 2) CA(452; 7, 12, 2)

CA(572; 7, 13, 2) CA(4347; 7, 12, 3)
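The two kinds of flexibility can be detected by counting how often each t-tuple is covered. The analysis program itself is not described in detail, so the sketch below is an assumption-laden illustration that works on an explicit covering array: a position is reported as don't-care when every tuple it contributes to is also covered by another row, and a row is reported as redundant when this holds for all of its tuples. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def coverage_counts(rows, t):
    """For every set of t columns, count how many rows produce each t-tuple."""
    k = len(rows[0])
    return {cols: Counter(tuple(r[c] for c in cols) for r in rows)
            for cols in combinations(range(k), t)}

def redundant_rows(rows, t):
    """Rows that can be removed on their own without losing any t-tuple."""
    counts = coverage_counts(rows, t)
    return [i for i, r in enumerate(rows)
            if all(cnt[tuple(r[c] for c in cols)] > 1 for cols, cnt in counts.items())]

def dont_care_positions(rows, t):
    """Positions (row, column) whose value can change without losing coverage."""
    counts = coverage_counts(rows, t)
    return [(i, col)
            for i, r in enumerate(rows)
            for col in range(len(r))
            if all(cnt[tuple(r[c] for c in cols)] > 1
                   for cols, cnt in counts.items() if col in cols)]

# A strength-2 binary array on two columns with the row (0, 0) repeated.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0)]
print(redundant_rows(rows, 2))
print(dont_care_positions(rows, 2))
```

Note that rows reported as redundant are removable one at a time; removing several at once requires rechecking coverage after each removal.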

5.2. Recursive bound improvements

The results found are often the best known covering arrays with small k. Thus they have the additional value of being excellent ingredients to recursive constructions. Establishing these claims is by no means straightforward, since to the best of our knowledge no one has tabulated values of CAN(5, k, v). We therefore developed a Maple environment to incorporate all of the known direct and recursive constructions for covering arrays of strengths 2–5. For strengths 3 and 4, these tables and the known constructions appear in Colbourn et al. (2006). For strength 5, we use the Roux-type constructions of Martirosyan and Colbourn (2005) and Martirosyan and Van Trung (2004), the perfect hash family construction (Colbourn et al., 2006; Martirosyan and Van Trung, 2004), the Turán squaring construction of Hartman (2005), as well as direct constructions. The direct constructions arise from orthogonal arrays (Hedayat et al., 1999) and other computational search techniques (Cohen, 2004, 2005; Cohen et al., 2008; Nurmela, 2004).

Table 6
An example for each strength

CPHF(2; 16, 4, 3)
32 20 13 10 21 22 11 02 23 01 00 12 03 30 33 31
00 23 11 12 02 13 30 03 10 01 32 31 22 33 20 21

CPHF(2; 10, 3, 4)
122 100 020 202 112 222 201 000 212 120
112 100 101 221 122 200 021 201 120 011

CPHF(3; 13, 3, 5)
1001 1111 1002 0110 1012 1201 0000 2112 0210 2101 1110 2020 0221
0010 0101 2001 2200 0200 0121 0211 1220 2222 2112 1111 1112 0102
1021 2110 2010 0102 0012 2122 2111 0112 1212 0110 1022 0021 1001

CPHF(3; 9, 2, 6)
00101 00000 01010 10101 01100 00110 10000 10100 10001
11001 10100 10101 00000 11101 01111 11111 01011 00101
10010 01001 01100 01011 01000 01110 11110 11010 10001

CPHF(2; 9, 2, 7)
110000 101101 111101 100100 000111 101010 000010 101110 111011
100000 111010 010001 110111 001111 111111 000001 111100 111011

Table 7
Bounds on CAN(5, k, v) for $2 \le v \le 9$

v \ k        10       25       50      100      250      500     1000     2500     5000    10000
2            62      287      392      644     1007     1464     2047     2744     3038     3926
            112      347      392      674     1037     1494     2077     2744     3614     4502
3           483     2007     3765     5997     8667    12969    18369    28491    36651    42147
            999     2689     4113     6749    10517    15247    21103    31994    41278    51471
4          2044     7396    13656    20440    41245    57663    78713   110443   143562   153132
           3456    12736    19776    29354    46561    64593    86765   122088   156833   195258
5          9875    31735    49325    71095   113089   156847   210737   303763   387913   454755
           9875    37259    55809    82603   126789   175571   235461   334279   426269   494739
6         24474    78876   133942   202550   346098   492420   692610  1002556  1284624  1441860
          24474    82196   149102   226140   366853   520135   729525  1074301  1364799  1460130
7         48363   104433   179417   268765   461015   648259   899415  1334711  1686031  2146675
          48363   114729   212429   311893   517157   712753   985653  1479395  1840831  2319151
8         59048   191112   332696   520977   806834  1124979  1509029  2296905  2951596  3703209
          59048   212623   400904   589185   896553  1280603  1664653  2463183  3156290  3907903
9         59049   326925   575127   910945  1410635  1981605  2677055  3784425  4891283  6159485
          59049   367757   703575  1039393  1579915  2275365  2970815  4047337  5315539  6583741

Now to illustrate the impact of the new direct constructions, we present the best-known values for CAN(5, k, v) before and after the CPHF constructions are included.

In Table 7, two upper bounds on CAN(5, k, v) are given for $2 \le v \le 9$ and

$$k \in \{10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000\}.$$


Table 8
Bounds for $\kappa(N; 5, 2)$

6 32o 8 56u 10 62v

12 92v 14 110v 16 152v

17 176v 20 194j 24 261i

28 287i 32 330i 34 357j

40 375j 64 392q 81 434q

144 644q 162 734j 169 770q

176 1002j 192 1005j 224 1006j

252 1007j 256 1025j 280 1026j

288 1031j 320 1121j 324 1123j

338 1217j 361 1358q 378 1455j

384 1456j 448 1461j 504 1464j

512 1504j 560 1507j 576 1519j

640 1609j 648 1614j 676 1708j 722 1849j 756 1946j 768 1947j

800 1952j 896 2023j 924 2029j 1008 2047j 1024 2096j 1120 2102j

1152 2123j 1280 2213j 1296 2223j 1352 2317j 1408 2458j 1444 2461j

1512 2558j 1536 2559j 1600 2565j 1620 2706j 1792 2707j 1848 2717j

4096 2744q 6561 3038q 6864 3614j 8192 3632j 10000 3926j

The first bound given uses the CPHFs produced here in conjunction with the known recursive constructions, and all other directconstructions of which we are aware. The second entry gives the bound calculated in the same manner, but omitting the CPHFdirect constructions. Two things are striking. The impact of the “small” arrays produced on the recursions makes a substantialimprovement for v ∈ {2, 3, 4}. Perhaps more surprising is the improvements for 5�v�9, since these result from the use in therecursions of covering arrays of strengths 3 and 4. Since CPHF constructions improve the bounds on CAN(t, k, v) for t ∈ {3, 4},improvements arise for strength five as well.

We provide more detailed information in Tables 8–10 for v ∈ {2, 3, 4}, presenting the tables after the use of covering perfect hash families.

Let κ(N; t, v) be the largest k for which CAN(t, k, v) ≤ N. As k increases, for many consecutive numbers of factors (columns), the covering array number does not change. Therefore reporting those values of κ(N; t, v) for which κ(N; t, v) > κ(N−1; t, v), along with the corresponding value of N, enables one to determine all covering array numbers when k is no larger than the largest κ(N; t, v) value tabulated. Since the exact values for covering array numbers are unknown in general, we in fact report lower bounds on κ(N; t, v).
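As a rough illustration of how such breakpoints are read, the sketch below looks up an upper bound on CAN(5, k, 2) from a few (k, N) pairs copied from the first rows of Table 8. The function name `can_upper_bound` and the list layout are illustrative assumptions, not part of the authors' tooling.

```python
from bisect import bisect_left

# (k, N) breakpoints copied from the first rows of Table 8 (t = 5, v = 2):
# k is the largest number of factors known to be achievable with N rows.
BREAKPOINTS = [(6, 32), (8, 56), (10, 62), (12, 92), (14, 110), (16, 152)]


def can_upper_bound(k, breakpoints=BREAKPOINTS):
    """Return the tabulated upper bound on CAN(5, k, 2), or None if k is too large."""
    ks = [entry[0] for entry in breakpoints]
    i = bisect_left(ks, k)  # first breakpoint with at least k factors
    if i == len(ks):
        return None  # k exceeds the largest tabulated number of factors
    return breakpoints[i][1]


print(can_upper_bound(9))   # 62: delete one column of the 10-column array
print(can_upper_bound(16))  # 152
```

The lookup relies only on the fact that deleting columns from a covering array leaves a covering array, so the first breakpoint with at least k factors supplies the bound for k.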

For each strength in turn, explicit constructions of covering arrays from direct and computational constructions are tabulated. Then each known construction is applied and its consequences tabulated (in the process, results implied by this for fewer factors are suppressed, so that one explanation (“authority”) for each entry is maintained). Application of the recursions is repeated until no entries in the table improve.
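The closure procedure just described can be sketched as a fixed-point loop. The code below is a minimal sketch under stated assumptions: entries are (k, N, authority) triples, `toy_doubling` is a made-up placeholder rule (it is NOT one of the Roux-type or squaring constructions cited here), the seed values are invented, and `k_max` is a hypothetical cap that keeps the table finite.

```python
def prune(table):
    """Drop every entry implied by another one with at least as many factors
    and no more rows, so each surviving entry has a single authority."""
    kept = []
    # Scan by decreasing k; an entry survives only if it uses strictly fewer
    # rows than every entry already kept (all of which have >= k factors).
    for k, n, why in sorted(table, key=lambda e: (-e[0], e[1])):
        if not kept or n < kept[-1][1]:
            kept.append((k, n, why))
    return sorted(kept)


def close(table, rules, k_max):
    """Apply every rule to every entry until the pruned table stops changing."""
    table = prune(table)
    while True:
        candidates = list(table)  # existing entries first, so they win ties
        for rule in rules:
            for entry in table:
                candidates.extend(e for e in rule(entry) if e[0] <= k_max)
        new_table = prune(candidates)
        if new_table == table:
            return table
        table = new_table


def toy_doubling(entry):
    """Hypothetical rule for illustration only: double both k and N."""
    k, n, _ = entry
    return [(2 * k, 2 * n, "toy")]


# Invented seed entries, not values from the paper's tables.
seeds = [(4, 10, "seed"), (5, 12, "seed"), (7, 30, "seed")]
for row in close(seeds, [toy_doubling], k_max=20):
    print(row)
```

In this toy run the seed (7, 30) disappears because the derived entry (8, 20) covers more factors with fewer rows, which mirrors the suppression of implied results described above.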

The authorities used are:

h  perfect hash family (Martirosyan and Van Trung, 2004)
i  Roux-type doubling (Martirosyan and Van Trung, 2004)
j  Roux-type doubling (Martirosyan and Colbourn, 2005)
o  orthogonal array (Hedayat et al., 1999)
q  Turán squaring (Hartman, 2005)
s  simulated annealing (Cohen, 2004, 2005)
u  miscellaneous direct construction
v  permutation vector (this paper)
z  composition
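For readers who wish to process the tables mechanically, the following sketch splits one table row into (k, N, authority) records using the letter codes above. The regular expression and the name `parse_row` are assumptions made for illustration; the row shown is copied from Table 8.

```python
import re

# Authority codes as listed in the text.
AUTHORITIES = {
    "h": "perfect hash family",
    "i": "Roux-type doubling (Martirosyan and Van Trung, 2004)",
    "j": "Roux-type doubling (Martirosyan and Colbourn, 2005)",
    "o": "orthogonal array",
    "q": "Turán squaring",
    "s": "simulated annealing",
    "u": "miscellaneous direct construction",
    "v": "permutation vector",
    "z": "composition",
}

# One entry is "k N<letter>", for example "20 194j".
ENTRY = re.compile(r"(\d+)\s+(\d+)([hijoqsuvz])")


def parse_row(row):
    """Split a table row into (k, N, authority description) triples."""
    return [(int(k), int(n), AUTHORITIES[a]) for k, n, a in ENTRY.findall(row)]


for k, n, source in parse_row("17 176v 20 194j 24 261i"):
    print(k, n, source)
```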

For each v, we tabulate the entries for N and κ(N; 5, v). We also provide a plot showing the logarithm of the number of factors horizontally and the size of the covering array vertically. The plot simply demonstrates the growth rate, and the computed bound is that provided by the points tabulated.

Note Added: Since this research was completed, three new methods have been proposed for the construction of covering arrays of higher strength: FireEye (Lei et al., 2007), density (Bryce and Colbourn, 2007), and PaintBall (Kuhn, 2006). Each is a heuristic method, and each results in some improvements to the tables presented.

Table 9. Bounds for κ(N; 5, 3). Each entry gives k, then N followed by the letter of its authority.

6 243o 7 377s 8 457s

10 483v 13 723v 16 963v

19 1203v 20 1530i 26 2007j

32 2247j 38 2643j 40 2970j

46 3555j 48 3711j 52 3765j

60 4005j 64 4113q 81 4347q

86 5733j 92 5787j 96 5943j

102 5997j 104 6335j 169 6507q

256 8667q 258 9945j 264 9949j

268 9977j 276 10031j 288 10035j

306 10051j 308 10299j 312 10301j

322 10377j 324 10393j 338 10413j

361 10827q 388 12661j 396 12715j 414 12771j 432 12929j 444 12937j

450 12949j 456 12957j 468 12965j 512 12969j 516 14429j 522 14433j

528 14449j 536 14477j 540 14531j 552 14661j 576 14669j 594 14713j

600 14729j 612 14745j 616 15059j 624 15061j 644 15137j 648 15153j

666 15181j 676 15205j 684 15619j 702 15635j 704 15671j 722 15755j

736 17589j 738 17657j 756 17665j 774 17673j 776 17705j 788 17759j

792 17867j 800 17947j 828 17959j 840 18117j 852 18137j 864 18161j

888 18197j 894 18209j 900 18225j 912 18233j 918 18241j 936 18285j

948 18289j 968 18343j 1024 18369j 1032 19829j 1044 19833j 1056 19849j

1058 19877j 1072 20153j 1080 20207j 1104 20337j 1134 20345j 1152 20451j

1176 20495j 1188 20499j 1200 20653j 1224 20669j 1232 20983j 1242 20985j

1248 21155j 1288 21231j 1296 21247j 1332 21291j 1350 21343j 1352 21359j

1368 21773j 1404 21797j 1408 21837j 1428 21921j 1444 21975j 1458 23809j

1472 23897j 1476 23965j 1512 23973j 1536 23981j 1548 24117j 1552 24149j

1576 24203j 1584 24311j 1600 24391j 1656 24403j 1680 24561j 1704 24581j

1708 24605j 1728 24659j 1776 24695j 1788 24707j 1800 24723j 1824 25031j

1836 25039j 1872 25153j 1896 25157j 1936 25211j 1944 25237j 1998 25253j

2048 25289j 2052 26749j 2064 26773j 2088 26777j 2106 26793j 2112 26833j

2116 26897j 2144 27221j 2160 27275j 2208 27409j 2214 27485j 2268 27497j

2272 27607j 2304 27615j 2322 27659j 2352 27703j 2376 27707j 2400 27893j

2448 27913j 2464 28243j 2484 28245j 2496 28415j 2520 28491j 2556 28499j

2576 28539j 2592 28555j 2640 28627j 2664 28647j 2700 28699j 2704 28715j

2736 29129j 2754 29153j 2808 29225j 2816 29265j 2856 29349j 2880 29403j

2888 29409j 2916 31243j 2944 31331j 2948 31399j 2952 31507j 3024 31515j

3042 31523j 3072 31555j 3096 31691j 3104 31723j 3132 31777j 3152 31781j

3168 31889j 3174 31985j 3200 32245j 3240 32257j 3312 32277j 3360 32435j

3402 32455j 3408 32497j 3416 32521j 3456 32575j 3510 32611j 3528 32635j

3552 32687j 3564 32699j 3576 32759j 3592 32775j 3600 32829j 3648 33137j

3672 33145j 3744 33259j 3792 33263j 3872 33317j 3888 33343j 3996 33359j

4096 33395j 4104 34855j 4128 34879j 4176 34883j 4212 34899j 4224 34939j

4232 35003j 4288 35327j 4320 35381j 4416 35515j 4428 35591j 4536 35603j

4544 35713j 4608 35721j 4644 35765j 4704 35809j 4728 35813j 4752 35867j

4800 36053j 4896 36073j 4928 36403j 4968 36405j 4992 36575j 5040 36651j

Table 10. Bounds for κ(N; 5, 4). Each entry gives k, then N followed by the letter of its authority.

6 1024o 11 2044v 15 3064v

16 5856j 20 6064j 22 6168j

24 7292j 26 7396j 28 8152j

30 8256j 32 11048j 40 11256j

44 12116j 48 13240j 52 13656j

56 14412j 58 14516j 60 14620j

62 17516j 64 18272j 68 18480j

76 18584j 80 18688j 84 19652j

121 20440q 124 27242j 128 27998j

130 28206j 136 28893j 138 29048j

152 29152j 160 29274j 169 30640q

170 31836j 190 31932j 192 32067j

200 32142j 216 32523j 224 32568j 232 32613j 240 32970j 242 33015j

248 39817j 250 41245j 256 41269j 260 41582j 272 42269j 276 42484j

280 42588j 288 42737j 290 42841j 300 42886j 304 42931j 310 43053j

320 43290j 328 44764j 336 44868j 338 45369j 340 46565j 352 46769j

368 47180j 372 47284j 380 47341j 384 47476j 400 47655j 416 48111j

420 48180j 432 48801j 448 48846j 464 48891j 472 49248j 480 49352j

484 49433j 496 56235j 500 57663j 512 57687j 520 58495j 528 59808j

544 59976j 552 60191j 560 60295j 568 60444j 576 60548j 580 60652j

600 60697j 608 60778j 620 60900j 640 61137j 656 63343j 660 63447j

664 63777j 672 63881j 676 64382j 680 65578j 682 65782j 704 65857j

722 66268j 736 66517j 744 66621j 760 66678j 768 66948j 800 67229j

832 68066j 840 68135j 864 68756j 896 68846j 928 68936j 944 69338j

960 69442j 968 69811j 992 76613j 1000 78713j 1024 78761j 1040 79674j

1056 80987j 1088 81155j 1104 81727j 1120 81831j 1136 82337j 1152 82441j

1160 82545j 1200 82635j 1216 82761j 1240 82883j 1280 83357j 1312 85671j

1320 85775j 1328 86105j 1344 86209j 1352 87211j 1360 88511j 1364 88823j

1392 88898j 1408 89002j 1444 89824j 1472 90073j 1488 90177j 1520 90291j

1536 90561j 1584 90842j 1600 90946j 1664 91858j 1680 91996j 1728 93004j

1776 93094j 1792 93198j 1856 93288j 1888 93794j 1920 93898j 1922 94267j

1936 94942j 1984 101744j 2000 103844j 2048 103892j 2080 105390j 2100 106772j

2112 107393j 2176 107561j 2208 108133j 2220 108237j 2240 108741j 2272 109247j

2280 109351j 2304 109455j 2320 109559j 2360 109649j 2400 109685j 2432 109847j

2480 109969j 2512 110443j 2560 110547j 2584 113959j 2624 114100j 2640 114204j

2656 114786j 2664 114890j 2688 114968j 2704 115970j 2720 117270j 2728 117582j

2744 117687j 2784 117791j 2816 117895j 2888 118717j 2944 119329j 2976 119478j

3040 119592j 3072 119865j 3168 120146j 3200 120250j 3208 121162j 3328 121266j

3360 121404j 3362 122412j 3456 123222j 3496 123312j 3552 123450j 3584 123554j

3712 123689j 3776 124321j 3800 124425j 3840 124470j 3844 125352j 3872 126027j

3936 132829j 3968 132874j 4000 135646j 4072 135718j 4096 135724j 4160 137282j

4200 138664j 4224 139285j 4256 139453j 4352 139543j 4370 140526j 4416 140619j

4440 140723j 4480 141227j 4544 141778j 4560 141882j 4608 141986j 4640 142090j

4720 142333j 4750 142369j 4800 142414j 4864 142729j 4960 142851j 5024 143562j

6. Conclusions

By utilizing the compact search space afforded by covering perfect hash families, tabu search is able to find smaller arrays for higher strength more efficiently. This efficient representation of a covering array enables searches for t ≥ 5. The resulting arrays for small k improve on the best known bounds for many larger k when they are used in recursive constructions.

Acknowledgments

We thank Sosina Martirosyan for helpful discussions regarding the non-covering conditions. We also thank the referees for improving the presentation.

References

Bryce, R.C., Colbourn, C.J., 2007. A density-based greedy algorithm for higher strength covering arrays. Software Testing Verif. Reliab., to appear.
Chateauneuf, M., Kreher, D., 2002. On the state of strength-three covering arrays. J. Combin. Des. 10 (4), 217–238.
Cohen, M.B., 2004. Designing test suites for software interaction testing. Ph.D. Thesis, University of Auckland.
Cohen, M.B., 2005. Private communications.
Cohen, D.M., Dalal, S.R., Fredman, M.L., Patton, G.C., 1997. The AETG system: an approach to testing based on combinatorial design. IEEE Trans. Software Eng. 23 (7), 437–444.
Cohen, M.B., Colbourn, C.J., Ling, A.C.H., 2008. Constructing strength 3 covering arrays with augmented annealing. Discrete Math. 308, 2709–2722.
Colbourn, C.J., 2004. Combinatorial aspects of covering arrays. Le Matematiche (Catania) 58, 121–167.
Colbourn, C.J., Martirosyan, S.S., Van Trung, T., Walker II, R.A., 2006. Roux-type constructions for covering arrays of strengths three and four. Designs Codes Crypt. 41, 33–57.
Fleurant, C., Ferland, J.A., 1996. Genetic and hybrid algorithms for graph coloring. Ann. Oper. Res. 63, 437–461.
Glover, F., Laguna, M., 1997. Tabu Search. Kluwer Academic Publishers, Norwell, MA.
Hartman, A., 2005. Software and hardware testing using combinatorial covering suites. In: Golumbic, M.C., Hartman, I.B.-A. (Eds.), Interdisciplinary Applications of Graph Theory, Combinatorics, and Algorithms. Springer, Norwell, MA, pp. 237–266.
Hartman, A., Raskin, L., 2004. Problems and algorithms for covering arrays. Discrete Math. 284, 149–156.
Hedayat, A.S., Sloane, N.J.A., Stufken, J., 1999. Orthogonal Arrays, Theory and Applications. Springer, Berlin.
Kuhn, D.R., 2006. An algorithm for generating very large covering arrays, Internal Report 7308, National Institute of Standards and Technology.
Lei, Y., Kacker, R., Kuhn, D.R., Okun, V., Lawrence, J., 2007. IPOG: a general strategy for t-way software testing, submitted for publication.
Martirosyan, S.S., Colbourn, C.J., 2005. Recursive constructions for covering arrays. Bayreuth. Math. Schr. 74, 266–275.
Martirosyan, S.S., Van Trung, T., 2004. On t-covering arrays. Designs Codes Crypt. 32, 323–339.
Nurmela, K., 2004. Upper bounds for covering arrays by tabu search. Discrete Appl. Math. 138, 143–152.
Shasha, D.E., Kouranov, A.Y., Lejay, L.V., Chou, M.F., Coruzzi, G.M., 2001. Using combinatorial design to study regulation by multiple input signals: a tool for parsimony in the post-genomics era. Plant Physiology 127, 1590–1594.
Sherwood, G., Martirosyan, S.S., Colbourn, C.J., 2006. Covering arrays of higher strength from permutation vectors. J. Combin. Design 14, 202–213.
Sloane, N.J.A., 1993. Covering arrays and intersecting codes. J. Combin. Design 1, 51–63.
Walker, R.A. II, 2005. Covering Arrays and Perfect Hash Families, Ph.D. Thesis, Computer Science and Engineering, Arizona State University.
Walker, R.A. II, Colbourn, C.J., 2007. Perfect hash families: constructions and existence. J. Math. Crypt. 1, 125–150.