

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING
Int. J. Numer. Meth. Engng 2001; 50:665–680

A tabu-search-based algorithm for continuous multiminima problems

Francesco Franzé* and Nicolò Speciale
DEIS, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy

SUMMARY

Tabu search algorithms are becoming very popular in the operational research community. Many works and studies have been carried out since the first presentation by Glover. The development of tabu search techniques concerns in almost all cases combinatorial problems, and we found very few papers about continuous problems. In this work we briefly classify and describe the main continuous approaches to tabu search; we then present a novel algorithm which explores a grid of points with a dynamically defined spacing: it collapses to a local minimum, then continues the search from that point, accepting some non-improving points to allow the exploration of new regions of the domain. The proposed algorithm is deterministic, with a small random component triggered only when loop conditions are detected, and it contains a simple vocabulary-building mechanism and a diversification procedure. Finally, we show some comparisons with other optimization algorithms and a possible application of this method to an engineering problem. Copyright © 2001 John Wiley & Sons, Ltd.

KEY WORDS: tabu search; optimization; parameter extraction

1. INTRODUCTION

Tabu search [1, 2] is not exactly an optimization algorithm, but a collection of guidelines for developing optimization algorithms (this is the reason we call it a 'metaheuristic'). Formally, a general optimization problem can be stated as

min c(x)
x ∈ A ⊆ X

where c : X → R is the cost function to minimize, A is a set of constraints and X is the domain of the problem.

The basic idea of tabu search is to explore trial solutions for the problem, moving from a point to another in its neighbourhood, which is composed of solutions that differ only slightly from the current one. Reverse moves and cycles are avoided by the use of a 'tabu list' in which previously performed moves are memorized.
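To make these mechanics concrete, the loop just described can be sketched in a few lines (our own minimal illustration, not the paper's algorithm; the grid neighbourhood, quadratic cost, step and tenure used here are hypothetical choices):

```python
# Minimal generic tabu search loop (illustrative sketch, not the paper's code).
# A state is a tuple of coordinates; a move is (coordinate index, signed step).

def reverse(move):
    """Reverse of a move: the same coordinate, opposite step."""
    i, step = move
    return (i, -step)

def grid_neighbours(x, delta=0.1):
    """Yield (move, state) pairs: +/- delta along each coordinate."""
    for i in range(len(x)):
        for step in (delta, -delta):
            yield (i, step), tuple(v + step if j == i else v
                                   for j, v in enumerate(x))

def tabu_search(cost, start, neighbours, max_iter=100, tenure=5):
    current = best = start
    tabu = []  # recently accepted moves; their reverses are forbidden
    for _ in range(max_iter):
        candidates = [(cost(s), m, s) for m, s in neighbours(current)
                      if reverse(m) not in tabu]
        if not candidates:
            break
        _, move, current = min(candidates)  # best non-tabu neighbour,
                                            # accepted even if worse
        tabu.append(move)
        if len(tabu) > tenure:              # a move stays tabu for `tenure`
            tabu.pop(0)                     # iterations, then is forgotten
        if cost(current) < cost(best):
            best = current
    return best
```

Starting from (0.5, 0.5) on a toy quadratic cost, this loop walks the grid down to the origin; the tabu list only prevents immediately undoing a recent move.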

* Correspondence to: Francesco Franzé, DEIS, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy

Received 27 July 1999; Revised 6 January 2000
Copyright © 2001 John Wiley & Sons, Ltd.


Figure 1. A generic tabu search algorithm.

A tabu search algorithm should follow the general framework shown in Figure 1, where TABU represents the tabu list, Neighbourhood the neighbourhood of a given point, choose the choice method, TermCondition the termination conditions for the algorithm, and TabuUpdate the method for updating the tabu list.

Together with these major characteristics there are other features, such as aspiration conditions, diversification, intensification, strategic oscillation and vocabulary building, which allow the development of more sophisticated algorithms with better performance [3].

Tabu search techniques were initially developed for combinatorial problems, and their application to continuous domains has only been superficially studied. As is well known, the main difference between a combinatorial problem and a continuous one is the shape of the search domain X: in a combinatorial problem we have only a finite number of trial points (although exponentially many), whereas in a continuous problem the domain has infinite cardinality and is usually a subset of R^n; this fact forbids an exhaustive search over the whole domain.

The choice of moves is an important phase: we may have to cope with a neighbourhood of infinite cardinality, so some strategy (such as a candidate list [4]) is needed to make the search feasible.

In the literature we found several methods for applying tabu search to continuous problems, which can be summarised by the following three approaches:

• In the first approach the problem is discretized, and thus only a finite number of points is analysed during the search, as in Reference [5] (each point can be represented by a finite binary string). The level of discretization fixes the accuracy of the solution and generates a finite-cardinality problem. High accuracy has the drawback of generating finite problems that are very large and difficult to handle.

• The second approach is based on changing the level of abstraction of the problem: discrete subregions of decreasing size, instead of points, are considered as the elements of the problem. In this case the tabu search algorithm is used to find the most promising regions, which could contain the optimal point, and local optimization methods are then used to find the best point. An example of this kind of algorithm was presented in Reference [6].

• The third approach recalls the geometric interpretation of the problem: the neighbourhood of a point is the set of points 'near' it (given a metric). In this case scanning the whole neighbourhood is not possible, because it is infinite; usually a candidate-list strategy which performs a (pseudo-)random sampling of the neighbourhood is used. With this mechanism it is almost impossible to visit a previous point again, so the tabu list is organized as n-dimensional 'balls' with a given radius: a point is tabu if it falls inside a ball contained in the tabu list. The geometrical approach was initially presented in Reference [7], then proposed again in Reference [8], and further developed in Reference [9].
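A sketch of such a ball-based tabu list and of the candidate-list sampling might look as follows (our illustration; the class name, radius and tenure are hypothetical, not taken from References [7-9]):

```python
import math
import random

class BallTabuList:
    """Tabu list of n-dimensional balls (sketch of the geometric approach):
    a point is tabu if it falls inside any stored ball."""
    def __init__(self, radius, tenure):
        self.radius = radius
        self.tenure = tenure
        self.centres = []
    def add(self, point):
        self.centres.append(point)
        if len(self.centres) > self.tenure:
            self.centres.pop(0)     # the oldest ball expires
    def is_tabu(self, point):
        return any(math.dist(point, c) < self.radius for c in self.centres)

def sample_neighbourhood(x, step, n_samples, rng):
    """Candidate-list strategy: (pseudo-)random sampling around x."""
    return [tuple(v + rng.uniform(-step, step) for v in x)
            for _ in range(n_samples)]
```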

The proposed algorithm is quite different from these. It uses the idea of discretizing the search domain, but the discretization is not fixed a priori, because the grid spacing which represents the domain can change dynamically.

The detailed description of our algorithm is introduced in Section 2 and further explained in Section 3. In Section 4 the obtained results and a comparison with other algorithms are presented, in Section 5 the parameters and the tuning strategies are described, and finally in Section 6 a possible application of this optimization algorithm to a real engineering problem is proposed.

2. DESCRIPTION OF THE DOPE ALGORITHM

DOPE is a new algorithm based on pattern search and tabu search. We investigate the performance of tabu search in finding the global minimum of multidimensional problems with variables spanning continuous ranges. Our final purpose is to apply it to the specific electronic engineering problem of device parameter extraction. For this problem we have to minimize a multidimensional function with several local minima. Furthermore, its analytical form is unknown (hence the gradient and the Hessian matrix are not available) and its evaluation has a high computational cost.

The development of tabu search techniques concerns in almost all cases combinatorial problems, so we thought the first thing to do was to find a combinatorial problem which could describe the initial one. A common idea is to discretize the continuous space by considering only a finite grid of points at a given distance. This distance is a critical parameter because it defines the accuracy with which the minimum point is located. Better accuracy requires a smaller distance, but a smaller distance implies a grid with a larger number of points. Moreover, the number of points to examine grows exponentially with the number of variables of the function to optimize: we will call this number the 'dimension' of the problem. Given these premises, it is easy to see that a trivial algorithm which moves from a point to an adjacent one on the grid looking for the optimum will be very inefficient.

We will describe our algorithm as the result of successive refinements applied to a simple tabu search algorithm like that presented in Figure 1, and we will introduce some parametric values identified by verbatim style. These 'algorithm parameters' can affect the performance and can be tuned by the user for specific cases, as described in Section 5.

In every tabu search algorithm we have to define states, moves (and thus the neighbourhood), the tabu list, a choice procedure and a termination condition. In this 'simple tabu search algorithm' a state is a point of the multidimensional domain of variables, hence we make a move by passing from a point to an adjacent one.

Before defining adjacency we have to define a distance function between two points

p1 = (x11, x12, ...),    p2 = (x21, x22, ...)

as

dist(p1, p2) = max_i |x1i − x2i|


Here δ is the minimal distance (hereafter the 'resolution') between two points on the grid; hence p1 and p2 are adjacent if dist(p1, p2) = δ.

The neighbourhood of a point is the set of its adjacent points; the size of the neighbourhood depends exponentially on the dimension of the problem. The choice function simply chooses the non-tabu move which leads to the best point of the neighbourhood, even if it may be worse than the current point. The tabu list is composed of the reverses of the moves previously accepted by the choice function, and a move becomes non-tabu after a given number of iterations defined by the parameter tenure. Finally, the termination condition is given either by a maximum number of function evaluations (indicated by the parameter maxfev) or when the minimum (if it is known a priori) is found. Since we operate with variables over a continuous domain, it is highly improbable to find the exact minimum, so we relax this requirement by stating that 'the minimum is found' when we find a point whose value differs from the optimum value by less than a given error (err).
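As an illustration, the distance, the adjacency test and the resulting neighbourhood can be sketched as follows (our own reading of the definitions above, with hypothetical helper names; note the 3^n − 1 neighbourhood size, which is the exponential growth mentioned in the text):

```python
from itertools import product

def dist(p1, p2):
    """Distance used in Section 2: maximum over coordinate differences."""
    return max(abs(a - b) for a, b in zip(p1, p2))

def adjacent(p1, p2, delta):
    """p1 and p2 are adjacent when dist(p1, p2) = delta (up to fp noise)."""
    return abs(dist(p1, p2) - delta) < 1e-12

def neighbourhood(p, delta):
    """All grid points at distance delta from p: each coordinate moves by
    -delta, 0 or +delta (not all zero), i.e. 3^n - 1 points."""
    return [tuple(a + s for a, s in zip(p, steps))
            for steps in product((-delta, 0.0, delta), repeat=len(p))
            if any(steps)]
```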

3. IMPROVING THE ALGORITHM

In the previous section a simple implementation of tabu search was described; it works properly, but its performance is not satisfactory. In this section we highlight the main sources of inefficiency and try to cope with them.

3.1. Variable move step

There is no good reason to move from a point to another with a step defined by the resolution (i.e. δ) between two points. We can start the search moving around with a step which is a multiple of the resolution: this allows the exploration of a larger space during the first iterations of the algorithm. Better points can be found in a small amount of time, and it is then possible to shrink the step to make more accurate moves around good points. In a tabu search interpretation we could say that the variability of the step size allows both diversification and intensification.

The value of the resolution is defined by the parameter res. A suitable res value should be chosen from the value of err: if we require a very small error on the final value, we also need the resolution of the points on the grid to be small enough.

3.2. Redefinition of neighbourhood

From geometry theory, a vector space can be spanned with a positive basis. A set of vectors S is a positive basis if each vector s of the space can be written as a positive linear combination of the basis vectors:

B(S) = { v_i | v_i ∈ S and ∀ s ∈ S ∃ c_i ≥ 0 : s = Σ_i c_i v_i }

By doing so, only 2N directions, and not 2^N, are examined; this is exploited by a simple redefinition of the distance function:

dist(p1, p2) = Σ_i |x1i − x2i|

This introduces the drawback that moving along the positive basis is inefficient when the minimum and the current point do not lie along the directions defined by the positive basis, because a very small step is then needed to locate the minimum. On the other hand, without this reduction the algorithm is unusable on high-dimensional problems: in an eight-dimensional problem the function would be evaluated 2^8 = 256 times at each iteration, without any certainty of finding an improving point.
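The 2N coordinate moves and the redefined distance can be sketched as follows (our illustration, with hypothetical helper names):

```python
def coordinate_moves(n, step):
    """The 2N positive-basis directions: +/- step along each axis,
    instead of the 2^N corner directions of the original neighbourhood."""
    moves = []
    for i in range(n):
        for sign in (step, -step):
            moves.append(tuple(sign if j == i else 0.0 for j in range(n)))
    return moves

def manhattan(p1, p2):
    """Redefined distance: with it, adjacency means a single +/- step
    along one coordinate axis."""
    return sum(abs(a - b) for a, b in zip(p1, p2))
```

With n = 8 this gives 16 moves per iteration instead of the 256 corner directions mentioned above.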

3.3. A humble choice procedure

Although the enumeration of the neighbourhood has a cost that is linear in the dimension, it is not cheap. Hence, to reduce the number of function evaluations at each iteration, we modify the choice procedure: instead of looking for the best move, we accept the first improving move, if one exists. This leads to a higher number of iterations, but by reducing the number of function evaluations at each iteration the total number of evaluations (which represents the real cost of the algorithm) is reduced. For example, if the analysis of the whole neighbourhood requires 10 function evaluations and we need five iterations to converge, with our mechanism we may need 10 iterations with an average of 3 function evaluations each, so we save (5 × 10) − (10 × 3) = 20 evaluations.

The choice procedure has another important feature: it starts the search from the last chosen direction, so the polling order of the directions is not fixed; it tries the last good direction first, on the assumption that a good direction will still be good in the next iteration.
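This first-improvement choice, with polling started from the last good direction, might be sketched as follows (a simplified reading of the procedure; the argument names are ours):

```python
def choose(cost, current, moves, last_dir, is_tabu):
    """First-improvement choice: poll the moves starting from the last
    chosen direction and accept the first non-tabu improving one."""
    f0 = cost(current)
    n = len(moves)
    for k in range(n):
        idx = (last_dir + k) % n   # start from the last good direction
        candidate = tuple(a + d for a, d in zip(current, moves[idx]))
        if not is_tabu(candidate) and cost(candidate) < f0:
            return idx, candidate
    return None, current           # no improving move in the neighbourhood
```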

3.4. Monotonic behaviour

Since we assume we are coping with fairly well-behaved functions having a reasonable number of local minima, it is more convenient to converge quickly to a local minimum and then move away to find other minima. This is implemented by reducing the step when no good point is found in the neighbourhood. It is quite easy to see that we collapse to a local minimum. Given the definition of the continuous neighbourhood of a point q:

S(q) = { p : ||p − q|| < δ }

a local minimum p̄ can be formally defined as

p̄ = q : f(q) < f(p) ∀ p ∈ S(q), p ≠ q

When the step is reduced to the predefined minimal resolution, we find a point p̄ whose neighbourhood is composed of points with greater values, so this point satisfies a discretized local minimum property (i.e. the local minimum property with the previously given definition of neighbourhood):

N(q) = { p : dist(p, q) ≤ res }

p̄ = q : f(q) < f(p) ∀ p ∈ N(q), p ≠ q

After finding the minimum, we continue from that point, reinitializing the step and allowing the acceptance of some non-improving moves (the number of accepted non-improving moves is fixed, but it could be made a parameter of the algorithm). The tabu list then forbids falling again onto the last minimum; to reinforce this behaviour we add a second tabu list (with tenure defined by the parameter tenure2) which 'remembers' the last accepted points, because the move tabu list does not fit well with a toroidal space like ours.

Copyright ? 2001 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2001; 50:665–680

Page 6: A tabu-search-based algorithm for continuous multiminima problems

670 F. FRANZ�E AND N. SPECIALE

Again, we can find a tabu search framework describing this strategy, named tabu thresholding, in which a value (the threshold) controls the switching between a monotonic phase and a non-improving phase.

The shrinking of the step is not always monotonic: if during the improving phase we find many consecutive improving moves, the step may have been too small, so we double it; as soon as non-improving points are found it is reduced again.
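One possible reading of this step-adaptation rule is sketched below. The halving factor and the streak threshold are our assumptions (the text only states that the step is doubled after many consecutive improving moves and reduced otherwise); the upper bound delta0 and the floor res follow the parameter descriptions of Section 5:

```python
def update_step(step, improving_streak, found_improving,
                delta0=0.25, res=1e-5, streak_limit=4):
    """Step adaptation sketch: shrink towards the resolution when no
    improving point is found; double (up to delta0) after a streak of
    improvements. streak_limit and the 0.5 factor are hypothetical."""
    if found_improving:
        if improving_streak >= streak_limit:
            step = min(2.0 * step, delta0)  # step was probably too small
        return step
    return max(0.5 * step, res)             # reaching res => local minimum
```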

3.5. Vocabulary building

Moving along the coordinate directions can produce long paths towards the minimum; alternatively, once some improving moves have been chosen, we can apply a clustering hypothesis, assuming that the next good moves will be similar to the previous ones. We store a 'sufficient' number of good moves to build new compound moves, which are tried before polling the coordinate directions.

It is possible to tune the number of allowed compound moves (indicated by the parameter comp) and the number of basic moves used to build a compound move (indicated by build); future work will investigate the influence of these parameters on algorithm performance in depth. The building of compound moves can be seen as the building of a trivial model of the function to minimize, but this technique is also a good example of tabu search vocabulary building, which allows the introduction of new types of moves built from basic ones. In our case the basic moves are the 2N coordinate moves, and a compound move is the composition of a prefixed number of basic moves.
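A minimal sketch of such a vocabulary-building store, assuming a compound move is the component-wise sum of the last build accepted basic moves (the exact composition rule is our assumption; the paper does not spell it out):

```python
from collections import deque

class CompoundMoves:
    """Vocabulary-building sketch: remember the last `build` accepted basic
    moves, combine them into a compound move, keep at most `comp` of them."""
    def __init__(self, build, comp):
        self.build = build
        self.recent = deque(maxlen=build)
        self.compounds = deque(maxlen=comp)  # oldest compound is dropped
    def record(self, move):
        self.recent.append(move)
        if len(self.recent) == self.build:
            # component-wise sum of the last `build` basic moves
            self.compounds.append(tuple(sum(c) for c in zip(*self.recent)))
            self.recent.clear()
    def candidates(self):
        return list(self.compounds)  # tried before the coordinate polling
```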

3.6. Diversification and reinitialization

During the search it is possible to visit a region near an old minimum. In this case we believe it likely that this minimum will be found again, so we set up an escape procedure inspired by diversification guidelines: when the algorithm falls in the basin of a visited minimum, it jumps to another point, taking the point symmetric to a local minimum, along a coordinate direction, with respect to the initial point of the search. Every time this escape sequence is triggered a different coordinate direction is chosen; when all directions are exhausted a different local minimum is selected.

If all local minima are exhausted, the search continues from the last point, and we are fairly well guaranteed that the algorithm will not fall on the same points, thanks to the tabu lists. If instead the algorithm cannot locate a new minimum, it will restart from the same last point several times, and in this case the tabu lists are of no help. To avoid cyclic behaviour a random component is introduced: when the algorithm restarts from a visited point 'enough' times (loop condition) we force a restart from a random point with a reinitialization procedure.

This reinitialization procedure is also triggered every time the algorithm, despite the diversification, falls into a previously seen minimum, failing to find a new one.

To sum up, we present two DOPE schemes: the former is the flow chart of the algorithm (Figure 2), while the latter is in C-like style (Figure 3):
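Our reading of the symmetric escape point can be sketched as follows (a hypothetical helper; the exact reflection rule in the authors' implementation may differ):

```python
def escape_point(minimum, start, direction):
    """Reflect a visited local minimum about the initial point of the
    search along one coordinate direction (diversification sketch)."""
    return tuple(2.0 * s - m if i == direction else m
                 for i, (m, s) in enumerate(zip(minimum, start)))
```

Each time the escape is triggered a different value of direction would be used, cycling through the coordinates before selecting another stored minimum.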

4. EXPERIMENTAL RESULTS

Figure 2. General flow chart of the DOPE algorithm.

Tabu search does not state strong requirements; it only gives some guidelines to develop algorithms, so it is very difficult to find general theoretical results, because the method is largely unspecified. We would like a theorem stating global convergence for tabu search algorithms, as exists for simulated annealing [10, 11]. As is well known, that approach is founded on the random sampling of the domain, since a random procedure will eventually sample the whole search space with a given accuracy [12]. The random component introduced in DOPE suggests that such a theoretical basis for convergence could exist, but we have to investigate it more deeply. In this section we study the performance of DOPE informally, to obtain a rough estimate of it; since we found interesting results, we are encouraged to look for stronger theoretical results.

To analyse the behaviour of DOPE we tested it on some classical functions [9, 12]. These test functions represent a good benchmark for optimization algorithms: we have smooth functions (i.e. dejoung, zakharov), functions with a narrow valley (i.e. rosenbrock), functions with some far local minima (i.e. goldstein), and functions with many near local minima (i.e. shubert). For the sake

of brevity we report only the analytical formulae for the zakharov and shubert functions:

zakharov(x, n) = Σ_{i=1..n} x_i^2 + ( Σ_{i=1..n} 0.5 i x_i )^2 + ( Σ_{i=1..n} 0.5 i x_i )^4

shubert(x, y) = ( Σ_{i=1..5} i cos((i+1)x + i) ) ( Σ_{i=1..5} i cos((i+1)y + i) )

Figure 3. C-like DOPE scheme.
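For reference, the two formulae translate directly into code (a sketch using the conventional index range i = 1..5 for shubert, as in Reference [9]):

```python
import math

def zakharov(x):
    # sum x_i^2 + (sum 0.5*i*x_i)^2 + (sum 0.5*i*x_i)^4, i = 1..n
    # (smooth; global minimum 0 at the origin)
    s1 = sum(v * v for v in x)
    s2 = sum(0.5 * i * v for i, v in enumerate(x, start=1))
    return s1 + s2 ** 2 + s2 ** 4

def shubert(x, y):
    # product of two sums of i*cos((i+1)*t + i), i = 1..5
    # (many near local minima)
    sx = sum(i * math.cos((i + 1) * x + i) for i in range(1, 6))
    sy = sum(i * math.cos((i + 1) * y + i) for i in range(1, 6))
    return sx * sy
```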

For each function we made 1000 trials with different starting points, obtaining good average values for the number of function evaluations (which we will call fev). Our interest was focused on this value since we are supposed to cope with functions with a high computational cost, although this is not the case for the test functions.

We report the comparison between DOPE and other algorithms. In particular, we considered ECTS (enhanced continuous tabu search), a continuous tabu search algorithm presented in Reference [9], and ESA (enhanced simulated annealing), presented in Reference [13]. The test functions appearing in Table I are described in Reference [9]. The table shows the average number of fev needed by each algorithm to locate the global minimum with the error reported in the fifth column. ECTS has its own stop conditions leading to these results, so DOPE uses the average error of ECTS as a stopping criterion. Without this trick, we would have an algorithm which sometimes reaches its maximum number of iterations with a smaller error distance to the solution.

The performance of DOPE depends strongly on the starting point: guessing a good point, it converges quickly; otherwise it may converge slowly, or it may find lots of local minima before


Table I. Comparison between the average number of function evaluations used by different algorithms (ECTS, ESA) and DOPE.

Cost function    DOPE    ECTS     ESA    Avg. error
rcos               31     245       —    0.05
easom             290    1284       —    0.01
goldstein         248     231     783    2 × 10^−3
shubert           466     370       —    1 × 10^−3
rosenbrock 2      692     480     796    0.02
zakharov 2         81     195   15820    2 × 10^−7
dejoung           131     338       —    3 × 10^−8
hartman 3         135     548     698    0.09
rosenbrock 5     2512    2512    5364    0.08
zakharov 5        424    2254   69799    4 × 10^−6
hartman 6         421    1520    2638    0.05
rosenbrock 10    8695   15720   12403    0.02
zakharov 10      5133    4630       —    2 × 10^−7

Table II. Test function statistics.

Cost function    Min fev   Max fev   Avg fev   Deviation   Deviation (%)
rcos                   2        63        31     9.28063      29
easom                 13      1799       290     290.321     100
goldstein             20      2783       248     298.235     120
shubert               53      3018       466     485.68163   104
rosenbrock 2           7      3291       692     670.359      96
zakharov 2            49       110        81     9.19367      11
dejoung               99       158       131     9.50101       7
hartman 3              1      1580       135     184.362     136
rosenbrock 5         102     24690      2512     3060.8      121
zakharov 5           264       661       424     56.8973      13
hartman 6              7     14269       421     1236.88     293
rosenbrock 10        383    maxfev      8695     6372.36      73
zakharov 10         2958      6874      5133     520.616      10

locating the global one, in which case it needs to trigger the diversification mechanisms, which are time-expensive. Trials with different starting points can behave very differently: this is mathematically described by the large deviation values obtained by analysing a set of trials, as shown in Table II. The percentage reported in the last column is the deviation normalized by the corresponding average value.

We noted that slow convergence is mainly due to long improving phases with a very small step; e.g. on the rosenbrock function, if DOPE falls into the valley of the minimum, it follows the path to the minimum quite slowly. Anyway, DOPE was initially designed for engineering problems, so the capability to discover and avoid many local minima is an important feature, while the shape of the cost functions is rarely as 'evil' as the rosenbrock function. Since DOPE has a modular architecture, we could improve the choice procedure (based on the search among compound and basic moves) by introducing more complex models of the cost function (a gradient method is an interesting first approach under investigation).
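The statistics of Table II can be reproduced from the per-trial fev counts as follows (a sketch; the use of the population rather than the sample deviation is our assumption):

```python
import math

def fev_statistics(fevs):
    """Per-function statistics as in Table II: min, max, average,
    standard deviation and deviation normalized by the average (%)."""
    n = len(fevs)
    avg = sum(fevs) / n
    dev = math.sqrt(sum((f - avg) ** 2 for f in fevs) / n)  # population dev.
    return min(fevs), max(fevs), avg, dev, 100.0 * dev / avg
```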


Table III. Main DOPE parameters.

Name        Description                                              Default value
delta0      Initial (upper bound) step size value                    0.25
tenure      Size of the moves tabu list                              1
tenure2     Size of the states tabu list                             3
build       Number of basic moves defining a compound move           Dimension
comp        Size of the stored compound moves list                   Dimension
maxnonimp   Maximum number of accepted non-improving moves           3
loop        Maximum number of bad diversifications before a restart  6
res         Step resolution                                          10^−5
err         Tolerance                                                10^−4
maxfev      Maximum number of function evaluations                   200 000

In the next section we report a detailed description of all the main parameters and the tuning strategies adopted to obtain the shown results.

5. DOPE PARAMETERS DESCRIPTION AND TUNING STRATEGIES

The proposed algorithm has about ten important tuning parameters, reported in Table III. We investigated the influence of these parameters in order to set the most important ones to default values and obtain good typical performance. Some default values are not fixed but depend on the dimension of the problem: in these cases the relation was found empirically.

DOPE parameters can be divided into two groups. The first is used for statistical purposes only: it contains parameters to choose the test function, its range and its dimension, the number of tests to perform, and so on. For the sake of brevity we do not consider these any further. We focus instead on the second group, which affects the performance of the algorithm. We give a full description of these parameters, with some remarks on their sensitivity and on how to find good values.

• delta0 defines the initial step size for the algorithm and also represents the upper bound when the step is increased. From Table IV we note that this is quite an influential parameter. Considering a normalized domain in [0, 1], when the starting point is the middle of the domain and delta0 = 0.5, the bounds along the different dimensions are visited during the second step, while with delta0 = 1 the starting point is wrapped onto itself. A very small delta0 value forces the initial search into a small region, but generally this is not a good idea; in all our tests delta0 = 0.25. In any case delta0 should be smaller than 1.

• tenure fixes the size of the moves tabu list. Thanks to the tabu list on states, the size of this list can be kept at 1, in order to avoid the inverse move.

• tenure2 defines the size of the states tabu list.

The role of the tabu lists was investigated by analysing the performance of DOPE with different values for the two lists. At first look the obtained results could be misleading: in fact the benefits of the tabu lists are few, and sometimes the algorithm works well even without them. We found a motivation for this. As is well known, we would remain trapped in a local minimum if during the search we accepted a non-improving move and, in the following iteration, the best improving move of the neighbourhood. Without a tabu list we could cycle between these two points; since DOPE


Table IV. Tests with different delta0 values.

delta0          0.01    0.05     0.1    0.25     0.5       1
rcos              35      26      26      30      32      35
hartman 3        410     321     301     132     140     117
dejoung 10       156     138     136     735    1140     945
rosenbrock 2    2987    2696    2343    2455    2461    2552
zakharov 2       120      85      82      78      79      82
shubert         7109     428     136     470     456     433
goldstein        346     206     145     203     189     180
easom          50193    1573     988     301     287     307
zakharov 5       795     515     437     402     386     395
hartman 6       2122    1229     727     594    2029    1901
rosenbrock 5    5762    2876    3555    3062    3322    3154
zakharov 10     6926    4949    4716    4600    4364    4348
rosenbrock 10  12442    9077   10102   12708   12547   11744

accepts a simple improving move during the improving phase, and not the best one, in these cases we can avoid the local minimum trap. Moreover, we can explain why the performance is not a monotonic function of the tabu list size: by enlarging the tabu list we avoid repeating some moves, but if the list is too large we may also prune some good directions. Table V shows the behaviour of DOPE on some test functions with different values of the tabu tenures.

There is still no theoretical analysis of the best value for the tabu tenure: we can only state that it depends on the function to minimize and on the dimension of the problem; for large-dimension problems DOPE performs better with a large tenure.

• build is the number of basic moves used to define a compound move; the default value of this parameter depends on the dimension of the problem and is fixed equal to it. A larger value is suggested for problems where the parameters are highly correlated: in this way good performance was obtained by avoiding following the coordinate axes with the basic moves. Table VI shows some examples: we found some cases where DOPE performs better on average but worse in the worst case. If build is larger than the dimension, we could define compound moves which behave like a doubling of the step size.

• comp is the size of the stored compound moves list. At each iteration the compound list is scanned, so its size can affect the performance significantly (see Table VI); when the size is set to 0 the use of compound moves is inhibited. The default value is parametric in the dimension of the problem and is fixed equal to it.

• maxnonimp defines the maximum number of non-improving moves accepted during the non-improving phase before starting a new monotonic phase. Small values can force diversification and reinitialization to escape the found local minima. From Table VII we can see that the good value is correlated with the function to optimize, but it is quite difficult to forecast a good value without any trial. Anyway, a good rule of thumb is to try different values only after setting the tenure, tenure2, build and comp parameters.

• loop tunes the restart mechanism: it defines the number of times the diversification procedure is allowed to yield an already known minimum before the reinitialization mechanism is triggered. We always used the default value loop = 6, but a smaller value could improve the performance when we expect many local minima. On the other hand, if there is a small number of local minima, a larger number of diversifications does not guarantee to find new minima

Copyright © 2001 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2001; 50:665–680


Table V. Tests with different tabu tenures.

tenure2  tenure   rcos  goldstein  hartman 6  zakharov 10
0        0        37    231        419        5596
0        1        31    246        502        5250
0        2        29    239        536
0        3        32    257        627
0        4                         518
0        5                         481        5013
0        10                                   4618
1        0        34    259        446        5383
1        1        33    279        520        5373
1        2        32    227        332
1        3        31    259        426
1        4                         407
1        5                         229        5037
1        10                                   4616
2        0        32    307        425        5412
2        1        31    305        343        5262
2        2        30    237        508
2        3        30    252        543
2        4                         475
2        5                         269        5116
2        10                                   4617
3        0        31    247        432        5326
3        1        33    242        303        5249
3        2        31    157        598
3        3        30    216        315
3        4                         400
3        5                         418        5003
3        10                                   4573
7        0        31    163        412        5361
7        1        31    171        462        5410
7        2        29    200        607
7        3        30    198        460
7        4                         504
7        5                         993        5057
7        10                                   4542

because it could favour the fall on a previously found local minimum more than other diversification methods do.

• res defines the resolution of the problem, i.e. the smallest step the algorithm accepts before stating it has found a local minimum. This parameter too can strongly affect the performance. The default value is 10^-5; greater values could lead to finding false minima, while smaller values could increase the time of convergence.
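The role of res can be illustrated with a step-halving sketch (an assumption about the mechanism, not the authors' code): the exploration step shrinks whenever no neighbour improves, and a local minimum is declared once the step falls below res.

```python
def refine(f, x, step, res=1e-5):
    """Shrink the exploration step until it drops below the
    resolution `res`, then declare x a local minimum."""
    while step > res:
        for cand in (x - step, x + step):
            if f(cand) < f(x):
                x = cand                # improving neighbour found
                break
        else:
            step /= 2.0                 # no improvement: halve the step
    return x

xmin = refine(lambda x: (x - 1.0) ** 2, x=0.0, step=1.0)
print(xmin)  # → 1.0
```

A larger res stops this loop earlier (risking a false minimum), while a smaller res adds extra halving iterations and function evaluations.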

• err defines the accepted distance from the known minimum value, when such a value is available; this situation frequently happens in engineering problems where the optimization problem is a minimization whose cost function is always positive and has a theoretical minimum of zero. Obviously, this parameter


can strongly affect the performance, and its value depends both on the specific problem and on the maxfev parameter.

• maxfev defines the maximum number of function evaluations. If the value is too small, the algorithm may not find the global minimum, so the maxfev value must be enlarged when a small value of err is set. We always used the default value of 200 000, but a parametric value depending on the dimension of the problem could be chosen.

We found a good procedure to tune all these parameters; this is not a very difficult task because some of them can be set in a specific order: after choosing err, res and maxfev are derived; then, setting all other parameters to their default values, we suggest to fix the size of the two tenure lists, next fix comp and build, and finally tune maxnonimp and loop.
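The suggested tuning sequence can be written down explicitly; the default values below are those quoted in the text, while the container itself is just an illustrative sketch:

```python
def default_settings(dim):
    """Defaults quoted in the text; the tenure sizes and err are
    problem-dependent and are therefore left unset here."""
    return {
        "res": 1e-5,        # resolution of the problem
        "maxfev": 200_000,  # maximum number of function evaluations
        "build": dim,       # compound move length, default = dimension
        "comp": dim,        # compound list size, default = dimension
        "loop": 6,          # diversifications before reinitialization
    }

# Order in which the text suggests fixing the parameters:
TUNING_ORDER = ["err", "res", "maxfev",
                "tenure", "tenure2", "comp", "build",
                "maxnonimp", "loop"]

print(default_settings(10)["build"])  # → 10
```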

6. AN EXAMPLE OF REAL APPLICATION: DEVICE PARAMETER EXTRACTION

A critical problem for all circuit designers is how to define or extract the values of a complete set of device model parameters to guarantee correct numerical simulation results. Nowadays, all advanced MOSFET models incorporate various physical and electrical effects, such as short channel, channel length modulation or weak inversion effects, and use smoothing functions to guarantee a good behaviour near transition voltages such as the saturation voltage [14]. As a result, a large number of process, electrical and empirical parameters are involved in the model equations, possibly in a closely correlated way. The main goal of device modelling is to determine a model which predicts device behaviour in

reasonable agreement with experimental (measured) data; device parameter extraction is a typical multiminima optimization problem on a multidimensional continuous space.

There are two main approaches to describe the device behaviour: empirical and physical.

Empirical models are analytical expressions of a curve-fitting nature with no physical background. The great advantage of this approach is a short development time, but a curve-fitting approach excludes forecasting abilities, geometrical scaling and proper correlation between parameters.

On the other hand, physical models consist of a set of analytical functions derived from physical inspection of the device structure, usually valid only in a certain range of bias conditions. For this reason, smoothing procedures of a curve-fitting nature are applied to ensure a continuous transition from one region to another, even if these parameters can introduce annoying local minima. Obviously, a good set of parameters is as important as a good model; unfortunately, the set of parameter values is not unique because various device phenomena, as described by certain parameters, cannot always be distinguished clearly from each other in the measured characteristics, so an optimization strategy is mandatory.

The parameter extraction problem consists in detecting the right values for unknown constants (the parameters) embedded in device models. To obtain a better forecast of device behaviour we have to fix these values by comparing real and simulated data, trying to minimize the error defined by the objective function

f(p) = Σ_{i=1}^{Ncurves} (wi / Ncurves) Σ_{j=1}^{Npointsi} (vij / Npointsi) (Idmij − Idsij(p))²
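A direct transcription of this objective could look like the sketch below, where `simulate` stands in for the (expensive) device simulator and the data layout is assumed for illustration:

```python
import numpy as np

def objective(p, curves, simulate):
    """f(p): weighted total least-squares error between measured and
    simulated drain currents. `curves` holds (w_i, v_ij, Idm_ij) per
    characteristic curve; `simulate(p, i)` returns Ids_ij(p)."""
    ncurves = len(curves)
    total = 0.0
    for i, (w, v, idm) in enumerate(curves):
        ids = simulate(p, i)
        npoints = len(idm)
        total += (w / ncurves) * np.sum((v / npoints) * (idm - ids) ** 2)
    return total

# Toy check with a linear "device": Ids = p[0] * Vds
vds = np.array([0.5, 1.0, 1.5])
curves = [(1.0, np.ones(3), 2.0 * vds)]      # "measured" with p = 2
simulate = lambda p, i: p[0] * vds
print(objective(np.array([2.0]), curves, simulate))  # → 0.0
```

With the exact parameter the error vanishes, and any mismatch makes the weighted squared residuals strictly positive, as required of a fitness function with a theoretical minimum of zero.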


Table VI. Tests with different build and comp.

rosenbrock 2
comp \ build      1      2      3      4
0              5545   5766   5647   6067
1              6846   4383   2249   1383
2              6069   2687   1196    686
3              6255   2500   1213    670
4              6192   2319   1093    664

goldstein
comp \ build      1      2      3      4
0              1078   1185   1150   1157
1              1068    282    340    349
2               909    240    338    303
3               821    255    296    298
4               873    263    323    296

rosenbrock 5
comp \ build      1      2      5     10
0              9476  10202  10329  10385
1             12277   8357   4503   2796
2             12915   6525   3715   2335
5             13062   5862   3446   2407
10            13322   6507   3510   2343

rosenbrock 10
comp \ build      1      5     10     20
0             31295  30052  31791  31154
1             36035  14888  12703   9807
5             35821  13067  11160   9108
10            34099  14154  11844   8570
20            35258  13727  11151   9567

Table VII. Tests with different maxnonimp values.

maxnonimp         0      1      2      3      5      7     10
rcos             30     30     30     30     30     30     30
hartman 3       179    187    139    130    146    135    140
dejoung         535    536    336    336    736    536    335
rosenbrock 2   2284   2380   2371   2293   2491   2480   2278
zakharov 2       77     77     77     78     78     77     78
shubert         303    480    446    429    436    448    432
goldstein       257    169    193    204    184    201    177
easom           253    293    108    348    327    317    333
zakharov 5      402    398    399    399    403    403    401
hartman 6       371    634    589    444    389    403    335
rosenbrock 5   2614   2782   3495   3330   3190   3251   3495
zakharov 10    4591   4609   4590   4611   4597   4604   4616
rosenbrock 10 11434  19647  11642  11452  12240  13045  11811


Figure 4. Comparison between measured Id-Vds data (dots) and simulated curves (lines) for a 0.35 μm MOSFET with the MM9 MOS model.

where p=(p1, ..., pN) is the model parameter vector, wi and vij are weight functions, Ncurves is the number of characteristic curves used, Npointsi is the number of points which defines each curve, and Idmij and Idsij are the measured and simulated drain current values, respectively. The typical objective function is formulated as the total least-squares error between the measured and simulated drain currents; the individual differences can be weighted in the computation in order to emphasize the importance of some parameters. In practice, the value of Ncurves depends on the device model adopted, and K = Σ_{i=1}^{Ncurves} Npointsi is determined by the number of devices subjected to the parameter extraction work together with the corresponding set of bias conditions. In general, Ncurves < 50 and 10 ≤ K ≤ 10^4.

Parameter extraction, like other engineering problems, is quite different from the traditional problems

considered by mathematicians, where the functions to optimize are often analytical and quite cheap to evaluate. In our case the function to optimize is an expensive fitness function which measures the error between experimental and simulated data; generally it is unknown and has a high computational cost. These features suggest the use of non-conventional algorithms which do not employ gradients and

can avoid local minima. We used the proposed algorithm as an efficient and reliable method for general semiconductor device parameter extraction procedures, and we investigated the performance of DOPE integrated in TYPOO (TinY Parameter Optimizer), a compact extraction tool written in C. We used a set of different measured data to find out the parameters of the Philips MM9 model for

both the single and the multiple device case of n-channel MOS transistors [15]. A large amount of work on MOSFET device parameter extraction has been performed and quite good results have been obtained. In Figures 4 and 5 we show two examples of the fitted drain current Id for a single 0.35 μm

channel length MOSFET. The curve fitting error of each curve in this case ranges from 0.1 per cent to 4 per cent, while the average value is about 2 per cent.

7. CONCLUSIONS

We addressed the question whether it is possible to build a tabu search algorithm for the optimization of continuous problems, and we applied it to the problem of parameter extraction for submicrometric device models.


Figure 5. Comparison between measured Id-Vgs data (dots) and simulated curves (lines) for a 0.35 μm MOSFET with the MM9 MOS model; the small frame shows the same graphic in logarithmic scale.

Our result is DOPE, a tabu-search-inspired algorithm that follows many tabu search guidelines but also differs considerably in the behaviour of its tabu lists. Its performance is strongly related to the initial point, but it should converge to the global minimum. We think tabu search algorithms are better suited for combinatorial problems; nevertheless DOPE

can be tuned to work well on specific problems, and this encourages further studies about the use of tabu search techniques on continuous domain problems.

ACKNOWLEDGEMENTS

We thank Andrea Lodi for the help and suggestions given while writing this paper.

REFERENCES

1. Glover F. Tabu search—Part I. ORSA Journal on Computing 1989; 1(3):190–206.
2. Glover F. Tabu search—Part II. ORSA Journal on Computing 1990; 2(1):4–32.
3. Glover F, Laguna M. Tabu Search. Kluwer Academic Publishers: MA, USA, 1997.
4. Glover F. Candidate list strategy and tabu search. Technical Report, Center for Applied Artificial Intelligence, University of Colorado Graduate School of Business, July 1989.
5. Battiti R, Tecchiolli G. The reactive tabu search. ORSA Journal on Computing 1994; 6(2):126–140.
6. Battiti R, Tecchiolli G. The continuous reactive tabu search: blending combinatorial optimization and stochastic search for global optimization. Annals of Operations Research 1996; 63:153–188.
7. Hu N. Tabu search method with random moves for globally optimal design. International Journal for Numerical Methods in Engineering 1992; 35:1055–1070.
8. Siarry P, Berthiau G. Fitting of tabu search to optimize functions of continuous variables. International Journal for Numerical Methods in Engineering 1997; 40:2449–2457.
9. Chelouah R, Siarry P. Enhanced continuous tabu search: an algorithm for optimizing multiminima functions. In Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, Voss S, Martello S, Osman IH, Roucairol C (eds). Kluwer Academic Publishers: Dordrecht, 1999; 49–61.
10. Kirkpatrick S, Gelatt CD Jr, Vecchi MP. Optimization by simulated annealing. Science 1983; 220:671–680.
11. Ingber L. Adaptive simulated annealing (ASA): lessons learned. Control and Cybernetics 1996; 25(1):33–54.
12. Dixon LCW, Szegő GP (eds). Towards Global Optimisation 2. North-Holland: Amsterdam, 1978.
13. Berthiau G, Siarry P. An enhanced simulated annealing algorithm for extracting electronic component model parameters. Advances in Engineering Software 1993; 18:171–176.
14. de Graaff HC, Klaassen FM. Compact Transistor Modelling for Circuit Design. Springer: Vienna, 1990.
15. Klaassen DBM, Velghe RMDA, Klaassen FM. MOS Model 9. Technical Report NL-UR 003/94, Philips Nat. Lab. Unclassified Report, 1994.
