[IEEE 2013 Sixth International Conference on Advanced Computational Intelligence (ICACI) - Hangzhou, China (2013.10.19-2013.10.21)] 2013 Sixth International Conference on Advanced Computational Intelligence (ICACI) - An effective improvement of JADE for real-parameter optimization


<ul><li><p>2013 Sixth International Conference on Advanced Computational Intelligence, October 19-21, 2013, Hangzhou, China </p>
<p>An Effective Improvement of JADE for Real-parameter Optimization </p>
<p>Chunjiang Zhang and Liang Gao </p>
<p>Abstract-Although metaheuristics cannot guarantee finding the optimum in global optimization, they are efficient in practice, especially for problems that are very difficult to optimize with traditional optimization methods. The differential evolution (DE) algorithm is one of the most competitive metaheuristics, and the adaptive DE with optional external archive (JADE) is an excellent DE variant. Based on an analysis of the shortcomings of JADE, an effective improvement of JADE is put forward in this paper. In the improved JADE, two parameters can be reinitialized and two new mutation strategies are added. The 28 benchmark problems of the competition on real-parameter single objective optimization at the 2013 IEEE Congress on Evolutionary Computation (CEC 2013) are used to test the performance of the proposed algorithm. Comparison of the results with those of DE/rand/1 and the original JADE shows that the improvement is effective. </p>
<p>Keywords-Differential evolution; Improved JADE; Real-parameter optimization </p>
<p>I. INTRODUCTION </p>
<p>Differential evolution (DE) is a powerful and simple stochastic algorithm for real-parameter optimization. DE has become a bright star among nature-inspired metaheuristics since it was first put forward by R. Storn and K. V. Price in 1995 [1]. DE and DE-based algorithms have always ranked among the top performers in previous IEEE Congress on Evolutionary Computation (CEC) competitions. For example, classical DE ranked second and SaDE (self-adaptive DE) ranked third on the 10-D problems in the CEC 2005 competition on real-parameter optimization. 
In the review paper [3], four reasons why researchers regard DE as an attractive optimization tool are pointed out: 1) DE is simple; 2) DE is powerful; 3) DE has very few control parameters; 4) the space complexity of DE is low. For higher accuracy and efficiency, many DE variants such as SaDE [4], jDE [5], DEGL (DE with global and local neighborhoods) [6], JADE (adaptive DE with optional external archive) [7], and CoDE (DE with composite trial vector generation strategies and control parameters) [8] have been proposed in recent years. Although these DE variants are more powerful than classical DE in some respects, they still have shortcomings and leave room for improvement. This paper focuses on JADE. After analyzing the drawbacks of JADE, a simple, straightforward and effective improvement (IJADE) is made. The benchmark problems of the competition on real-parameter single objective optimization at CEC 2013 are used to test the performance of the improved algorithm. </p>
<p>This research work is supported by the National Basic Research Program of China (973 Program) under Grant No. 2011CB706804 and the Natural Science Foundation of China (NSFC) under Grant No. 51121002. </p>
<p>The authors are with the State Key Laboratory of Digital Manufacturing Equipment &amp; Technology, Huazhong University of Science and Technology, Wuhan 430074, PR China (e-mails: zh chj@gg.com; gaoliang@mail.hust.edu.cn). </p>
<p>978-1-4673-6343-3/13/$31.00 ©2013 IEEE </p>
<p>The remainder of this paper is organized as follows: DE and JADE are introduced in Section II. Section III provides the analysis of the drawbacks of JADE and the improvement of JADE. Experimental results are presented and discussed in Section IV. In Section V, conclusions are drawn. </p>
<p>II. 
DE AND JADE </p>
<p>A. DE </p>
<p>Like other evolutionary algorithms, DE maintains a population of NP individuals. Each individual is a D-dimensional vector representing a candidate solution. The successive generations in DE are denoted by \(G = 0, 1, \ldots, G_{\max}\). The ith individual of the population at the current generation is denoted as </p>
<p>\[ \vec{X}_{i,G} = [x_{i,1,G}, x_{i,2,G}, \ldots, x_{i,D,G}] \tag{1} \] </p>
<p>At the beginning of DE, an initial population is generated by sampling individuals uniformly at random within the feasible search space. For example, the jth component of the ith individual is generated at G = 0 as </p>
<p>\[ x_{i,j,0} = x_{j,\min} + \mathrm{rand}_{i,j}(0,1) \cdot (x_{j,\max} - x_{j,\min}) \tag{2} \] </p>
<p>where \(x_{j,\min}\) and \(x_{j,\max}\) are the minimum and maximum bounds of the jth dimension and \(\mathrm{rand}_{i,j}(0,1)\) is a uniformly distributed random number between 0 and 1. </p>
<p>After initialization, DE performs three steps at each generation G: mutation, crossover and selection. In mutation, DE creates a mutant vector \(\vec{V}_{i,G} = (v_{i,1,G}, v_{i,2,G}, \ldots, v_{i,D,G})\) for each population member \(\vec{X}_{i,G}\). The five most frequently used mutation strategies are listed as follows. 
</p>
<p>1) "DE/rand/1": \[ \vec{V}_{i,G} = \vec{X}_{r_1,G} + F_i \cdot (\vec{X}_{r_2,G} - \vec{X}_{r_3,G}) \tag{3} \] </p>
<p>2) "DE/best/1": \[ \vec{V}_{i,G} = \vec{X}_{best,G} + F_i \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G}) \tag{4} \] </p>
<p>3) "DE/current-to-best/1": \[ \vec{V}_{i,G} = \vec{X}_{i,G} + F_i \cdot (\vec{X}_{best,G} - \vec{X}_{i,G}) + F_i \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G}) \tag{5} \] </p>
<p>4) "DE/best/2": \[ \vec{V}_{i,G} = \vec{X}_{best,G} + F_i \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G}) + F_i \cdot (\vec{X}_{r_3,G} - \vec{X}_{r_4,G}) \tag{6} \] </p></li><li><p>5) "DE/rand/2": \[ \vec{V}_{i,G} = \vec{X}_{r_1,G} + F_i \cdot (\vec{X}_{r_2,G} - \vec{X}_{r_3,G}) + F_i \cdot (\vec{X}_{r_4,G} - \vec{X}_{r_5,G}) \tag{7} \] </p>
<p>In the above equations, \(r_1, r_2, r_3, r_4\) and \(r_5\) are distinct integers randomly selected from the range [1, NP] and also different from i, and \(\vec{X}_{best,G}\) is the best individual in the current population. In classic DE, \(F_i = F\) is a fixed positive parameter, called the scaling factor, that amplifies difference vectors such as \(\vec{X}_{r_1,G} - \vec{X}_{r_2,G}\). In many improved DE variants, JADE for example, each individual i has its own scaling factor \(F_i\). </p>
<p>After mutation, a crossover operator is applied to \(\vec{X}_{i,G}\) and \(\vec{V}_{i,G}\) to generate a trial vector \(\vec{U}_{i,G} = (u_{i,1,G}, u_{i,2,G}, \ldots, u_{i,D,G})\). The DE family can use two kinds of crossover schemes, exponential and binomial. In this paper, only the binomial crossover is used. Under binomial crossover, the trial vector is obtained as </p>
<p>\[ u_{i,j,G} = \begin{cases} v_{i,j,G}, &amp; \text{if } \mathrm{rand}_{i,j}(0,1) \le CR_i \text{ or } j = j_{rand} \\ x_{i,j,G}, &amp; \text{otherwise} \end{cases} \tag{8} \] </p>
<p>where \(i = 1, 2, \ldots, NP\), \(j = 1, 2, \ldots, D\), \(j_{rand}\) is a randomly chosen integer in [1, D], and \(\mathrm{rand}_{i,j}(0,1)\) is a uniformly distributed random number between 0 and 1. In many adaptive DE variants, a crossover probability \(CR_i\) is associated with each individual and may vary from generation to generation. </p>
<p>The selection operator determines whether the target or the trial vector survives to the next generation after crossover. 
For a minimization problem, it is expressed as follows: </p>
<p>\[ \vec{X}_{i,G+1} = \begin{cases} \vec{U}_{i,G}, &amp; \text{if } f(\vec{U}_{i,G}) \le f(\vec{X}_{i,G}) \\ \vec{X}_{i,G}, &amp; \text{otherwise} \end{cases} \tag{9} \] </p>
<p>The above three steps are repeated generation after generation until a termination criterion is satisfied. </p>
<p>B. JADE </p>
<p>JADE is a DE variant that implements a new mutation strategy, "DE/current-to-pbest", with an optional external archive and updates the control parameters in an adaptive manner. An earlier version was presented by Zhang and Sanderson at CEC 2007 [9], and the journal article on JADE was published in 2009 [7]. JADE is very competitive among the DE variants. In the article on another DE variant, CoDE [8], the experimental results of five DE variants (JADE, jDE, SaDE, EPSDE and CoDE) on the CEC 2005 benchmark problems for real-parameter single objective optimization showed that JADE ranked second and was only slightly inferior to CoDE. The key points of JADE are introduced as follows. </p>
<p>1) DE/current-to-pbest/1 </p>
<p>In JADE, a new mutation strategy named DE/current-to-pbest/1 was put forward. DE/current-to-pbest/1 is generalized from DE/current-to-best/1 in order to enhance its global exploration ability. The new mutation strategy has two schemes. In the first one, without the optional archive, a mutation vector is generated as </p>
<p>\[ \vec{V}_{i,G} = \vec{X}_{i,G} + F_i \cdot (\vec{X}^{p}_{best,G} - \vec{X}_{i,G}) + F_i \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G}) \tag{8} \] </p>
<p>where \(\vec{X}^{p}_{best,G}\) is randomly chosen from the top 100p% individuals with \(p \in (0, 1]\). Each individual i has its own \(F_i\), which is updated at each generation in an adaptive manner introduced later. Comparing Equation (5) and Equation (8), the only difference is that \(\vec{X}_{best,G}\) has a superscript p in Equation (8). If the top 100p% of the population consists of a single individual (that is, p = 1/NP), the two mutation strategies are the same. 
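</p>
<p>The first scheme can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name <code>current_to_pbest_1</code>, its argument names, and the rounding used to turn 100p% into a number of individuals are hypothetical and are not taken from JADE as published. Minimization is assumed, and the population must hold at least three individuals. </p>

```python
import random


def current_to_pbest_1(pop, fitness, i, f_i, p, rng):
    """Sketch of DE/current-to-pbest/1 without archive (minimization).

    pop is a list of equal-length lists of floats, fitness[k] is the
    objective value of pop[k], i is the index of the target individual,
    f_i its scaling factor and p the fraction in (0, 1].
    """
    np_size = len(pop)
    # Indices ordered from best (lowest objective value) to worst.
    order = sorted(range(np_size), key=lambda k: fitness[k])
    # Size of the "top 100p%" set; the rounding rule is an assumption.
    n_top = max(1, round(p * np_size))
    pbest = rng.choice(order[:n_top])
    # r1 and r2 are distinct and both differ from i.
    r1, r2 = rng.sample([k for k in range(np_size) if k != i], 2)
    return [
        x + f_i * (xp - x) + f_i * (a - b)
        for x, xp, a, b in zip(pop[i], pop[pbest], pop[r1], pop[r2])
    ]
```

<p>In this sketch, choosing p so that the top set holds a single individual makes the computation coincide with DE/current-to-best/1 in Equation (5). 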
</p>
<p>In the second scheme, DE/current-to-pbest/1 with archive, a mutation vector is generated in the following manner: </p>
<p>\[ \vec{V}_{i,G} = \vec{X}_{i,G} + F_i \cdot (\vec{X}^{p}_{best,G} - \vec{X}_{i,G}) + F_i \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G}) \tag{9} \] </p>
<p>where \(\vec{X}_{r_2,G}\) is randomly selected from the union of the current population P and an archive population A. The archive A is initialized to be empty. Then, after each generation, the parent solutions that fail in the selection process are added to the archive. If the size of the archive exceeds a given threshold, some individuals are randomly removed from it. </p>
<p>2) Parameter Adaptation </p>
<p>At each generation, the scaling factor \(F_i\) of each individual \(\vec{X}_{i,G}\) is generated according to a Cauchy distribution with location parameter \(\mu_F\) and scale parameter 0.1, </p>
<p>\[ F_i = \mathrm{randc}_i(\mu_F, 0.1) \tag{10} \] </p>
<p>and then set to 1 if \(F_i &gt; 1\) or regenerated if \(F_i &lt; 0\). Let \(S_F\) denote the set of successful scaling factors in each generation. \(\mu_F\) is updated at the end of each generation as follows: </p>
<p>\[ \mu_F = (1 - c) \cdot \mu_F + c \cdot \mathrm{mean}_L(S_F) \tag{11} \] </p>
<p>where \(\mathrm{mean}_L(S_F)\) is the Lehmer mean </p>
<p>\[ \mathrm{mean}_L(S_F) = \frac{\sum_{F \in S_F} F^2}{\sum_{F \in S_F} F} \tag{12} \] </p>
<p>The crossover probability \(CR_i\) of each individual i is generated according to a normal distribution with mean \(\mu_{CR}\) and standard deviation 0.1, </p>
<p>\[ CR_i = \mathrm{randn}_i(\mu_{CR}, 0.1) \tag{13} \] </p>
<p>and then truncated to [0, 1]. The mean \(\mu_{CR}\) is initialized to 0.5 and updated at the end of each generation as </p>
<p>\[ \mu_{CR} = (1 - c) \cdot \mu_{CR} + c \cdot \mathrm{mean}_A(S_{CR}) \tag{14} \] </p>
<p>where \(S_{CR}\) is the set of all successful crossover probabilities \(CR_i\) at the current generation, \(\mathrm{mean}_A(\cdot)\) is the arithmetic mean, and c is a positive constant between 0 and 1. </p></li><li><p>III. THE IMPROVED JADE </p>
<p>JADE is an outstanding algorithm; however, there is room for improvement. The analysis of the deficiencies of JADE and the improvement is presented below. </p>
<p>A. Reinitialization of \(\mu_F\) and \(\mu_{CR}\) </p>
<p>First, the parameter adaptation mechanism might fail when the problem is very hard to optimize. 
If no individual is updated in a generation, the set of successful scaling factors \(S_F\) and the set of successful crossover probabilities \(S_{CR}\) are empty, and \(\mu_F\) and \(\mu_{CR}\) cannot be updated either. This situation is not considered in [7]. However, when JADE is used to solve the real-parameter single objective benchmark functions of CEC 2013, this situation happens often, especially for multimodal functions and composition functions. For example, because some composition functions have different properties around different local optima, different parameters are needed. Under a normal distribution with mean \(\mu_{CR} = 0.9\) and standard deviation 0.1, the probability of generating a \(CR_i\) value less than 0.6 is just 0.0013, and the probability of generating a \(CR_i\) value less than 0.5 is near zero. If \(\mu_{CR}\) is 0.9 and it cannot be updated unless a \(CR_i\) value less than 0.5 is generated, \(\mu_{CR}\) will practically never be updated and the result cannot be improved any more. To avoid this situation, the improved JADE reinitializes \(\mu_{CR}\) and \(\mu_F\) when certain conditions are met. The details of the implementation are given in the last part of this section. </p>
<p>B. Newly Added Mutation Strategies </p>
<p>Second, there is only one mutation strategy in JADE. It is undeniable that the DE/current-to-pbest/1 mutation strategy in JADE cannot be replaced by other frequently used mutation strategies such as those in Equations (3) to (7). This does not mean that the new mutation strategy beats all the others on all problems. If JADE fails on a problem, it will probably perform better when another mutation strategy is adopted. Many DE variants that use several mutation strategies (such as SaDE [4], EPSDE [10] and CoDE [8]) are known to perform well. We therefore add more mutation strategies to JADE. The first mutation strategy added is DE/rand/1, which is the most commonly used strategy in the literature. 
It has no bias toward any particular search direction, which leads to better perturbation than DE/current-to-pbest/1. The second mutation strategy added is DE/rand/2/dir [11], which incorporates objective function information to guide the direction of the mutation vector as </p>
<p>\[ \vec{V}_{i,G} = \vec{X}_{r_1,G} + \frac{F_i}{2} \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G} - \vec{X}_{r_3,G}) \tag{15} \] </p>
<p>where \(r_1, r_2, r_3\) are distinct integers, different from i, randomly selected from the range [1, NP] and subject to \(f(\vec{X}_{r_1,G}) &lt; \min\{ f(\vec{X}_{r_2,G}), f(\vec{X}_{r_3,G}) \}\). According to the conclusion of Mezura-Montes et al. [12], who compared eight DE schemes on a test suite of 13 benchmark problems, DE/rand/2/dir remained the most competitive and was slightly faster in reaching the global optimum on multimodal and non-separable functions. Another reason for choosing it is that when all the individuals are almost at the same position, only DE/rand/2/dir can still generate a new mutation vector: the pairwise difference vectors of the other strategies vanish, whereas the term \(\vec{X}_{r_1,G} - \vec{X}_{r_2,G} - \vec{X}_{r_3,G}\) in general does not. </p>
<p>C. Implementation of the Improved JADE </p>
<p>In the improved JADE (IJADE), MSch is a parameter that determines the mutation strategy. It takes one of three values, 0, 1 and 2, which represent the three mutation strategies. MSch is initialized to 0, which means that DE/current-to-pbest/1 is retained at the beginning. The parameter adaptation mechanism of JADE is still used. If at least one individual is updated at every generation, IJADE is JADE itself. </p>
<p>To determine when to use the reinitialization of \(\mu_F\) and \(\mu_{CR}\) and the new mutation strategies, three control variables NoF1, NoF2 and NoF3 are added to IJADE. They have corresponding thresholds TH1, TH2 and TH3. NoF1 is the number of successive generations in which no individual is updated. Only when NoF1 reaches its threshold TH1 is the reinitialization of \(\mu_F\) and \(\mu_{CR}\) executed. The new values are selected randomly from 0.167, 0.5 and 0.833. 
NoF2 records the number of times NoF1 has reached TH1. If NoF2 reaches its threshold TH2, a value different from the current one is selected for MSch. For example, if MSch = 0, it is set to 1 or 2 at random, which means that DE/rand/1 or DE/rand/2/dir becomes the new mutation strategy. Similarly, NoF3 records the number of times NoF2 has reached TH2. If NoF3 reaches TH3, the reinitialization of \(\mu_F\) and \(\mu_{CR}\) and the new mutation strategies have had no effect. In this case, the population has likely become stuck in a local optimum, so the population is reinitialized in IJADE. </p>
<p>The pseudo code of the improved JADE with archive is shown as follows. It is based on the original version in [7]. The major differences lie in lines 08 to 13 and lines 34 to 46. </p>
<p>Line# Procedure of Improved JADE with Archive </p>
<p>01 Begin </p>
<p>02 Set \(\mu_{CR} = 0.5\); \(\mu_F = 0.5\); NoF1 = NoF2 = NoF3 = 0 </p>
<p>03 Set MSch = 0; \(A = \emptyset\) </p>
<p>04 Create a random population \(\{\vec{X}_{i,0} \mid i = 1, 2, \ldots, NP\}\) </p>
<p>05 For G = 1 to \(G_{\max}\) </p>
<p>06 \(S_F = \emptyset\); \(S_{CR} = \emptyset\) </p>
<p>07 For i = 1 to NP </p>
<p>08 If MSch = 0 </p>
<p>09 \(\vec{V}_{i,G} = \vec{X}_{i,G} + F_i \cdot (\vec{X}^{p}_{best,G} - \vec{X}_{i,G}) + F_i \cdot (\vec{X}_{r_1,G} - \vec{X}_{r_2,G})\) </p>
<p>10 Else if MSch = 1 </p>
<p>11 V...</p></li></ul>
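<p>The escalation logic of the three counters described in Section III.C can be sketched in Python as follows. This is a reading of the prose only, not the paper's pseudo code. The names <code>ControlState</code> and <code>end_of_generation</code> are hypothetical. Resetting each counter once it reaches its threshold and drawing \(\mu_F\) and \(\mu_{CR}\) independently are assumptions that the text does not state. </p>

```python
import random
from dataclasses import dataclass

# Candidate values for reinitialization, as listed in the text.
REINIT_VALUES = (0.167, 0.5, 0.833)


@dataclass
class ControlState:
    mu_f: float = 0.5
    mu_cr: float = 0.5
    msch: int = 0  # 0: current-to-pbest/1, 1: rand/1, 2: rand/2/dir
    nof1: int = 0
    nof2: int = 0
    nof3: int = 0


def end_of_generation(state, any_success, th1, th2, th3, rng):
    """Update the counters after one generation.

    Returns True when the population should be reinitialized.
    """
    if any_success:
        state.nof1 = 0  # a success ends the run of failed generations
        return False
    state.nof1 += 1
    if state.nof1 < th1:
        return False
    # NoF1 reached TH1: reinitialize mu_F and mu_CR.
    state.nof1 = 0  # assumption: the counter restarts
    state.mu_f = rng.choice(REINIT_VALUES)  # assumption: independent draws
    state.mu_cr = rng.choice(REINIT_VALUES)
    state.nof2 += 1
    if state.nof2 < th2:
        return False
    # NoF2 reached TH2: switch to a different mutation strategy.
    state.nof2 = 0  # assumption: the counter restarts
    state.msch = rng.choice([m for m in (0, 1, 2) if m != state.msch])
    state.nof3 += 1
    if state.nof3 < th3:
        return False
    # NoF3 reached TH3: the caller should reinitialize the population.
    state.nof3 = 0  # assumption: the counter restarts
    return True
```

<p>Under these assumptions, with all three thresholds equal to 2, eight consecutive generations without any update are needed before the population is reinitialized. </p>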