[ieee 2010 third international workshop on advanced computational intelligence (iwaci) - suzhou,...

Enhancing Particle Swarm Optimization via Probabilistic Models
Fang Du, Yanjun Li, and Tiejun Wu
Third International Workshop on Advanced Computational Intelligence, August 25-27, 2010, Suzhou, Jiangsu, China. 978-1-4244-6337-4/10/$26.00 ©2010 IEEE


Page 1: [IEEE 2010 Third International Workshop on Advanced Computational Intelligence (IWACI) - Suzhou, China (2010.08.25-2010.08.27)] Third International Workshop on Advanced Computational

Abstract—Particle Swarm Optimization (PSO) has gained much success particularly in continuous optimization. However, like other black box optimizations, PSO lacks an explicit mechanism for exploiting problem specific interactions among variables, which is crucial for discouraging premature convergence. In this paper, we propose two strategies to enhance PSO via probabilistic models. Firstly, we exploit problem structures in PSO to repel premature convergence, where problem specific interactions among variables are represented as a mixture of multivariate normal distributions. Secondly, the authors propose a hybrid constraint handling method for PSO via combining “feasibility and dominance” (FAD) rules with sampling from a mixture of Truncated Multivariate Normal Distributions (mixed TMNDs), where the constraints are restricted to linear inequalities and represented as mixed TMNDs. Results for test problems indicate that the proposed enhancements significantly improve the performance of PSO.

I. INTRODUCTION

Particle swarm optimization (PSO) algorithms have become increasingly popular in the last few years [1]. Owing to its simple concept, easy implementation and quick convergence, PSO has gained much attention and wide application in different fields [2]. However, PSO, like other evolutionary black-box optimizers, lacks a mechanism to explicitly incorporate problem structure, that is, problem-specific interactions among variables. As a result, the canonical PSO, which updates each variable independently, sometimes converges to a local optimum and returns an inferior solution [3]. This problem becomes more acute as the dimension of the problem space increases.

To discourage premature convergence, researchers have designed more sophisticated force generating equations to control particles. Poli et al. [4] explore the possibility of evolving optimal force generating equations using genetic programming. Some researchers resort to gradient information [5,6]. However, this requires continuous and differentiable objective functions, which are unlikely to be available in practice. Others introduce perturbation mechanisms to maintain diversity and thus repel premature convergence [7,8]. Unfortunately, these methods are problem dependent and sometimes require strong domain-specific knowledge, so they are not generic.

When constraints are involved, another key point, besides avoiding premature convergence, is how to deal with the constraints. A straightforward approach is to convert the constrained optimization problem into an unconstrained one by adding a penalty for constraint violations [9]. This method requires careful tuning of the penalty function parameters to alleviate premature convergence. Another approach is to preserve feasible solutions and repair infeasible ones [10]. However, this requires all particles to be initialized inside the feasible space, which is hard to achieve for some problems.

Manuscript received March 19, 2010. Fang Du and Tiejun Wu are with the Department of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, China. Yanjun Li is with the School of Information and Electrical Engineering, Zhejiang University City College, 310015, China.

In this paper, to overcome the problems mentioned above, we propose two strategies to enhance PSO using probabilistic models. First, we present a generic way to overcome premature convergence by exploiting problem structure. Motivated by work on building and using probabilistic models in optimization [11], we model problem-specific interactions among variables as probabilistic models that are learned automatically from data. Specifically, the models are built from promising particles selected from the current PSO population, and a mixture of multivariate normal distributions is employed. Thereafter, one portion of the new solutions is generated from the learnt probabilistic models to exploit problem structure, and the other portion is generated as in conventional PSO. Promising solutions from both portions are merged to form the next generation, which combines the power of implicit and explicit modeling.

Second, we propose a hybrid constraint handling method for PSO that combines “feasibility and dominance” (FAD) rules [8] with sampling additional promising solutions from feasible regions represented as probabilistic models. The constraints we focus on here are restricted to linear inequalities and are represented as a mixture of Truncated Multivariate Normal Distributions (mixed TMNDs) [12]. The mixed TMNDs capture not only problem structure but also constraint information, so that more competitive solutions can be sampled from them.

II. EXPLOIT PROBLEM STRUCTURE IN PSO VIA PROBABILISTIC MODELS

A. Particle swarm optimization

PSO simulates the behavior of a swarm as a simplified social system. Each particle searches for the best position over time, adjusting its position in light of its own experience and the experience of its neighbors, including its current velocity, its current position, and the best previous positions found by itself and by its neighbors. PSO therefore combines local search (through self experience) with global search (through neighboring experience), attempting to balance exploration and exploitation. To compute the velocity of a particle, we use a frequently adopted force generating equation proposed in [13], in which an inertia weight is employed as an improvement:

$$V_{id} = \varpi \times V_{id} + c_1 \times rand_1 \times (P_{best,id} - x_{id}) + c_2 \times rand_2 \times (G_{best,id} - x_{id}) \tag{1}$$

where $V_{id}$ is the velocity of particle $i$ in dimension $d$, and $c_1$ and $c_2$ are positive values called learning factors. $rand_1$ and $rand_2$ are random values within the range $[0,1]$, and $\varpi$ is an inertia weight that controls the influence of the previous velocity on the new velocity and thereby balances global and local search. $P_{best}$ is the best position found so far by the current particle, and $G_{best}$ is the best position found so far by the best particle.

In (1), $rand_1$ and $rand_2$ are drawn independently for each dimension. Consequently, the velocity in each dimension is updated in such a way that interactions among variables are weak. These interactions are modeled only implicitly, in that the overall velocity adjustment prefers to approach the neighborhoods of $G_{best}$ and $P_{best}$. However, this implicit modeling is not adequate and scales poorly to large search spaces. To overcome this problem, we adopt probabilistic models that explicitly represent problem-specific interactions.
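As an illustration, the per-dimension update in (1) can be sketched in Python. This is a minimal sketch rather than the authors' implementation: the function names, the default values of `w`, `c1` and `c2`, and the position update `x + v` (the usual PSO rule, which the text does not spell out) are assumptions.

```python
import random


def update_velocity(v, x, p_best, g_best, w=0.4, c1=2.0, c2=2.0, rng=random):
    """Apply Eq. (1) to one particle; all vector arguments are per-dimension lists."""
    new_v = []
    for d in range(len(v)):
        r1 = rng.random()  # rand_1, redrawn for every dimension
        r2 = rng.random()  # rand_2, redrawn for every dimension
        new_v.append(
            w * v[d]
            + c1 * r1 * (p_best[d] - x[d])
            + c2 * r2 * (g_best[d] - x[d])
        )
    return new_v


def update_position(x, v):
    """Assumed standard PSO move: add the new velocity to the position."""
    return [xd + vd for xd, vd in zip(x, v)]
```

Because `r1` and `r2` are redrawn for every dimension, nothing in this update couples one dimension to another, which is the limitation the explicit probabilistic models are meant to address.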

B. Representing problem structures with probabilistic models

Recently, a number of Estimation of Distribution Algorithms (EDAs) have been proposed that guide the exploration of the search space by building probabilistic models of the promising solutions found so far [11, 14]. The core of an EDA lies in its capability to model the problem-specific interactions among variables $X$ with a joint distribution $P(X)$. New promising solutions are then drawn from $P(X)$ by simulation. In practice, the joint distribution is approximated and factorized into a product of local distributions in order to reduce the sample complexity and the number of parameters to learn. In discrete spaces, Bayesian networks are frequently adopted as the factorization. In continuous domains, Gaussian networks, histograms and multivariate normal distributions are most commonly used [15]. We focus on continuous optimization and use a mixture of multivariate normal distributions in this paper because of its computational efficiency.

Let $X = \{X_1, X_2, \ldots, X_n\}$ be a set of $n$ continuous variables, and let $x = \{x_1, x_2, \ldots, x_n\}$ be the values of $X$. Assume that the joint probability density function of $X$ is a multivariate normal distribution $N(\mu, \Sigma)$, that is

$$f(x) = (2\pi)^{-\frac{n}{2}} |\Sigma|^{-\frac{1}{2}} \exp\left[-\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right] \tag{2}$$

where $\mu$ is an $n$-dimensional mean vector, $\Sigma$ is an $n \times n$ covariance matrix, and $|\Sigma|$ is the determinant of $\Sigma$. Well-known maximum likelihood estimators exist for the estimation of $\mu$ and $\Sigma$ from sample data [16].
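For reference, the maximum likelihood estimators of a multivariate normal are the sample mean and the sample covariance with divisor equal to the number of samples. The following pure-Python sketch illustrates them; the function name `mle_mean_cov` is hypothetical and not taken from the paper.

```python
def mle_mean_cov(samples):
    """Return the maximum likelihood mean vector and covariance matrix.

    `samples` is a list of equal-length lists, one per selected individual.
    The covariance uses divisor m (the ML estimate), not m - 1.
    """
    m = len(samples)
    n = len(samples[0])
    mean = [sum(s[j] for s in samples) / m for j in range(n)]
    cov = [
        [sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / m
         for b in range(n)]
        for a in range(n)
    ]
    return mean, cov
```

Note that the estimated covariance is only positive definite when the samples span all $n$ dimensions, so in practice more than $n$ distinct individuals are needed per model.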

Simulation from the multivariate normal distribution is based on the Cholesky decomposition. Since the covariance matrix $\Sigma$ is symmetric and positive definite, it can be decomposed using a unique lower-triangular matrix $L$ with $LL^T = \Sigma$. Then, if $Z_1, Z_2, \ldots, Z_n$ are independently simulated from $N(0,1)$, we can generate $X \sim N(\mu, \Sigma)$ by

$$X = \mu + LZ \tag{3}$$

If a newly sampled solution $x$ violates the lower or upper bounds, it is adjusted by

$$x_i = \min(x_i, ub_i), \quad x_i = \max(x_i, lb_i) \tag{4}$$

where $lb_i$ and $ub_i$ are the lower and upper bounds, respectively.
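The sampling step in (3) followed by the bound adjustment in (4) can be sketched as follows. This is a minimal standard-library sketch under the assumption that `sigma` is symmetric positive definite; the names `cholesky` and `sample_mvn` are illustrative only.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma (sigma must be SPD)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def sample_mvn(mu, sigma, lb, ub, rng=random):
    """Draw x = mu + L z as in (3), then clip to [lb, ub] as in (4)."""
    L = cholesky(sigma)
    n = len(mu)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent N(0, 1) draws
    x = [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return [max(min(x[i], ub[i]), lb[i]) for i in range(n)]
```

In a real run the factor `L` would be computed once per model and reused for all samples; it is recomputed here only to keep the sketch short.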

A single multivariate normal distribution is suited to a single-peak landscape. For complex nonlinear landscapes such as multimodal functions, a mixture of multivariate normal distributions is usually employed: the mixture models nonlinear dependencies as a combination of piecewise linear interaction models obtained by breaking up the nonlinearity. A mixed multivariate normal distribution is a weighted sum of $k > 1$ multivariate normal distributions, formulated as

$$f(x) = \sum_{i=1}^{k} \beta_i f_{\mu_i, \Sigma_i}(x) \tag{5}$$

such that $\forall i \in (1, \ldots, k)$: $\beta_i \ge 0$ and $\sum_{i=1}^{k} \beta_i = 1$. The $\beta_i$ are called mixing coefficients, and $f_{\mu_i, \Sigma_i}(x)$ is defined as in (2). The mixed multivariate normal distribution can be estimated by clustering. For each cluster $C_i$, a multivariate normal distribution $f_{\mu_i, \Sigma_i}(x)$ is built, and the mixing coefficient $\beta_i$ is set proportional to the average fitness of the cluster. To sample from the mixture, one cluster is first selected with probability proportional to its coefficient $\beta_i$ using roulette wheel selection; the problem then reduces to sampling from a single multivariate normal distribution according to (3).
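The two-stage mixture sampling described above (compute the mixing coefficients, then pick a cluster by roulette wheel) can be sketched as follows. The sketch assumes that fitness values are positive and that larger is better; the paper does not say how fitness is scaled, so both points are assumptions, as are the function names.

```python
import random


def mixing_coefficients(cluster_fitness):
    """Return beta_i proportional to the average fitness of cluster i.

    `cluster_fitness` holds one list of (assumed positive) fitness values
    per cluster.
    """
    averages = [sum(f) / len(f) for f in cluster_fitness]
    total = sum(averages)
    return [a / total for a in averages]


def roulette_select(betas, rng=random):
    """Return cluster index i with probability betas[i]."""
    r = rng.random()
    cumulative = 0.0
    for i, b in enumerate(betas):
        cumulative += b
        if r < cumulative:
            return i
    return len(betas) - 1  # guard against floating-point round-off
```

Once a cluster index is chosen, a sample is drawn from that cluster's own normal distribution using (3).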

C. Structure enhanced particle swarm optimization

We propose an algorithm called Structure Enhanced Particle Swarm Optimization (SEPSO), in which problem structure is exploited by building and using probabilistic models. SEPSO is a standard PSO algorithm augmented with structure enhancement. The idea is that, besides the new solutions generated from the implicit interactions modeled by PSO in (1), another portion of new solutions is sampled from the learnt probabilistic models, in which interactions among variables are explicitly represented. The sampled solutions replace the inferior ones generated by standard flying in PSO. Consequently, SEPSO combines the power of implicit and explicit modeling. To learn the mixed multivariate normal distribution, $M$ promising individuals are first selected from the current PSO population. These individuals are then clustered into $k$ clusters using the BEND leader algorithm [15], chosen for its speed and flexibility. A multivariate normal distribution is built for each of the $k$ clusters, with the mean and covariance matrix calculated by maximum likelihood estimators. The new offspring are generated using (3). In our algorithm, the structure enhancement is triggered with probability $P_{se}$.

The above rationale results in the following algorithm:

1. Randomly initialize particle positions and velocities.
2. Calculate $P_{best}$ for each particle and $G_{best}$.
3. Compute the velocity by (1) and update particle positions.
4. If $rand < P_{se}$ then
   (a) Select $M$ promising individuals from the PSO population using truncation selection [15], where $M$ is smaller than the PSO population size.
   (b) Cluster the selected individuals into $k$ clusters and calculate the mean $\mu_i$ and covariance matrix $\Sigma_i$ for each cluster.
   (c) Generate $N_i$ new individuals (with $\sum_{i=1}^{k} N_i = N$ and $N > M$) from the multivariate normal distribution $N(\mu_i, \Sigma_i)$ using the Cholesky decomposition method, where $N_i$ is proportional to the average fitness of cluster $i$.
   (d) Replace the $N$ worst particles generated in Step 3 with the individuals newly sampled in Step 4(c).
5. End if
6. If a termination criterion is not met, go to Step 2.

The procedures in Step 4 make up the structure enhancement part.
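The selection and replacement parts of Step 4, that is, 4(a) and 4(d), can be sketched as follows. This is only an illustration under the assumption that larger fitness is better (as in a maximization problem); the function names are hypothetical, and model building and sampling are left out.

```python
def truncation_select(population, fitness, m):
    """Step 4(a): keep the m fittest individuals (larger fitness is better)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    return [population[i] for i in order[:m]]


def replace_worst(population, fitness, newcomers):
    """Step 4(d): overwrite the len(newcomers) least fit particles."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    result = list(population)
    for idx, new in zip(order, newcomers):
        result[idx] = new
    return result
```

For example, with fitness values `[5, 1, 9, 3]` and two newcomers, the particles at positions 1 and 3 are the ones replaced.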

III. HANDLE CONSTRAINTS IN PSO VIA FAD RULES AND SAMPLING

In this section the SEPSO algorithm is extended to constrained optimization problems. Many methods have been proposed to handle constraints. Koziel and Michalewicz [17] grouped them into four categories: methods based on preserving the feasibility of solutions; methods based on penalty functions; methods that make a clear distinction between feasible and infeasible solutions; and other hybrid methods. Our algorithm belongs to the last category: a portion of feasible solutions is generated by sampling from mixed TMNDs, and these solutions are injected into the portion generated by the flying mechanism of PSO. The combined set of solutions is handled by FAD rules. We focus on constraints that are restricted to linear inequalities and represented as mixed TMNDs.

A. Representing linear inequality constraints with truncated multivariate normal distributions

Many variables in real-world problems have lower and upper bounds, and the feasible regions of the search space are often constrained by linear inequalities [18]. In the previous section, problem structure was modeled by multivariate normal distributions without considering constraints. To take linear inequality restrictions into account, problem structure is modeled as a TMND,

$$x \sim N(\mu, \Sigma), \quad a \le Dx \le b \tag{6}$$

where $D$ is an $n \times n$ matrix of rank $n$, and the individual elements of $a$ and $b$ must be real values. The truncation reflects the constrained nature of the search space, while the multivariate normal part represents the problem structure. All solutions that are feasible under the linear inequality constraints have positive probability density; infeasible solutions have a probability density of zero. Therefore, the model can be formulated as

$$f(x) = \begin{cases} (2\pi)^{-\frac{n}{2}} |\Sigma|^{-\frac{1}{2}} \cdot \exp\left[-\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right], & x \in S \\ 0, & x \notin S \end{cases} \tag{7}$$

where $S = \{x \mid a \le Dx \le b\}$.

In practice, the general nonlinear programming problem is formulated as

$$\min f(x)$$

subject to

$$g_i(x) \le 0, \quad i = 1, \ldots, m \tag{8}$$

$$lb_i \le x_i \le ub_i, \quad i = 1, \ldots, n \tag{9}$$

where $x$ is the vector of solutions and $m$ is the number of inequality constraints (equality constraints are also transformed into inequalities). In this paper, the $g_i(x)$ are restricted to be linear. For problems with nonlinear inequalities, we may approximate them with piecewise linear constraints; this is left for future research. To convert the constraints $g_i(x) \le 0$ in (8) into the form $a_i \le g_i(x) \le b_i$ required in (6), we use a straightforward method: let $a_i = \min g_i(x)$ and $b_i = \min(0, \max g_i(x))$, where $x$ is subject to (9).

For complex landscapes, a mixed TMNDs model can be employed, described as

$$f(x) = \sum_{i=1}^{k} \beta_i f_{\mu_i, \Sigma_i}(x)$$

where $f_{\mu_i, \Sigma_i}(x)$ is defined in (7), and the $\beta_i$ are mixing coefficients such that $\forall i \in (1, \ldots, k)$: $\beta_i \ge 0$ and $\sum_{i=1}^{k} \beta_i = 1$.
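For a linear constraint, the conversion rule $a_i = \min g_i(x)$, $b_i = \min(0, \max g_i(x))$ over the box (9) has a closed form, because a linear function attains its extremes at the bounds of each variable. The sketch below assumes $g(x) = c^T x + d$ with coefficient list `coeffs` and constant `const`; these names and the helper functions are illustrative, not from the paper.

```python
def linear_range(coeffs, const, lb, ub):
    """Minimum and maximum of g(x) = coeffs . x + const over lb <= x <= ub."""
    lo = const
    hi = const
    for c, l, u in zip(coeffs, lb, ub):
        # Each term c * x_j is extreme at one of the two bounds of x_j.
        lo += min(c * l, c * u)
        hi += max(c * l, c * u)
    return lo, hi


def constraint_interval(coeffs, const, lb, ub):
    """Return (a_i, b_i) with a_i = min g and b_i = min(0, max g)."""
    lo, hi = linear_range(coeffs, const, lb, ub)
    return lo, min(0.0, hi)
```

For instance, $g(x) = -8x_1 + x_{10}$ with $0 \le x_1 \le 1$ and $0 \le x_{10} \le 100$ ranges over $[-8, 100]$, so the rule yields $a = -8$ and $b = 0$.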

B. Sampling from truncated multivariate normal distributions

Sampling from a TMND (6) is much more difficult than sampling from its non-truncated counterpart, and many different methods have been developed; an overview can be found in [19]. Naive rejection sampling [19] from $N(\mu, \Sigma)$ can be applied directly to (6), but it is impractical in general because the ratio of rejected to accepted solutions is astronomical for many commonly arising problems [12]. In this paper, efficient sampling from a TMND is achieved by sampling from an equivalent distribution to which the Cholesky decomposition can also be applied.

The problem of sampling from (6) is equivalent to sampling from an $n$-variate normal distribution subject to linear restrictions,

$$z \sim N(0, T), \quad \alpha \le z \le \beta \tag{10}$$

where $T = D \Sigma D'$, $\alpha = a - D\mu$ and $\beta = b - D\mu$; we then take $x = \mu + D^{-1} z$.

For computational efficiency, sampling from (10) is also based on the Cholesky decomposition, although more sophisticated procedures such as the Gibbs sampler in [12] may be adopted. Sampling from mixed TMNDs is likewise reduced to sampling from a single TMND, as in the previous sections.
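The equivalence between (6) and (10) follows from the change of variables $z = D(x - \mu)$. The short derivation below is added for clarity and uses only the symbols already defined; it relies on $D$ being invertible, which holds because $D$ has rank $n$.

```latex
\begin{aligned}
z &= D(x - \mu)
  \;\Rightarrow\; \mathrm{E}[z] = 0, \quad
  \mathrm{Cov}[z] = D \Sigma D' = T, \\
a \le Dx \le b
  &\iff a - D\mu \le D(x - \mu) \le b - D\mu
  \iff \alpha \le z \le \beta, \\
x &= \mu + D^{-1} z .
\end{aligned}
```

The linear restrictions on $x$ thus become simple box restrictions on $z$, which is what makes the transformed problem easier to sample.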


C. The hybrid constraint handling method

In our algorithm, infeasible solutions are allowed to evolve. These infeasible solutions come from two sources. First, particles generated by the flying mechanism of PSO are likely to be infeasible. Second, the linear constraints in (6) may result from an approximation, e.g. of some nonlinear constraints; consequently, although solutions sampled from (6) are close to a feasible region, they may still be infeasible. Our proposal is to combine a constraint handling mechanism (FAD rules) with a sampling procedure. Although several approaches have been proposed to handle constraints, recent results indicate that the simple FAD rules handle them very well [8]. The FAD rules are applied when selecting a leader [7]: 1) of two feasible particles, the one with the better fitness value wins; 2) if both particles are infeasible, the particle with the lowest total constraint violation (normalized with respect to the largest violation of each constraint achieved by any particle in the current population) wins; and 3) of a feasible and an infeasible particle, the feasible one wins. The idea is to choose as leader a particle that, even if infeasible, is closer to the feasible region. In addition, a small number of solutions are generated using $x = \mu + D^{-1} z$, where $z$ is sampled from a uniform distribution $U(\alpha, \beta)$. This can be thought of as a constraint-oriented perturbation mechanism for maintaining population diversity.
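The three FAD rules can be expressed as a pairwise comparison. The sketch below assumes a minimization problem with constraints of the form $g_i(x) \le 0$, so that only positive values of $g_i$ count as violations; the function names and the exact normalization code are illustrative assumptions.

```python
def total_violation(g_values, max_violation):
    """Sum of constraint violations, each normalized by the largest violation
    of that constraint in the current population (`max_violation`)."""
    total = 0.0
    for g, g_max in zip(g_values, max_violation):
        if g > 0.0 and g_max > 0.0:
            total += g / g_max
    return total


def fad_better(fit_a, viol_a, fit_b, viol_b):
    """Return True if particle a beats particle b under the FAD rules.

    `fit_*` are objective values (smaller is better) and `viol_*` are
    normalized total violations (0.0 means feasible).
    """
    feasible_a = viol_a == 0.0
    feasible_b = viol_b == 0.0
    if feasible_a and feasible_b:
        return fit_a < fit_b    # rule 1: better fitness wins
    if not feasible_a and not feasible_b:
        return viol_a < viol_b  # rule 2: smaller violation wins
    return feasible_a           # rule 3: the feasible particle wins
```

Under rule 3 a feasible particle beats an infeasible one regardless of their objective values, which is why a comparison such as `fad_better(100.0, 0.0, -5.0, 0.1)` returns `True`.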

The overall hybrid algorithm is outlined as follows:

1. Randomly initialize particle positions and velocities.
2. Calculate $P_{best}$ for each particle and $G_{best}$ using FAD rules.
3. Compute the velocity by (1) and update particle positions.
4. If $rand < P_{se}$ then
   (a) Select $M$ promising individuals from the PSO population using truncation selection, where $M$ is smaller than the PSO population size.
   (b) Cluster the selected individuals into $k$ clusters and calculate the mean $\mu_i$ and covariance matrix $\Sigma_i$ for each cluster.
   (c) Sample $J$ solutions from the uniform distribution $U(\alpha, \beta)$ as a perturbation.
   (d) Generate $N_i$ new individuals (with $\sum_{i=1}^{k} N_i = N - J$ and $N > M$) from the truncated multivariate normal distribution $N(\mu_i, \Sigma_i)$, $a \le Dx \le b$, as described in Section III.B, where $N_i$ is proportional to the average fitness of cluster $i$.
   (e) Replace the $N$ worst particles generated in Step 3 with the solutions newly sampled in Steps 4(c) and 4(d). Comparisons are carried out under FAD rules.
5. End if
6. If a termination criterion is not met, go to Step 2.
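One way to realize the sample budget in Step 4, where $J$ solutions come from the perturbation and the remaining $N - J$ are split across clusters in proportion to average fitness, is sketched below. The paper does not say how fractional shares are rounded or how fitness is scaled, so the largest-remainder rounding and the assumption of positive, larger-is-better fitness values are choices made here for illustration only.

```python
def allocate_samples(avg_fitness, n_total, n_perturb):
    """Split n_total - n_perturb samples across clusters by average fitness.

    Returns a list of integer counts N_i whose sum is n_total - n_perturb.
    """
    budget = n_total - n_perturb
    total = sum(avg_fitness)
    shares = [budget * f / total for f in avg_fitness]
    counts = [int(s) for s in shares]
    # Hand the leftover samples to the clusters with the largest fractions.
    leftovers = budget - sum(counts)
    by_fraction = sorted(
        range(len(shares)), key=lambda i: shares[i] - counts[i], reverse=True
    )
    for i in by_fraction[:leftovers]:
        counts[i] += 1
    return counts
```

With $N = 120$ and $J = 12$, two clusters whose average fitness values are in the ratio 1:3 would receive 27 and 81 samples respectively.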

IV. EXPERIMENTS AND DISCUSSION

In this section the performance of the SEPSO algorithm is investigated by comparing it with that of some frequently adopted PSO algorithms. SEPSO is applied to the Summation Cancellation (SC) problem, in which there are strong interactions among variables, and to the linear inequality constrained problem G01, which is commonly used to evaluate the performance of constrained optimization algorithms [7, 8].

A. Experiments on the SC problem

The SC problem was proposed by Baluja and Caruana [20] as

$$\max f(x) = \frac{100}{10^{-5} + \sum_{i=0}^{n-1} |\gamma_i|} \tag{11}$$

where $\gamma_0 = x_0$, $\gamma_i = x_i + \gamma_{i-1}$, $x_i \in [-3, 3]$ and $n$ is the problem dimension. For an intuitive impression of the characteristics of the problem, a two-dimensional surface plot is provided in Fig. 1.

This optimization problem has multivariate linear interactions among the problem variables. Thus, to solve it effectively and efficiently, an algorithm must be capable of modeling linear dependencies among the variables. Furthermore, the multivariate interaction is very strong, since each $\gamma_i$ in the problem definition is defined in terms of all $x_j$ with $j \le i$. Finally, the optimum is located at a very sharp peak, which implies that the optimization algorithm needs high precision and must be able to prevent premature convergence in order to reach the global optimum.
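To make the running-sum structure concrete, the SC objective can be written as a few lines of Python. The sketch assumes the form $f(x) = 100 / (10^{-5} + \sum_i |\gamma_i|)$ with $\gamma_0 = x_0$ and $\gamma_i = x_i + \gamma_{i-1}$; the function name is illustrative.

```python
def summation_cancellation(x):
    """Evaluate the SC objective for a list of values x_0, ..., x_{n-1}."""
    gamma = 0.0
    total = 0.0
    for xi in x:
        gamma += xi  # gamma_i = x_i + gamma_{i-1}, starting from gamma_0 = x_0
        total += abs(gamma)
    return 100.0 / (1e-5 + total)
```

Every running sum must cancel to zero for the denominator to reach its minimum of $10^{-5}$, so the maximum value of about $10^7$ is attained at the all-zero vector, and even a small deviation in one early variable lowers the value sharply because it propagates into every later $\gamma_i$.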

Fig. 1. Two-dimensional surface plot for the SC problem. The function is shown on a logarithmic scale for a better impression of the problem features.

Two cases of the SC problem (n=10 and n=20) are investigated in this experiment. For the purpose of comparison, the parameter settings for (1) are the same in SEPSO and PSO. $c_1$ and $c_2$ are randomly generated in the range [1.5, 2.5] [7]. The inertia weight $\varpi$ takes a value randomly generated within the range [0.1, 0.5]. All results are averaged over 100 runs. For simplicity, a single-peak multivariate normal distribution is used in both cases. In the 10-dimensional case, a population size of 100 is set for both SEPSO and PSO. In SEPSO, a statistical model is built from the 40 best solutions, and 50 offspring are generated from this model using the sampling algorithm. The probability of structure enhancement $P_{se}$ is set to 0.3. In the 20-dimensional case, the population size is increased to 200, and 100 new solutions are sampled from the learnt model.

Fig. 2 shows the average function value of the current best solution against the number of function evaluations for the two sizes of the SC problem. We use the number of function evaluations to estimate the computational effort of the different algorithms because the running time of each algorithm is dominated by the time spent evaluating the objective function. The plots in Fig. 2 show that the standard PSO fails to solve the SC problem in both the 10-dimensional and the 20-dimensional cases. The reason is that the standard PSO lacks a mechanism to exploit problem-specific interactions among variables and consequently converges prematurely. We also tried combining PSO with the finely designed turbulence operator adopted in [7], but the turbulence did not help. By contrast, SEPSO successfully solves the SC problems and scales gracefully to higher dimensions. The success of SEPSO stems mainly from the structure enhancement part of the algorithm: SEPSO correctly models the interactions among variables and exploits them via sampling to overcome premature convergence.

B. Experiments on the G01 problem

The linear inequality constrained problem G01 is a highly constrained problem with high dimensionality, defined as follows:

$$\min f(x) = 5 \sum_{i=1}^{4} x_i - 5 \sum_{i=1}^{4} x_i^2 - \sum_{i=5}^{13} x_i$$

subject to:

$$\begin{aligned}
g_1(x) &= 2x_1 + 2x_2 + x_{10} + x_{11} - 10 \le 0 \\
g_2(x) &= 2x_1 + 2x_3 + x_{10} + x_{12} - 10 \le 0 \\
g_3(x) &= 2x_2 + 2x_3 + x_{11} + x_{12} - 10 \le 0 \\
g_4(x) &= -8x_1 + x_{10} \le 0 \\
g_5(x) &= -8x_2 + x_{11} \le 0 \\
g_6(x) &= -8x_3 + x_{12} \le 0 \\
g_7(x) &= -2x_4 - x_5 + x_{10} \le 0 \\
g_8(x) &= -2x_6 - x_7 + x_{11} \le 0 \\
g_9(x) &= -2x_8 - x_9 + x_{12} \le 0
\end{aligned}$$

where the bounds are $0 \le x_i \le 1$ $(i = 1, \ldots, 9)$, $0 \le x_i \le 100$ $(i = 10, 11, 12)$ and $0 \le x_{13} \le 1$. The global optimum is at $x^* = (1,1,1,1,1,1,1,1,1,3,3,3,1)$, where $f(x^*) = -15$. Constraints $g_1, g_2, g_3, g_7, g_8$ and $g_9$ are active at $x^*$, since each evaluates to zero there, whereas $g_4$, $g_5$ and $g_6$ evaluate to $-5$.
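The objective value at the optimum and the set of constraints that hold with equality there can be checked directly. The sketch below encodes G01 as listed above, using zero-based Python indexing for the 13 variables; the function and constant names are illustrative.

```python
def g01_objective(x):
    """f(x) = 5*sum(x_1..x_4) - 5*sum(x_1^2..x_4^2) - sum(x_5..x_13)."""
    return 5.0 * sum(x[0:4]) - 5.0 * sum(v * v for v in x[0:4]) - sum(x[4:13])


def g01_constraints(x):
    """Return [g_1(x), ..., g_9(x)]; each must be <= 0 for feasibility."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, _x13 = x
    return [
        2 * x1 + 2 * x2 + x10 + x11 - 10,
        2 * x1 + 2 * x3 + x10 + x12 - 10,
        2 * x2 + 2 * x3 + x11 + x12 - 10,
        -8 * x1 + x10,
        -8 * x2 + x11,
        -8 * x3 + x12,
        -2 * x4 - x5 + x10,
        -2 * x6 - x7 + x11,
        -2 * x8 - x9 + x12,
    ]


X_STAR = [1.0] * 9 + [3.0] * 3 + [1.0]

# Constraints that evaluate exactly to zero at the optimum (1-based indices).
ACTIVE_AT_OPTIMUM = [
    i + 1 for i, g in enumerate(g01_constraints(X_STAR)) if g == 0
]
```

Evaluating `g01_objective(X_STAR)` gives $-15$, and with the constraints as listed the ones that evaluate to zero at $x^*$ are $g_1, g_2, g_3, g_7, g_8$ and $g_9$.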

Fig. 3 shows the average function value of the current best solution against the number of function evaluations for the G01 problem. In SEPSO, the population size is set to 200, and 60 promising solutions are selected to build the mixed TMNDs. The BEND leader algorithm (with a threshold of 0.3) is used for clustering because of its speed and flexibility. 120 new solutions are sampled from the mixed TMNDs; ten percent of them are sampled by the constraint-oriented perturbation mechanism. Our experiments indicate that the standard PSO using FAD rules fails to find the global optimum and very quickly gets trapped in a local minimum.

Fig. 2. Average function values of the current best solutions over the number of function evaluations, with the function value shown on a logarithmic scale. Results are presented for the SC function of size n=10 (Fig. 2(a)) and n=20 (Fig. 2(b)); for both versions of SC, the global maximum is 710. The results show that SEPSO outperforms a standard PSO, independent of the problem size.

We choose a state-of-the-art PSO [7], which employs FAD rules and a well-designed turbulence operator, for comparison. The plot in Fig. 3 shows that SEPSO finds the global minimum using far fewer function evaluations than the PSO in [7]. In standard PSO, once a feasible solution is found, all other infeasible solutions fly towards it, resulting in quick premature convergence. Thus a turbulence operator is strongly required in standard PSO. However, the turbulence in [7] is not constraint-oriented, i.e., the turbulence operator does not consider constraint information at all. By contrast, the constraint handling mechanism in SEPSO is more powerful and effective. The mixed TMNDs capture not only problem structures but also constraint information, and therefore more promising solutions are sampled afterwards. Consequently, SEPSO outperforms the standard PSO with a turbulence operator.
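The effect of building constraint information into the sampling distribution can be sketched as follows. This is a minimal illustration that draws from a single multivariate normal truncated by linear inequalities $A\mathbf{x} \le \mathbf{b}$ using plain rejection sampling. Rejection sampling is chosen here only for brevity and is not necessarily the sampler used in SEPSO; the names `cholesky` and `sample_truncated_mvn` and the toy mean, covariance and constraints are assumptions made for the example.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_truncated_mvn(mean, cov, A, b, n_samples, rng, max_tries=100000):
    """Draw from N(mean, cov) and keep only draws that satisfy A x <= b."""
    L = cholesky(cov)
    dim = len(mean)
    accepted = []
    tries = 0
    while len(accepted) < n_samples and tries < max_tries:
        tries += 1
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # x = mean + L z gives a draw with the requested covariance.
        x = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        feasible = all(
            sum(a * v for a, v in zip(row, x)) <= b_i for row, b_i in zip(A, b)
        )
        if feasible:
            accepted.append(x)
    return accepted


# Toy example: two correlated variables with x1 + x2 <= 2, x1 >= 0, x2 >= 0.
mean = [1.0, 1.0]
cov = [[1.0, 0.5], [0.5, 1.0]]
A = [[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b = [2.0, 0.0, 0.0]
samples = sample_truncated_mvn(mean, cov, A, b, n_samples=120, rng=random.Random(0))
```

Every returned sample is feasible by construction, whereas an operator that perturbs particles without regard to the constraints offers no such guarantee. For a mixture, one would first pick a component according to its weight and then sample from that component in the same way. Rejection sampling becomes inefficient when the feasible region carries little probability mass, which is one reason dedicated samplers for truncated normal distributions exist.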

V. CONCLUSIONS

This paper has presented two strategies to enhance PSO for continuous optimization. Both enhancements are achieved by employing probabilistic models. First, to discourage premature convergence of PSO, we suggest exploiting problem structures by building and using probabilistic models. We formulate this idea as the SEPSO algorithm, which combines the power of both implicit and explicit modeling. Preliminary results on the SC problems show that SEPSO successfully repels premature convergence and scales gracefully to high dimensions. Second, we extend SEPSO to constrained optimization, where linear inequality constraints and problem structures are modeled as mixed TMNDs. Preliminary results on a highly constrained optimization problem show that the proposed method outperforms some state-of-the-art versions of PSO using far fewer function evaluations.

Much work remains to be done. In future work, the algorithm will be tested on a larger variety of problems. The work presented in this paper focuses on linear inequality constraints. Optimization problems with nonlinear constraints may be handled by approximating the nonlinear constraints with piecewise-linear constraints, which we leave to future research. Nonetheless, the two proposed strategies are shown to be promising enhancements.

REFERENCES

[1] J. Kennedy, R. C. Eberhart, and Y. Shi, Swarm Intelligence. Morgan Kaufmann, UK, 2001.

[2] R. C. Eberhart and Y. Shi, “Particle swarm optimization: developments, applications and resources,” in Proc. IEEE Congress on Evolutionary Computation, Seoul, Korea, pp. 81-86, 2001.

[3] P. J. Angeline, “Evolutionary optimization versus particle swarm optimization: philosophy and performance differences,” in Proc. 7th International Conference on Evolutionary Programming VII, pp. 601-610, 1998.

[4] R. Poli, W. B. Langdon, and O. Holland, “Extending particle swarm optimization via genetic programming,” in Proc. European Conference on Genetic Programming, Lausanne, Switzerland, pp. 291-300, 2005.

[5] T. A. Victoire and A. E. Jeyakumar, “Hybrid PSO–SQP for economic dispatch with valve-point effect,” Electric Power Systems Research, vol. 71, no. 1, pp. 51-59, 2004.

[6] Q. J. Guo, H. B. Yu, and A. D. Xu, “A hybrid PSO-GD based intelligent method for machine diagnosis,” Digital Signal Processing, vol. 16, no. 4, pp. 402-418, 2006.

[7] G. T. Pulido and C. A. C. Coello, “A constraint handling mechanism for particle swarm optimization,” in Proc. IEEE Congress on Evolutionary Computation, Portland, Oregon, pp. 1396-1403, 2004.

[8] A. E. M. Zavala and A. H. Aguirre, “Constrained optimization via particle evolutionary swarm optimization algorithm,” in Proc. Genetic and Evolutionary Computation Conference, Washington, USA, pp. 209-216, 2005.

[9] K. E. Parsopoulos and M. N. Vrahatis, “Particle swarm optimization method for constrained optimization problems,” in Proc. Euro-International Symposium on Computational Intelligence, Slovakia, 2002.

[10] X. Hu and R. C. Eberhart, “Solving constrained nonlinear optimization problems with particle swarm optimization,” in Proc. World Multiconference on Systemics, Cybernetics and Informatics, Orlando, USA, 2002.

[11] M. Pelikan, D. E. Goldberg, and F. G. Lobo, “A survey of optimization by building and using probabilistic models,” Computational Optimization and Applications, vol. 21, no. 1, pp. 5-20, 2002.

[12] J. Geweke, “Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints and the evaluation of constraint probabilities,” Tech. Rep., University of Minnesota, Dept. of Economics, 1991.

[13] Y. Shi and R. C. Eberhart, “A modified particle swarm optimizer,” in Proc. IEEE Congress on Evolutionary Computation, Piscataway, NJ, pp. 69-73, 1998.

[14] P. Larranaga and J. A. Lozano, Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation, Kluwer Academic Publishers, 2001.

[15] P. A. N. Bosman, “Design and application of iterated density estimation evolutionary algorithms,” Ph. D. Thesis, Utrecht University, TB Utrecht, The Netherlands, 2003.

[16] S. Kotz, N. Balakrishnan, and N. L. Johnson, Continuous Multivariate Distributions, Vol. 2 of Wiley Series in Probability and Statistics, John Wiley and Sons, 2000.

[17] S. Koziel and Z. Michalewicz, “Evolutionary algorithms, homomorphous mappings, and constrained parameter optimization,” Evolutionary Computation, vol. 7, no. 1, pp. 19-44, 1999.

[18] Z. Michalewicz, K. Deb, M. Schmidt, et al., “Towards understanding constraint-handling methods in evolutionary algorithms,” in Proc. IEEE Congress on Evolutionary Computation, Washington, USA, pp. 581-588, 1999.

[19] V. A. Hajivassiliou, D. L. McFadden, and P. A. Ruud, “Simulation of multivariate normal orthant probabilities: Methods and programs,” Tech. Rep., M.I.T., 1990.

[20] S. Baluja and R. Caruana, “Removing the genetics from the standard genetic algorithm,” in Proc. 12th International Conference on Machine Learning, Madison, Wisconsin, pp. 38-46, 1995.

Fig. 3. The plot shows the average function values of the current best solutions over the number of function evaluations. Results are presented for the G01 problem, where the global minimum is -15. The results show that SEPSO outperforms a standard PSO with a turbulence operator.
