A Neural Network Algorithm for Solving Generalized Convex Programming
Jingli Yang and Tingsong Du
Jingli Yang and Tingsong Du are with the Institute of Nonlinear and Complex Systems, China Three Gorges University, Yichang, Hubei, 443002, China (jennyyang [email protected] (J. L. Yang)).
Abstract— In this paper, a novel neural network algorithm is proposed based on the 0.618 method. This neural network can solve generalized convex programming subject to linear constraints. Finally, examples are provided to show the applicability of the proposed neural network algorithm.
I. INTRODUCTION
OPTIMIZATION is an important part of operations research and has many applications in natural science, social science, production practice, engineering design, and modern management [1]. As an important branch of operations research, mathematical programming has had a particularly profound influence. For nearly three decades, convexity theory has been widely applied in optimization [2], [3]. In many cases, however, convexity of the objective function is only a sufficient condition, not a necessary one, for the main results of mathematical programming to hold. Moreover, many important properties discussed in convex analysis require only that the level sets of the function be convex, that is, that the function be generalized convex [4], [5]. Generalized convexity has therefore become a new direction of development in mathematical programming. In 1970, B. De Finetti [6], [7] first studied functions whose level sets are convex sets; he found that this class of functions includes all convex functions and some nonconvex functions. W. Fenchel [8] was the first scholar to call such functions quasiconvex, and he studied their properties systematically. M. Slater extended the Kuhn-Tucker saddle-point theory and applied it to generalized convex programming [6], [7], and K. J. Arrow and A. C. Enthoven [9] studied quasiconcave functions and applied them in economics. Since then, more and more scholars have worked on generalized convexity and obtained a series of important results, among them J. A. Ferland, J. P. Crouzeix [10], and X. M. Yang [11]-[13].
It is well known that a neural network is a parallel, distributed information-processing structure consisting of processing elements. Each processing element has a single output connection that branches into as many collateral connections as desired, which gives neural networks strong learning ability and the capacity for large-scale parallel computing. Neural networks provide a new approach to optimization problems and have received considerable attention [14], [15]. Compared with traditional numerical methods, the neural network approach can solve optimization problems with much shorter running time [16]. In 1985, Hopfield and Tank first proposed a neural network for solving the Traveling Salesman Problem (TSP) [17]. Since then, many neural network models have been proposed and applied to linear and nonlinear programming, although in most of them the objective function is required to be convex [18]-[20]. In this paper, by employing the 0.618 method and the characteristics of a common network topology, we present a neural network algorithm for solving generalized convex programming with linear constraints.
The organization of this paper is as follows. In Section II, we introduce the theoretical foundations of convex analysis and the 0.618 method. In Section III, we present the neural network based on the 0.618 method. The neural network algorithm is proposed in Section IV. In Section V, two examples are discussed to evaluate the effectiveness of the proposed neural network algorithm. Finally, Section VI concludes the paper.
II. PRELIMINARIES
A. Generalized Convexity
The research on convexity and generalized convexity is one of the most important aspects of mathematical programming. In this section, we give some properties of, and relations among, convex, quasiconvex, strictly quasiconvex, and pseudoconvex functions.
Definition 1: A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be quasiconvex if

f(λx1 + (1 − λ)x2) ≤ max{f(x1), f(x2)}, ∀λ ∈ [0, 1],

for every x1 ∈ Ω, x2 ∈ Ω.
A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be strictly quasiconvex if

f(λx1 + (1 − λ)x2) < max{f(x1), f(x2)}, ∀λ ∈ (0, 1),

for every x1 ∈ Ω, x2 ∈ Ω with f(x1) ≠ f(x2).
Definition 2: For a real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, the level set of f at level r is defined as

Hr(f) = {x | x ∈ Ω, f(x) ≤ r}, r ∈ [−∞, +∞].
Lemma 1 [20]: A function f : Ω → R is quasiconvex if and only if its domain Ω and all its sublevel sets Hr(f) = {x | x ∈ Ω, f(x) ≤ r}, ∀r ∈ R, are convex sets.
Remark 1: Convex functions have convex sublevel sets and are therefore quasiconvex. But simple examples, such as the one shown in Fig. 1, show that the converse is not true.
Fig. 1. A quasiconvex function on R.
For each α, the α-sublevel set Hα is convex, i.e., an interval. In Fig. 1, the sublevel set Hα is the interval [a, b], and the β-sublevel set Hβ is the interval (−∞, c].
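To make Remark 1 concrete, the following small Python check (ours, not part of the original paper) verifies numerically that f(x) = −x³ − x, the objective used later in Example 1, satisfies the quasiconvexity inequality of Definition 1 on [0, 1] while violating the convexity inequality; the random sampling scheme is an illustrative assumption.

    import numpy as np

    # Illustrative check (ours): f(x) = -x**3 - x is quasiconvex on [0, 1]
    # but not convex there.
    f = lambda x: -x**3 - x

    rng = np.random.default_rng(0)
    for _ in range(10000):
        x1, x2 = rng.uniform(0.0, 1.0, size=2)
        lam = rng.uniform(0.0, 1.0)
        xm = lam * x1 + (1.0 - lam) * x2
        # Definition 1: f(lam*x1 + (1-lam)*x2) <= max(f(x1), f(x2)).
        assert f(xm) <= max(f(x1), f(x2)) + 1e-12

    # Convexity fails, e.g., at x1 = 0, x2 = 1, lam = 1/2:
    print(f(0.5))                        # -0.625
    print(0.5 * f(0.0) + 0.5 * f(1.0))   # -1.0 < f(0.5), so f is not convex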
Definition 3: A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be pseudoconvex if, whenever

∇f(x1)ᵀ(x1 − x2) ≤ 0, x1, x2 ∈ Ω,

we have f(x1) ≤ f(x2).
A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be strictly pseudoconvex if, whenever

∇f(x1)ᵀ(x1 − x2) ≤ 0, x1, x2 ∈ Ω, x1 ≠ x2,

we have f(x1) < f(x2).
Lemma 2 [20]: Let f : Ω → R be differentiable. If f is pseudoconvex on Ω, then f is both strictly quasiconvex and quasiconvex on Ω.
We consider the following optimization problem:

(GCP)  minimize f(x)
       subject to gi(x) ≤ 0, i = 1, 2, . . . , m.

If the feasible set of (GCP) is a convex set, denoted F, and f(x) is (strictly) quasiconvex or (strictly) pseudoconvex on F, then (GCP) is said to be a generalized convex programming problem.
Theorem 1: Let F be a nonempty convex set and let f(x) be strictly quasiconvex on F. Then any local optimal solution of (GCP) is also a global optimal solution.
Proof: Suppose, for contradiction, that a local optimal solution x∗ is not a global optimal solution of (GCP). Then there exists x̄ ∈ F, x̄ ≠ x∗, such that

f(x̄) < f(x∗).

Since f(x) is strictly quasiconvex, for every λ ∈ (0, 1) we have

f(λx̄ + (1 − λ)x∗) < max{f(x̄), f(x∗)} = f(x∗).  (1)

When λ is small enough, λx̄ + (1 − λ)x∗ ∈ F ∩ N(x∗, ε) for any ε > 0, so (1) contradicts the assumption that x∗ is a local optimal solution of (GCP).
Definition 4: If f(x) is quasiconvex but not convex, we say that f(x) is only quasiconvex.
In order to judge whether a quadratic function is quasiconvex, we quote the following lemma.
Lemma 3 [4]: Consider f(x) = (1/2)xᵀAx + cᵀx, where x = (x1, x2, · · · , xn)ᵀ ∈ Rⁿ, c ∈ Rⁿ, and A ∈ Rⁿˣⁿ is a real symmetric matrix. If A and the bordered matrix ∇f²(x) each have exactly one negative eigenvalue, then f(x) is only quasiconvex, where ∇f²(x) is defined as

∇f²(x) = [ 0, ∇f(x)ᵀ; ∇f(x), 0 ].
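As a hedged illustration (ours), Lemma 3 can be checked mechanically for the objective f(x1, x2) = −x1x2 used in Example 2 of Section V, which equals (1/2)xᵀAx with A = [0, −1; −1, 0]. The sketch below only inspects the eigenvalues of A; the point-dependent bordered matrix ∇f²(x) can be passed to the same routine.

    import numpy as np

    # f(x1, x2) = -x1*x2 = (1/2) x^T A x with the symmetric matrix below.
    A = np.array([[0.0, -1.0],
                  [-1.0, 0.0]])

    eigs = np.linalg.eigvalsh(A)       # eigenvalues of a symmetric matrix
    print(eigs)                        # [-1.  1.]
    # Exactly one negative eigenvalue, consistent with Lemma 3's condition
    # for f to be only quasiconvex (quasiconvex but not convex).
    print(int((eigs < 0).sum()) == 1)  # True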
B. Description of the 0.618 Method [21]
The 0.618 method is used to solve one-dimensional optimization problems and requires no derivatives. The objective function can be convex or quasiconvex. Its main characteristic is the use of the ratios α = 0.618 and 1 − α = 0.382 once the search interval has been obtained. The optimization problem is solved by the following steps:
Step 1: Given the constraint condition a1 ≤ x ≤ b1 and the iteration precision ε, set

λ1 = a1 + (1 − α)(b1 − a1) = αa1 + (1 − α)b1 = 0.618a1 + 0.382b1,
μ1 = a1 + α(b1 − a1) = (1 − α)a1 + αb1 = 0.382a1 + 0.618b1,

and put k = 1.
Step 2: If |bk − ak| < ε, stop: the optimal solution x∗ ∈ [ak, bk], and we take x∗ = (ak + bk)/2. Otherwise, calculate f(λk) and f(μk). If f(λk) > f(μk), turn to Step 3; otherwise, turn to Step 4.
Step 3: Let ak+1 = λk, bk+1 = bk, and let

λk+1 = μk,
μk+1 = ak+1 + α(bk+1 − ak+1) = 0.382ak+1 + 0.618bk+1,

then calculate f(μk+1).
Step 4: Let ak+1 = ak, bk+1 = μk, and let

λk+1 = ak+1 + (1 − α)(bk+1 − ak+1) = 0.618ak+1 + 0.382bk+1,
μk+1 = λk,

then calculate f(λk+1).
Step 5: Let k = k + 1 and turn to Step 2.
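A minimal Python sketch of Steps 1–5 may clarify the bookkeeping; this is our illustration (the function name and signature are not from the paper). It follows the update rules above and, as the steps prescribe, reuses one function evaluation per iteration.

    def golden_section_0618(f, a1, b1, eps=1e-5, alpha=0.618):
        """0.618 (golden-section) search following Steps 1-5 above."""
        a, b = a1, b1
        lam = a + (1.0 - alpha) * (b - a)   # lambda_1 = 0.618*a + 0.382*b
        mu = a + alpha * (b - a)            # mu_1     = 0.382*a + 0.618*b
        f_lam, f_mu = f(lam), f(mu)
        while abs(b - a) >= eps:            # Step 2: stopping test
            if f_lam > f_mu:                # Step 3: move the left end to lambda_k
                a = lam
                lam, f_lam = mu, f_mu       # lambda_{k+1} = mu_k (reuse f value)
                mu = a + alpha * (b - a)
                f_mu = f(mu)
            else:                           # Step 4: move the right end to mu_k
                b = mu
                mu, f_mu = lam, f_lam       # mu_{k+1} = lambda_k (reuse f value)
                lam = a + (1.0 - alpha) * (b - a)
                f_lam = f(lam)
        x_star = (a + b) / 2.0              # Step 2: x* = (a_k + b_k)/2
        return x_star, f(x_star)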
III. NEURAL NETWORK STRUCTURE
According to the 0.618 method, we design the neural network shown in Fig. 2.
Fig. 2. Neural network structure based on the 0.618 method.
We use the following notation: Ni,j denotes neuron j in layer i; neti,j denotes the input value of neuron Ni,j; Oi,j denotes the output value of neuron Ni,j; and ω(i,j),(k,l) denotes the connection weight from Ni,j to Nk,l. The bias threshold vector is set to θ = 0.
The neural network structure in Fig. 2 contains six layers: an input layer, three hidden layers, one feedback layer, and an output layer. The calculation steps for solving the programming problem are arranged as follows.
(I) Input Layer: Let [ak, bk] be the input of N0,1 and N0,2, so that

net0,1 = ak = a1,
net0,2 = bk = b1.

The activation functions are defined by ϕ0,1(x) = x and ϕ0,2(y) = y, so the outputs of N0,1 and N0,2 are

O0,1 = ϕ0,1(net0,1) = a1,
O0,2 = ϕ0,2(net0,2) = b1.
(II) Hidden Layers: In the first hidden layer, N1,1 computes λk and N1,2 computes μk. The corresponding connection weights are

ω(0,1),(1,1) = ω(0,2),(1,2) = α = 0.618,
ω(0,2),(1,1) = ω(0,1),(1,2) = 1 − α = 0.382,

so the outputs of neurons N1,1 and N1,2 are

O1,1 = ω(0,1),(1,1) ∗ ak + ω(0,2),(1,1) ∗ bk = λk,
O1,2 = ω(0,1),(1,2) ∗ ak + ω(0,2),(1,2) ∗ bk = μk.

In the second hidden layer, N2,1 and N2,2 evaluate the objective function at λk and μk, respectively. Setting ω(1,1),(2,1) = ω(1,2),(2,2) = 1,

O2,1 = f(ω(1,1),(2,1) ∗ λk) = f(λk),
O2,2 = f(ω(1,2),(2,2) ∗ μk) = f(μk).
For the third hidden layer, the input is

net3,1 = (ω(2,1),(3,1) ∗ O2,1) − (ω(2,2),(3,1) ∗ O2,2).

Let ω(2,1),(3,1) = ω(2,2),(3,1) = 1 and define the activation function as

ϕ(net3,1) = 1 if net3,1 > 0, and 0 otherwise.

Then the output is

O3,1 = ϕ(f(λk) − f(μk)),

that is, O3,1 = 1 if f(λk) > f(μk), and O3,1 = 0 otherwise.
(III) Output Layer: There are two neurons in the output layer, which are used to calculate ak+1 and bk+1. Setting

ω(3,1),(4,1) = ω(3,1),(4,2) = 1,

the outputs are

O4,1 = O3,1 ∗ O1,1 + (1 − O3,1) ∗ O0,1
     = { ak+1 = λk = O1,1, if O3,1 = 1,
       { ak+1 = ak = O0,1, if O3,1 = 0,

O4,2 = O3,1 ∗ O0,2 + (1 − O3,1) ∗ O1,2
     = { bk+1 = bk = O0,2, if O3,1 = 1,
       { bk+1 = μk = O1,2, if O3,1 = 0.
(IV) Feedback Layer: The neurons N5,1 and N5,2 in the feedback layer are used to calculate λk+1 and μk+1:

O5,1 = 0.618(1 − O3,1) ∗ O4,1 + 0.382(1 − O3,1) ∗ O4,2 + O3,1 ∗ O1,2
     = { λk+1 = O1,2, if O3,1 = 1,
       { λk+1 = 0.618O4,1 + 0.382O4,2, if O3,1 = 0,

O5,2 = 0.382O3,1 ∗ O4,1 + 0.618O3,1 ∗ O4,2 + (1 − O3,1) ∗ O1,1
     = { μk+1 = 0.382O4,1 + 0.618O4,2, if O3,1 = 1,
       { μk+1 = O1,1, if O3,1 = 0.
(V) Iteration: Feed the outputs O4,1 and O4,2 back as the inputs of the input layer. If f(λk) > f(μk), then ak+1 = λk and bk+1 = bk; otherwise, ak+1 = ak and bk+1 = μk. Set k = k + 1 and repeat until |bk − ak| ≤ ε.
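The forward pass of layers (I)–(IV) can be written compactly as follows. This is our Python sketch of the structure just described (the function name is ours), with the layer outputs named Oi,j as in the text and all weights set to the constants given in (I)–(V).

    def network_iteration(a_k, b_k, f):
        """One forward pass through the six-layer structure of Fig. 2 (a sketch)."""
        # (I) Input layer: identity activations.
        O01, O02 = a_k, b_k
        # (II) First hidden layer: weighted sums give lambda_k and mu_k.
        O11 = 0.618 * O01 + 0.382 * O02              # lambda_k
        O12 = 0.382 * O01 + 0.618 * O02              # mu_k
        # Second hidden layer: evaluate the objective (unit weights).
        O21, O22 = f(O11), f(O12)
        # Third hidden layer: threshold comparator.
        O31 = 1.0 if O21 - O22 > 0 else 0.0          # 1 iff f(lambda_k) > f(mu_k)
        # (III) Output layer: new interval endpoints a_{k+1}, b_{k+1}.
        O41 = O31 * O11 + (1.0 - O31) * O01
        O42 = O31 * O02 + (1.0 - O31) * O12
        # (IV) Feedback layer: new trial points lambda_{k+1}, mu_{k+1}.
        O51 = 0.618 * (1 - O31) * O41 + 0.382 * (1 - O31) * O42 + O31 * O12
        O52 = 0.382 * O31 * O41 + 0.618 * O31 * O42 + (1 - O31) * O11
        return O41, O42, O51, O52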
IV. THE NEURAL NETWORK ALGORITHM
For generalized convex programming subject to linear constraints, we present the following algorithm based on the neural network structure in Section III.
Step 1: Obtain the lower and upper bounds of x from the constraint conditions, denoted a1 and b1. Let net0,1 = a1, net0,2 = b1, and give the iteration precision ε > 0. If |b1 − a1| < ε, let the optimal solution be x∗ = (a1 + b1)/2; otherwise, set the initial solution x(1) = (a1 + b1)/2 and turn to Step 2.
Step 2: Calculate O1,1 = λ1, O1,2 = μ1, O2,1 = f(λ1), O2,2 = f(μ1), and let O3,1 = ϕ(f(λ1) − f(μ1)).
Step 3: If O3,1 = 1, let O4,1 = a2 = λ1 and O4,2 = b2 = b1; otherwise, let O4,1 = a2 = a1 and O4,2 = b2 = μ1.
Step 4: When O3,1 = 1, let

O5,1 = λ2 = μ1,
O5,2 = μ2 = a2 + α(b2 − a2).

When O3,1 = 0, let

O5,1 = λ2 = a2 + (1 − α)(b2 − a2),
O5,2 = μ2 = λ1.

Step 5: Set k = k + 1, let net0,1 = ak and net0,2 = bk, and repeat Steps 2–4 until |bk − ak| ≤ ε; then we obtain x∗ = (ak + bk)/2.
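Combining Steps 1–5 with the forward pass sketched in Section III gives the following driver loop (again our sketch, not the authors' code; for clarity it recomputes f(λk) and f(μk) inside each pass rather than reusing the feedback values O5,1 and O5,2).

    def nn_0618_solve(f, a1, b1, eps=1e-4):
        # Step 1: bounds a1, b1 from the constraints; iteration precision eps.
        a, b = a1, b1
        while abs(b - a) >= eps:                 # Step 5: stopping test
            # Steps 2-4: one pass through the network of Fig. 2.
            a, b, lam, mu = network_iteration(a, b, f)
        return (a + b) / 2.0                     # x* = (a_k + b_k)/2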
Theorem 2: When the neural network algorithm based on the 0.618 method is used to solve a generalized convex programming problem, the error of the approximate solution obtained after k iterations is less than α^k |b1 − a1|, where α = 0.618.
Proof: According to the proposed neural network algorithm, we obtain intervals [ak, bk], k = 1, 2, · · ·, with |bk − ak| = α^(k−1)|b1 − a1| → 0 as k → ∞. When |bk − ak| < ε, we take x∗ = (ak + bk)/2 as the approximate optimal solution. Since the exact solution also lies in [ak, bk], the error is at most |bk − ak|/2 = α^(k−1)|b1 − a1|/2, which is less than α^k |b1 − a1| because 2α > 1.
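As a worked illustration of this bound (ours, not in the original): for Example 1 below the initial interval is [0, 1], so |b1 − a1| = 1, and the precision is ε = 10⁻⁵. Requiring α^k < 10⁻⁵ gives k > ln(10⁻⁵)/ln(0.618) ≈ 23.9, i.e., about 24 iterations, which matches the order of the 23 iterations reported in Example 1.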
V. SIMULATION RESULTS
In order to demonstrate the effectiveness and efficiency of the proposed neural network, in this section we discuss simulation results for two examples. The simulations are conducted in MATLAB.
Example 1: Consider the one-dimensional generalized convex programming problem

min f(x) = −x³ − x
s. t. 0 ≤ x ≤ 1.
Fig. 3. A two-dimensional plot of f(x) = −x³ − x.
Since f′(x) = −3x² − 1 < 0, f is decreasing on [0, 1]; by Definition 1 the objective function of this problem is therefore quasiconvex, but it is not convex. The exact solution of this problem is x∗ = 1, with optimal value f∗ = −2. We use the algorithm in Section IV to solve the above problem. With ε = 10⁻⁵, the algorithm converges globally after 23 iterations to the unique optimal solution x∗ = 1, and the optimal value of the objective function is f∗ = −2. Fig. 4 shows the transient behavior of the decision variable.
Fig. 4. Transient behavior of the decision variable.
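As a quick numerical cross-check (ours, using the golden_section_0618 sketch from Section II-B rather than the authors' MATLAB code):

    x_star, f_star = golden_section_0618(lambda x: -x**3 - x, 0.0, 1.0, eps=1e-5)
    print(x_star, f_star)    # approximately 1.0 and -2.0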
Example 2: Consider the following generalized convex program:

min f(x1, x2) = −x1x2
s. t. 2 ≤ 2x1 + x2 ≤ 7,
      4 ≤ x1 + 2x2 ≤ 8.
Fig. 5. A three-dimensional plot of f(x1, x2) = −x1x2.
According to Lemma 3, the objective function of this problem is quasiconvex but not convex. We first used "quadprog" from the MATLAB toolbox to solve this problem, obtaining the optimal solution x∗ = (2.0, 3.0)ᵀ with optimal value f∗ = −6. We then use the algorithm in Section IV to solve the same problem. With x(1) = (1, 2.5)ᵀ and ε = 10⁻⁴, the algorithm converges globally after 19 iterations to the unique optimal solution x∗ = (1.999, 3.000)ᵀ, and the optimal value of the objective function is f∗ = −5.9997. Fig. 6 shows the transient behavior of the decision variables.
Fig. 6. Transient behavior of the decision variables.
VI. CONCLUSION
This paper presents a new neural network algorithm for solving generalized convex programming problems by using the 0.618 method. It has also been shown that the proposed neural network algorithm is able to generate optimal solutions to linear programming problems with bound constraints. Compared with other neural network models, the requirement on the objective function is weaker: it may be a convex or a quasiconvex function. As the problem dimension increases, the advantage of this algorithm becomes clearer. The proposed algorithm is also easy to implement on a computer.
ACKNOWLEDGMENT
This work is supported by the Graduate Scientific Research Creative Foundation of Three Gorges University, China (200949), the Scientific Innovation Team Project of Hubei Provincial Department of Education (T200809), the Natural Science Foundation of Hubei Province, China (2008CDZ046), and the National Natural Science Foundation, China (10726016).
REFERENCES
[1] R. Horst, P. M. Pardalos, and N. V. Thoai, Introduction to Global Optimization, Tsinghua University Publishing House, 2005.
[2] C. L. Song, Z. Q. Xia, and L. W. Zhang, "A note on the upper semi-continuity of Demyanov sum of quasidifferential mappings," OR Transactions, vol. 11, no. 1, pp. 33-38, 2007.
[3] T. S. Du, P. S. Fei, and J. G. Jian, "A new branch and bound algorithm for nonconvex quadratic programming global minimization," Computer Engineering and Applications, vol. 44, no. 17, pp. 49-52, 2008.
[4] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[5] J. Liu and Y. Gao, "A subalgorithm for quasidifferentiable equations," International J. Pure and Applied Mathematics, vol. 23, no. 3, pp. 335-342, 2005.
[6] H. J. Greenberg and W. P. Pierskalla, "Surrogate mathematical programs," OR, vol. 18, pp. 924-939, 1970.
[7] H. J. Greenberg and W. P. Pierskalla, "A review of quasi-convex functions," OR, vol. 19, pp. 1553-1570, 1971.
[8] W. Fenchel, Convex Sets and Functions, Princeton University, Princeton, New Jersey, 1951.
[9] K. J. Arrow and A. C. Enthoven, "Quasi-concave programming," Econometrica, vol. 29, pp. 779-800, 1961.
[10] J. P. Crouzeix and J. A. Ferland, "Criteria for quasi-convexity and pseudoconvexity: relationships and comparisons," Math. Prog., vol. 23, pp. 193-205, 1982.
[11] X. M. Yang, "Some properties of quasiconvex functions," Journal of Engineering Mathematics, vol. 10, no. 1, pp. 51-56, 1993.
[12] X. M. Yang, "A note on criteria of quasiconvex functions," OR Transactions, vol. 5, no. 2, pp. 55-56, 2001.
[13] X. M. Yang and S. Y. Liu, "Three kinds of generalized convexity," Journal of Optimization Theory and Applications, vol. 86, no. 2, pp. 501-513, 1995.
[14] Y. S. Xia, G. Feng, and J. Wang, "A recurrent neural network with exponential convergence for solving convex quadratic program and related linear piecewise equations," Neural Networks, vol. 17, pp. 1003-1015, 2004.
[15] Y. S. Xia and G. Feng, "An improved network for convex quadratic optimization with application to real-time beamforming," Neurocomputing, vol. 64, pp. 359-370, 2005.
[16] A. Cichocki and R. Unbehauen, "Neural networks for optimization with bounded constraints," IEEE Trans. on Neural Networks, vol. 4, pp. 293-304, 1993.
[17] J. J. Hopfield and D. W. Tank, "Neural computation of decisions in optimization problems," Biol. Cybern., vol. 52, pp. 141-152, 1985.
[18] D. X. He, "Neural network for solving linear programs based on bisection method," Computer Engineering and Applications, vol. 20, no. 102, pp. 74-75, 2006.
[19] J. L. Yang and T. S. Du, "A neural algorithm for solving quadratic programming based on 0.618 method," Computer Engineering and Applications, vol. 46, no. 24, pp. 37-39, 2010.
[20] J. L. Yang and T. S. Du, "A neural algorithm for solving quadratic programming based on Fibonacci method," ISNN 2010, Part I, LNCS 6063, pp. 118-125, 2010.
[21] Y. X. Yuan and W. Y. Sun, Optimization Theory and Method, Science Press, 1997.