A Neural Network Algorithm for Solving Generalized Convex Programming
Jingli Yang and Tingsong Du
Jingli Yang and Tingsong Du are with the Institute of Nonlinear and Complex Systems, China Three Gorges University, Yichang, Hubei, 443002, China (jennyyang [email protected] (J. L. Yang)).
Abstract— In this paper, a novel neural network algorithm is proposed based on the 0.618 method. This neural network can solve generalized convex programming subject to linear constraints. Finally, examples are provided to show the applicability of the proposed neural network algorithm.
I. INTRODUCTION
OPTIMIZATION is an important part of operations research and has many applications in natural science, social science, production practice, engineering design, and modern management [1]. As an important branch of operations research, mathematical programming has had a particularly profound influence. For nearly three decades, convexity theory has been widely applied in optimization [2], [3]. In many cases, however, convexity of the objective function is only a sufficient condition, not a necessary one, for the main results of mathematical programming to hold. Moreover, many important properties discussed in convex analysis require only that the level sets of the function be convex, that is, that the function be generalized convex [4], [5]. Generalized convexity has therefore become a new direction of development in mathematical programming. In 1970, B. De Finetti [6], [7] first studied functions whose level sets are convex sets; he found that this class of functions includes all convex functions and some nonconvex functions. W. Fenchel [8] was the first scholar to call such functions quasiconvex, and he studied their properties systematically. M. Slater extended the Kuhn-Tucker saddle-point theory and applied it to generalized convex programming [6], [7], and K. J. Arrow and A. C. Enthoven [9] studied quasiconcave functions and applied them in economics. Since then, more and more scholars have worked on generalized convexity and obtained a series of important results, among them J. A. Ferland, J. P. Crouzeix [10], and X. M. Yang [11]-[13].
It is well known that a neural network is a parallel, distributed information-processing structure consisting of processing elements. Each processing element has a single output connection that branches into as many collateral connections as desired, which gives neural networks strong learning ability and the capacity for large-scale parallel computing. Neural networks provide a new approach to optimization problems and have received considerable attention [14], [15]. Compared with traditional numerical methods, the neural network approach can solve optimization problems with much shorter running time [16]. In 1985, Hopfield and Tank first proposed a neural network for solving the Traveling Salesman Problem (TSP) [17]. Since then, many neural network models have been proposed and applied to linear and nonlinear programming, although in most of them the objective function is required to be convex [18]-[20]. In this paper, by employing the 0.618 method and the characteristics of a common network topology, we present a neural network algorithm for solving generalized convex programming with linear constraints.
The organization of this paper is as follows. In Section II, we introduce the theoretical foundations of convex analysis and the 0.618 method. In Section III, we present the neural network based on the 0.618 method. The neural network algorithm is proposed in Section IV. In Section V, two examples are discussed to evaluate the effectiveness of the proposed neural network algorithm. Finally, Section VI concludes the paper.
II. PRELIMINARIES
A. Generalized Convexity
The research on convexity and generalized convexity is one of the most important aspects of mathematical programming. In this section, we give some properties of, and relations among, convex, quasiconvex, strictly quasiconvex, and pseudoconvex functions.
Definition 1: A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be quasiconvex if

f(λx1 + (1 − λ)x2) ≤ max{f(x1), f(x2)}, ∀λ ∈ [0, 1],

for every x1 ∈ Ω, x2 ∈ Ω.
A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be strictly quasiconvex if

f(λx1 + (1 − λ)x2) < max{f(x1), f(x2)}, ∀λ ∈ (0, 1),

for every x1 ∈ Ω, x2 ∈ Ω with f(x1) ≠ f(x2).
Definition 2: For a real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, the level set of f at level r is defined as

Hr(f) = {x | x ∈ Ω, f(x) ≤ r}, r ∈ [−∞, +∞].
Lemma 1 [20]: A function f : Ω → R is quasiconvex if and only if its domain Ω and all its sublevel sets Hr(f) = {x | x ∈ Ω, f(x) ≤ r}, ∀r ∈ R, are convex sets.
Remark 1: Convex functions have convex sublevel sets and are therefore quasiconvex. But simple examples, such as the one shown in Fig. 1, show that the converse is not true.
Fig. 1. A quasiconvex function on R.
For each α, the α-sublevel set Hα is convex, i.e., an interval. In Fig. 1, the sublevel set Hα is the interval [a, b], and the β-sublevel set Hβ is the interval (−∞, c].
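To make Remark 1 concrete, the following small Python check (ours, not part of the original paper) verifies numerically that f(x) = −x³ − x, the objective used later in Example 1, satisfies the quasiconvexity inequality of Definition 1 on [0, 1] while violating the convexity inequality; the random sampling scheme is an illustrative assumption.

    import numpy as np

    # Illustrative check (ours): f(x) = -x**3 - x is quasiconvex on [0, 1]
    # but not convex there.
    f = lambda x: -x**3 - x

    rng = np.random.default_rng(0)
    for _ in range(10000):
        x1, x2 = rng.uniform(0.0, 1.0, size=2)
        lam = rng.uniform(0.0, 1.0)
        xm = lam * x1 + (1.0 - lam) * x2
        # Definition 1: f(lam*x1 + (1-lam)*x2) <= max(f(x1), f(x2)).
        assert f(xm) <= max(f(x1), f(x2)) + 1e-12

    # Convexity fails, e.g., at x1 = 0, x2 = 1, lam = 1/2:
    print(f(0.5))                        # -0.625
    print(0.5 * f(0.0) + 0.5 * f(1.0))   # -1.0 < f(0.5), so f is not convex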
Definition 3: A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be pseudoconvex if, whenever

∇f(x1)ᵀ(x1 − x2) ≤ 0, x1, x2 ∈ Ω,

we have f(x1) ≤ f(x2).
A real-valued function f(x), defined on a nonempty convex set Ω of Rⁿ, is said to be strictly pseudoconvex if, whenever

∇f(x1)ᵀ(x1 − x2) ≤ 0, x1, x2 ∈ Ω, x1 ≠ x2,

we have f(x1) < f(x2).
Lemma 2 [20]: Let f : Ω → R be differentiable. If f is pseudoconvex on Ω, then f is both strictly quasiconvex and quasiconvex on Ω.
We consider the following optimization problem:

(GCP)  minimize f(x)
       subject to gi(x) ≤ 0, i = 1, 2, . . . , m.

If the feasible set of (GCP) is a convex set, denoted F, and f(x) is (strictly) quasiconvex or (strictly) pseudoconvex on F, then (GCP) is said to be a generalized convex programming problem.
Theorem 1: Let F be a nonempty convex set and let f(x) be strictly quasiconvex on F. Then any local optimal solution of (GCP) is also a global optimal solution.
Proof: Suppose, for contradiction, that a local optimal solution x∗ is not a global optimal solution of (GCP). Then there exists x̄ ∈ F, x̄ ≠ x∗, such that

f(x̄) < f(x∗).

Since f(x) is strictly quasiconvex, for every λ ∈ (0, 1) we have

f(λx̄ + (1 − λ)x∗) < max{f(x̄), f(x∗)} = f(x∗).  (1)

When λ is small enough, λx̄ + (1 − λ)x∗ ∈ F ∩ N(x∗, ε) for any ε > 0, so (1) contradicts the assumption that x∗ is a local optimal solution of (GCP).
Definition 4: If f(x) is quasiconvex but not convex, we say that f(x) is only quasiconvex.
In order to judge whether a quadratic function is quasiconvex, we quote the following lemma.
Lemma 3 [4]: Consider f(x) = (1/2)xᵀAx + cᵀx, where x = (x1, x2, · · · , xn)ᵀ ∈ Rⁿ, c ∈ Rⁿ, and A ∈ Rⁿˣⁿ is a real symmetric matrix. If A and the bordered matrix ∇f²(x) each have exactly one negative eigenvalue, then f(x) is only quasiconvex, where ∇f²(x) is defined as

∇f²(x) = [ 0, ∇f(x)ᵀ; ∇f(x), 0 ].
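As a hedged illustration (ours), Lemma 3 can be checked mechanically for the objective f(x1, x2) = −x1x2 used in Example 2 of Section V, which equals (1/2)xᵀAx with A = [0, −1; −1, 0]. The sketch below only inspects the eigenvalues of A; the point-dependent bordered matrix ∇f²(x) can be passed to the same routine.

    import numpy as np

    # f(x1, x2) = -x1*x2 = (1/2) x^T A x with the symmetric matrix below.
    A = np.array([[0.0, -1.0],
                  [-1.0, 0.0]])

    eigs = np.linalg.eigvalsh(A)       # eigenvalues of a symmetric matrix
    print(eigs)                        # [-1.  1.]
    # Exactly one negative eigenvalue, consistent with Lemma 3's condition
    # for f to be only quasiconvex (quasiconvex but not convex).
    print(int((eigs < 0).sum()) == 1)  # True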
B. Description of the 0.618 Method [21]
The 0.618 method is used to solve one-dimensional optimization problems and requires no derivatives. The objective function can be convex or quasiconvex. Its main characteristic is the use of the ratios α = 0.618 and 1 − α = 0.382 once the search interval has been obtained. The optimization problem is solved by the following steps:
Step 1: Given the constraint condition a1 ≤ x ≤ b1 and the iteration precision ε, set

λ1 = a1 + (1 − α)(b1 − a1) = αa1 + (1 − α)b1 = 0.618a1 + 0.382b1,
μ1 = a1 + α(b1 − a1) = (1 − α)a1 + αb1 = 0.382a1 + 0.618b1,

and put k = 1.
Step 2: If |bk − ak| < ε, stop: the optimal solution x∗ ∈ [ak, bk], and we take x∗ = (ak + bk)/2. Otherwise, calculate f(λk) and f(μk). If f(λk) > f(μk), turn to Step 3; otherwise, turn to Step 4.
Step 3: Let ak+1 = λk, bk+1 = bk, and let

λk+1 = μk,
μk+1 = ak+1 + α(bk+1 − ak+1) = 0.382ak+1 + 0.618bk+1,

then calculate f(μk+1).
Step 4: Let ak+1 = ak, bk+1 = μk, and let

λk+1 = ak+1 + (1 − α)(bk+1 − ak+1) = 0.618ak+1 + 0.382bk+1,
μk+1 = λk,

then calculate f(λk+1).
Step 5: Let k = k + 1 and turn to Step 2.
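A minimal Python sketch of Steps 1–5 may clarify the bookkeeping; this is our illustration (the function name and signature are not from the paper). It follows the update rules above and, as the steps prescribe, reuses one function evaluation per iteration.

    def golden_section_0618(f, a1, b1, eps=1e-5, alpha=0.618):
        """0.618 (golden-section) search following Steps 1-5 above."""
        a, b = a1, b1
        lam = a + (1.0 - alpha) * (b - a)   # lambda_1 = 0.618*a + 0.382*b
        mu = a + alpha * (b - a)            # mu_1     = 0.382*a + 0.618*b
        f_lam, f_mu = f(lam), f(mu)
        while abs(b - a) >= eps:            # Step 2: stopping test
            if f_lam > f_mu:                # Step 3: move the left end to lambda_k
                a = lam
                lam, f_lam = mu, f_mu       # lambda_{k+1} = mu_k (reuse f value)
                mu = a + alpha * (b - a)
                f_mu = f(mu)
            else:                           # Step 4: move the right end to mu_k
                b = mu
                mu, f_mu = lam, f_lam       # mu_{k+1} = lambda_k (reuse f value)
                lam = a + (1.0 - alpha) * (b - a)
                f_lam = f(lam)
        x_star = (a + b) / 2.0              # Step 2: x* = (a_k + b_k)/2
        return x_star, f(x_star)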
III. NEURAL NETWORK STRUCTURE
According to the 0.618 method, we design the neural network shown in Fig. 2.
Fig. 2. Neural network structure based on the 0.618 method.
We use the following notation: Ni,j denotes neuron j in layer i; neti,j denotes the input value of neuron Ni,j; Oi,j denotes the output value of neuron Ni,j; and ω(i,j),(k,l) denotes the connection weight from Ni,j to Nk,l. The bias threshold vector is set to θ = 0.
The neural network structure in Fig. 2 contains six layers: an input layer, three hidden layers, one feedback layer, and an output layer. The calculation steps for solving the programming problem are arranged as follows.
(I) Input Layer: Let [ak, bk] be the input of N0,1 and N0,2, so that

net0,1 = ak = a1,
net0,2 = bk = b1.

The activation functions are defined by ϕ0,1(x) = x and ϕ0,2(y) = y, so the outputs of N0,1 and N0,2 are

O0,1 = ϕ0,1(net0,1) = a1,
O0,2 = ϕ0,2(net0,2) = b1.
(II) Hidden Layers: In the first hidden layer, N1,1 computes λk and N1,2 computes μk. The corresponding connection weights are

ω(0,1),(1,1) = ω(0,2),(1,2) = α = 0.618,
ω(0,2),(1,1) = ω(0,1),(1,2) = 1 − α = 0.382,

so the outputs of neurons N1,1 and N1,2 are

O1,1 = ω(0,1),(1,1) ∗ ak + ω(0,2),(1,1) ∗ bk = λk,
O1,2 = ω(0,1),(1,2) ∗ ak + ω(0,2),(1,2) ∗ bk = μk.

In the second hidden layer, N2,1 and N2,2 evaluate the objective function at λk and μk, respectively. Setting ω(1,1),(2,1) = ω(1,2),(2,2) = 1,

O2,1 = f(ω(1,1),(2,1) ∗ λk) = f(λk),
O2,2 = f(ω(1,2),(2,2) ∗ μk) = f(μk).
For the third hidden layer, the input is

net3,1 = (ω(2,1),(3,1) ∗ O2,1) − (ω(2,2),(3,1) ∗ O2,2).

Let ω(2,1),(3,1) = ω(2,2),(3,1) = 1 and define the activation function as

ϕ(net3,1) = 1 if net3,1 > 0, and 0 otherwise.

Then the output is

O3,1 = ϕ(f(λk) − f(μk)),

that is, O3,1 = 1 if f(λk) > f(μk), and O3,1 = 0 otherwise.
(III) Output Layer: There are two neurons in the output layer, which are used to calculate ak+1 and bk+1. Setting

ω(3,1),(4,1) = ω(3,1),(4,2) = 1,

the outputs are

O4,1 = O3,1 ∗ O1,1 + (1 − O3,1) ∗ O0,1
     = { ak+1 = λk = O1,1, if O3,1 = 1,
       { ak+1 = ak = O0,1, if O3,1 = 0,

O4,2 = O3,1 ∗ O0,2 + (1 − O3,1) ∗ O1,2
     = { bk+1 = bk = O0,2, if O3,1 = 1,
       { bk+1 = μk = O1,2, if O3,1 = 0.
(IV) Feedback Layer: The neurons N5,1 and N5,2 in the feedback layer are used to calculate λk+1 and μk+1:

O5,1 = 0.618(1 − O3,1) ∗ O4,1 + 0.382(1 − O3,1) ∗ O4,2 + O3,1 ∗ O1,2
     = { λk+1 = O1,2, if O3,1 = 1,
       { λk+1 = 0.618O4,1 + 0.382O4,2, if O3,1 = 0,

O5,2 = 0.382O3,1 ∗ O4,1 + 0.618O3,1 ∗ O4,2 + (1 − O3,1) ∗ O1,1
     = { μk+1 = 0.382O4,1 + 0.618O4,2, if O3,1 = 1,
       { μk+1 = O1,1, if O3,1 = 0.
(V) Iteration: Feed the outputs O4,1 and O4,2 back as the inputs of the input layer. If f(λk) > f(μk), then ak+1 = λk and bk+1 = bk; otherwise, ak+1 = ak and bk+1 = μk. Set k = k + 1 and repeat until |bk − ak| ≤ ε.
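The forward pass of layers (I)–(IV) can be written compactly as follows. This is our Python sketch of the structure just described (the function name is ours), with the layer outputs named Oi,j as in the text and all weights set to the constants given in (I)–(V).

    def network_iteration(a_k, b_k, f):
        """One forward pass through the six-layer structure of Fig. 2 (a sketch)."""
        # (I) Input layer: identity activations.
        O01, O02 = a_k, b_k
        # (II) First hidden layer: weighted sums give lambda_k and mu_k.
        O11 = 0.618 * O01 + 0.382 * O02              # lambda_k
        O12 = 0.382 * O01 + 0.618 * O02              # mu_k
        # Second hidden layer: evaluate the objective (unit weights).
        O21, O22 = f(O11), f(O12)
        # Third hidden layer: threshold comparator.
        O31 = 1.0 if O21 - O22 > 0 else 0.0          # 1 iff f(lambda_k) > f(mu_k)
        # (III) Output layer: new interval endpoints a_{k+1}, b_{k+1}.
        O41 = O31 * O11 + (1.0 - O31) * O01
        O42 = O31 * O02 + (1.0 - O31) * O12
        # (IV) Feedback layer: new trial points lambda_{k+1}, mu_{k+1}.
        O51 = 0.618 * (1 - O31) * O41 + 0.382 * (1 - O31) * O42 + O31 * O12
        O52 = 0.382 * O31 * O41 + 0.618 * O31 * O42 + (1 - O31) * O11
        return O41, O42, O51, O52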
IV. THE NEURAL NETWORK ALGORITHM
For generalized convex programming subject to linear constraints, we present the following algorithm based on the neural network structure in Section III.
Step 1: Obtain the lower and upper bounds of x from the constraint conditions, denoted a1 and b1. Let net0,1 = a1, net0,2 = b1, and give the iteration precision ε > 0. If |b1 − a1| < ε, let the optimal solution be x∗ = (a1 + b1)/2; otherwise, set the initial solution x(1) = (a1 + b1)/2 and turn to Step 2.
Step 2: Calculate O1,1 = λ1, O1,2 = μ1, O2,1 = f(λ1), O2,2 = f(μ1), and let O3,1 = ϕ(f(λ1) − f(μ1)).
Step 3: If O3,1 = 1, let O4,1 = a2 = λ1 and O4,2 = b2 = b1; otherwise, let O4,1 = a2 = a1 and O4,2 = b2 = μ1.
Step 4: When O3,1 = 1, let

O5,1 = λ2 = μ1,
O5,2 = μ2 = a2 + α(b2 − a2).

When O3,1 = 0, let

O5,1 = λ2 = a2 + (1 − α)(b2 − a2),
O5,2 = μ2 = λ1.

Step 5: Set k = k + 1, let net0,1 = ak and net0,2 = bk, and repeat Steps 2–4 until |bk − ak| ≤ ε; then we obtain x∗ = (ak + bk)/2.
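Combining Steps 1–5 with the forward pass sketched in Section III gives the following driver loop (again our sketch, not the authors' code; for clarity it recomputes f(λk) and f(μk) inside each pass rather than reusing the feedback values O5,1 and O5,2).

    def nn_0618_solve(f, a1, b1, eps=1e-4):
        # Step 1: bounds a1, b1 from the constraints; iteration precision eps.
        a, b = a1, b1
        while abs(b - a) >= eps:                 # Step 5: stopping test
            # Steps 2-4: one pass through the network of Fig. 2.
            a, b, lam, mu = network_iteration(a, b, f)
        return (a + b) / 2.0                     # x* = (a_k + b_k)/2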
Theorem 2: When the neural network algorithm based on the 0.618 method is used to solve a generalized convex programming problem, the error of the approximate solution obtained after k iterations is less than α^k |b1 − a1|, where α = 0.618.
Proof: According to the proposed neural network algorithm, we obtain intervals [ak, bk], k = 1, 2, · · ·, with |bk − ak| = α^(k−1)|b1 − a1| → 0 as k → ∞. When |bk − ak| < ε, we take x∗ = (ak + bk)/2 as the approximate optimal solution. Since the exact solution also lies in [ak, bk], the error is at most |bk − ak|/2 = α^(k−1)|b1 − a1|/2, which is less than α^k |b1 − a1| because 2α > 1.
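As a worked illustration of this bound (ours, not in the original): for Example 1 below the initial interval is [0, 1], so |b1 − a1| = 1, and the precision is ε = 10⁻⁵. Requiring α^k < 10⁻⁵ gives k > ln(10⁻⁵)/ln(0.618) ≈ 23.9, i.e., about 24 iterations, which matches the order of the 23 iterations reported in Example 1.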
V. SIMULATION RESULTS
In order to demonstrate the effectiveness and efficiency of the proposed neural network, in this section we discuss simulation results for two examples. The simulations are conducted in MATLAB.
Example 1: Consider the one-dimensional generalized convex programming problem

min f(x) = −x³ − x
s. t. 0 ≤ x ≤ 1.
Fig. 3. A two-dimensional plot of f(x) = −x³ − x.
Since f′(x) = −3x² − 1 < 0, f is decreasing on [0, 1]; by Definition 1 the objective function of this problem is therefore quasiconvex, but it is not convex. The exact solution of this problem is x∗ = 1, with optimal value f∗ = −2. We use the algorithm in Section IV to solve the above problem. With ε = 10⁻⁵, the algorithm converges globally after 23 iterations to the unique optimal solution x∗ = 1, and the optimal value of the objective function is f∗ = −2. Fig. 4 shows the transient behavior of the decision variable.
Fig. 4. Transient behavior of the decision variable.
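As a quick numerical cross-check (ours, using the golden_section_0618 sketch from Section II-B rather than the authors' MATLAB code):

    x_star, f_star = golden_section_0618(lambda x: -x**3 - x, 0.0, 1.0, eps=1e-5)
    print(x_star, f_star)    # approximately 1.0 and -2.0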
Example 2: Consider the following generalized convex program:

min f(x1, x2) = −x1x2
s. t. 2 ≤ 2x1 + x2 ≤ 7,
      4 ≤ x1 + 2x2 ≤ 8.
Fig. 5. A three-dimensional plot of f(x1, x2) = −x1x2.
According to Lemma 3, the objective function of this problem is quasiconvex but not convex. We first used "quadprog" from the MATLAB toolbox to solve this problem, obtaining the optimal solution x∗ = (2.0, 3.0)ᵀ with optimal value f∗ = −6. We then use the algorithm in Section IV to solve the same problem. With x(1) = (1, 2.5)ᵀ and ε = 10⁻⁴, the algorithm converges globally after 19 iterations to the unique optimal solution x∗ = (1.999, 3.000)ᵀ, and the optimal value of the objective function is f∗ = −5.9997. Fig. 6 shows the transient behavior of the decision variables.
Fig. 6. Transient behavior of the decision variables.
VI. CONCLUSION
This paper presents a new neural network algorithm for solving generalized convex programming problems by using the 0.618 method. It has also been shown that the proposed neural network algorithm is able to generate optimal solutions to linear programming problems with bound constraints. Compared with other neural network models, the requirement on the objective function is weaker: it may be a convex or a quasiconvex function. As the problem dimension increases, the advantage of this algorithm becomes clearer. The proposed algorithm is also easy to implement on a computer.
ACKNOWLEDGMENT
This work is supported by the Graduate Scientific Research Creative Foundation of Three Gorges University, China (200949), the Scientific Innovation Team Project of Hubei Provincial Department of Education (T200809), the Natural Science Foundation of Hubei Province, China (2008CDZ046), and the National Natural Science Foundation, China (10726016).
REFERENCES
[1] R. Horst, P. M. Pardalos, and N. V. Thoai, Introduction to Global Optimization, Tsinghua University Publishing House, 2005.
[2] C. L. Song, Z. Q. Xia, and L. W. Zhang, "A note on the upper semi-continuity of Demyanov sum of quasidifferential mappings," OR Transactions, vol. 11, no. 1, pp. 33-38, 2007.
[3] T. S. Du, P. S. Fei, and J. G. Jian, "A new branch and bound algorithm for nonconvex quadratic programming global minimization," Computer Engineering and Applications, vol. 44, no. 17, pp. 49-52, 2008.
[4] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[5] J. Liu and Y. Gao, "A subalgorithm for quasidifferentiable equations," International J. Pure and Applied Mathematics, vol. 23, no. 3, pp. 335-342, 2005.
[6] H. J. Greenberg and W. P. Pierskalla, "Surrogate mathematical programs," OR, vol. 18, pp. 924-939, 1970.
[7] H. J. Greenberg and W. P. Pierskalla, "A review of quasi-convex functions," OR, vol. 19, pp. 1553-1570, 1971.
[8] W. Fenchel, Convex Sets and Functions, Princeton University, Princeton, New Jersey, 1951.
[9] K. J. Arrow and A. C. Enthoven, "Quasi-concave programming," Econometrica, vol. 29, pp. 779-800, 1961.
[10] J. P. Crouzeix and J. A. Ferland, "Criteria for quasi-convexity and pseudoconvexity: relationships and comparisons," Math. Prog., vol. 23, pp. 193-205, 1982.
[11] X. M. Yang, "Some properties of quasiconvex functions," Journal of Engineering Mathematics, vol. 10, no. 1, pp. 51-56, 1993.
[12] X. M. Yang, "A note on criteria of quasiconvex functions," OR Transactions, vol. 5, no. 2, pp. 55-56, 2001.
[13] X. M. Yang and S. Y. Liu, "Three kinds of generalized convexity," Journal of Optimization Theory and Applications, vol. 86, no. 2, pp. 501-513, 1995.
[14] Y. S. Xia, G. Feng, and J. Wang, "A recurrent neural network with exponential convergence for solving convex quadratic program and related linear piecewise equations," Neural Networks, vol. 17, pp. 1003-1015, 2004.
[15] Y. S. Xia and G. Feng, "An improved network for convex quadratic optimization with application to real-time beamforming," Neurocomputing, vol. 64, pp. 359-370, 2005.
[16] A. Cichocki and R. Unbehauen, "Neural networks for optimization with bounded constraints," IEEE Trans. on Neural Networks, vol. 4, pp. 293-304, 1993.
[17] J. J. Hopfield and D. W. Tank, "Neural computation of decisions in optimization problems," Biol. Cybern., vol. 52, pp. 141-152, 1985.
[18] D. X. He, "Neural network for solving linear programs based on bisection method," Computer Engineering and Applications, vol. 20, no. 102, pp. 74-75, 2006.
[19] J. L. Yang and T. S. Du, "A neural algorithm for solving quadratic programming based on 0.618 method," Computer Engineering and Applications, vol. 46, no. 24, pp. 37-39, 2010.
[20] J. L. Yang and T. S. Du, "A neural algorithm for solving quadratic programming based on Fibonacci method," ISNN 2010, Part I, LNCS 6063, pp. 118-125, 2010.
[21] Y. X. Yuan and W. Y. Sun, Optimization Theory and Method, Science Press, 1997.