

COMPUTING ALL INVARIANT STATES OF A NEURAL NETWORK. Bart De Moor, Lieven Vandenberghe, Joos Vandewalle. ESAT, El. Eng. Dept., Katholieke Universiteit Leuven, K. Mercierlaan 94, 3030 Heverlee, Belgium.

Conventional methods for the training of neural networks (e.g. the outer-product rule or the spectral algorithm) do not succeed in avoiding the occurrence of undesired ("spurious") invariant states, and this can result in bad convergence. The problem of finding all invariant states of a given neural net is therefore an important design problem. In this presentation it will be shown how this question can be resolved both for continuous neural networks of the type dv(t)/dt = -Av(t) + B F(Tv(t) + s) and for discrete-time models Av(k + 1) = B F(Tv(k) + s), where v denotes the state vector, s the input, and F is an arbitrary piecewise-linear function. This general form includes the popular McCulloch-Pitts model [1]. The invariant states of the models given above are the solutions to the set of piecewise-linear equations Av = B F(Tv + s).
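To make the piecewise-linear fixed-point equation concrete, the following sketch (an illustration only, not the algorithm of [2]) assumes a discrete-time model with A = I and takes the piecewise-linear function, written F here, to be the componentwise saturation of its argument to [-1, 1] — one common choice. F then has 3^n linear regions (each unit saturated low, linear, or saturated high), and in each region the fixed-point condition v = F(Tv + s) is an ordinary linear system, so all invariant states of a small net can be found by brute-force enumeration:

```python
import itertools

def solve_linear(A, b, eps=1e-12):
    """Solve A x = b by Gauss-Jordan elimination; returns None if A is singular."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < eps:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def invariant_states(T, s, tol=1e-9):
    """All v with v = F(Tv + s), where F clamps each component to [-1, 1].

    Enumerate the 3^n linear regions of F: each unit is saturated low (-1),
    linear (0), or saturated high (+1).  In each region the fixed-point
    condition is linear; solve it and keep solutions that actually lie in
    the assumed region.
    """
    n = len(s)
    found = set()
    for region in itertools.product((-1, 0, 1), repeat=n):
        # Linear unit i:     v_i - sum_j T[i][j] v_j = s_i.
        # Saturated unit i:  v_i = +/-1.
        A = [[(1.0 if i == j else 0.0) - (T[i][j] if region[i] == 0 else 0.0)
              for j in range(n)] for i in range(n)]
        b = [s[i] if region[i] == 0 else float(region[i]) for i in range(n)]
        v = solve_linear(A, b)
        if v is None:
            continue
        u = [sum(T[i][j] * v[j] for j in range(n)) + s[i] for i in range(n)]
        consistent = all(
            (-1 - tol <= u[i] <= 1 + tol) if region[i] == 0
            else (u[i] >= 1 - tol if region[i] == 1 else u[i] <= -1 + tol)
            for i in range(n))
        if consistent:
            found.add(tuple(round(x, 6) + 0.0 for x in v))
    return found

# A symmetric 2-unit net with zero input: the origin plus two saturated states.
print(sorted(invariant_states([[0.0, 2.0], [2.0, 0.0]], [0.0, 0.0])))
# -> [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
```

The enumeration grows as 3^n and is only workable for small nets; the point of the GLCP reformulation below is precisely to handle this combinatorial structure systematically.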

Next, the equivalence between this set of nonlinear equations and the Generalized Linear Complementarity Problem (GLCP) will be established. The formulation of this GLCP reads: given real matrices M, N ∈ R^(p×q) and a vector z ∈ R^p, find all q × 1 vectors v, w and scalars α such that Mv + Nw = zα and v ≥ 0, w ≥ 0, α ≥ 0, v^T w = 0. This problem originates from mathematical programming and has also been applied to the analysis of piecewise-linear resistive circuits. Moreover, an algorithm for finding all solutions to this problem has recently been proposed [2]. The related problem of finding all invariant states that share a prespecified amount of partial information can be solved with the same techniques.
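A hedged sketch of the idea behind such enumeration (not the algorithm of [2]): normalize α = 1, i.e. consider only solutions with α > 0. The condition v^T w = 0 with v, w ≥ 0 forces, for each index i, either v_i = 0 or w_i = 0; picking column i of M when v_i may be nonzero and column i of N otherwise gives a square linear system per complementarity pattern, and every nonnegative solution of such a system is a GLCP solution. The 2 × 2 instance below is a hypothetical example encoding the small LCP w = M0 v + q0 (with M0 = [[-1, 0], [0, 2]], q0 = (1, -2)) via M = -M0, N = I, z = q0; it has two solutions, the kind of multiplicity that corresponds to spurious states:

```python
from itertools import product

def solve2(B, z, eps=1e-12):
    """Solve the 2x2 system B x = z by Cramer's rule; None if B is singular."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if abs(det) < eps:
        return None
    return [(z[0] * B[1][1] - B[0][1] * z[1]) / det,
            (B[0][0] * z[1] - z[0] * B[1][0]) / det]

def glcp_all(M, N, z, tol=1e-9):
    """All (v, w) with M v + N w = z, v >= 0, w >= 0, v^T w = 0 (q = 2, alpha = 1).

    For each complementarity pattern, build a basis matrix from column i of M
    (where v_i may be nonzero) or of N (where w_i may be nonzero), solve, and
    keep the nonnegative solutions.
    """
    q = len(z)
    sols = set()
    for pattern in product((False, True), repeat=q):   # True: v_i free, w_i = 0
        B = [[M[r][i] if pattern[i] else N[r][i] for i in range(q)]
             for r in range(q)]
        x = solve2(B, z)
        if x is None or any(xi < -tol for xi in x):
            continue
        v = tuple(round(x[i], 6) if pattern[i] else 0.0 for i in range(q))
        w = tuple(0.0 if pattern[i] else round(x[i], 6) for i in range(q))
        sols.add((v, w))
    return sols

# Encode the LCP  w = M0 v + q0  with M0 = [[-1, 0], [0, 2]], q0 = (1, -2):
M = [[1.0, 0.0], [0.0, -2.0]]   # -M0
N = [[1.0, 0.0], [0.0, 1.0]]    # identity
z = [1.0, -2.0]
for v, w in sorted(glcp_all(M, N, z)):
    print("v =", v, " w =", w)
```

Checking all 2^q patterns this way is exponential in q; the algorithm of [2] is the systematic treatment, and the degenerate case α = 0 (dropped by the normalization above) would need separate handling.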

These results have a theoretical and conceptual importance as well. The GLCP has been studied in connection with optimization problems, and can therefore be expected to yield insight into the variational formulation of the dynamical behaviour of a neural net. Another interesting issue from the GLCP literature is the extensive body of knowledge characterizing the number of solutions in terms of the classes of matrices appearing in the GLCP equation. These results might be useful in studying the information capacity of a neural net and in resolving the problem of designing a net without spurious states.

References

[1] Grossberg, S. (1988). Nonlinear neural networks: principles, mechanisms, and architectures. Neural Networks, 1, 17-61.

[2] De Moor, B., Vandenberghe, L., and Vandewalle, J. The generalized linear complementarity problem and an algorithm to find all its solutions. Submitted to Mathematical Programming.
