Chapter 9: Perceptrons and their generalizations
Rosenblatt's perceptron
Proofs of the theorems
Method of stochastic approximation and sigmoid approximation of indicator functions
Method of potential functions and radial basis functions
Three theorems of optimization theory
Neural Networks
Perceptrons (Rosenblatt, 1950s)
Recurrent Procedure
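A minimal sketch of the recurrent (error-correcting) procedure, assuming the classical Rosenblatt rule in which the weight vector is corrected only on misclassified examples; the function and variable names are illustrative, not the book's notation.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Rosenblatt's recurrent procedure (sketch).

    X : (n_samples, n_features) array of training vectors.
    y : labels in {-1, +1}.
    The weight vector is corrected only when an example is misclassified.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        corrected = False
        for x_i, y_i in zip(X, y):
            if y_i * np.dot(w, x_i) <= 0:   # misclassified (or on the boundary)
                w = w + y_i * x_i           # error-correcting update
                corrected = True
        if not corrected:                   # a separating hyperplane has been found
            break
    return w
```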
Proofs of the theorems
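For reference, the bound that the classical convergence proof (Novikoff's theorem) establishes, assuming the training vectors lie in a sphere of radius D and can be separated with margin rho: the number of corrections M made by the recurrent procedure satisfies

```latex
M \le \frac{D^2}{\rho^2}
```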
Method of stochastic approximation and sigmoid approximation of indicator functions
Method of Stochastic Approximation
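A sketch of the general stochastic approximation scheme assumed here: the parameter vector is updated along a stochastic estimate of the gradient of the risk, with step sizes gamma_t satisfying the classical Robbins-Monro type conditions.

```latex
w_{t+1} = w_t - \gamma_t \, \nabla_w Q(z_t, w_t),
\qquad
\gamma_t \ge 0, \quad \sum_{t=1}^{\infty} \gamma_t = \infty, \quad \sum_{t=1}^{\infty} \gamma_t^2 < \infty
```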
Sigmoid Approximation of Indicator Functions
Basic frame of the learning process:
Use the sigmoid approximation at the stage of estimating the coefficients.
Use the indicator functions at the stage of recognition (sketched below).
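A minimal sketch of this two-stage frame, assuming a linear decision function: the coefficients are estimated with the smooth sigmoid approximation (so gradients exist) by a stochastic-approximation procedure, and the hard indicator (threshold) function is used only at the recognition stage. The logistic-type loss, names, and step schedule are illustrative choices.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_sigmoid(X, y, epochs=50, gamma0=0.1):
    """Stage 1: estimate coefficients using the smooth sigmoid approximation."""
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):          # y_i in {0, 1}
            t += 1
            gamma = gamma0 / t              # decreasing step of the stochastic approximation
            grad = (sigmoid(np.dot(w, x_i)) - y_i) * x_i
            w -= gamma * grad
    return w

def recognize(X, w):
    """Stage 2: use the indicator (threshold) function for recognition."""
    return (np.dot(X, w) > 0).astype(int)
```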
Method of potential functions and Radial Basis Functions
Method of potential functions: on-line; only one element of the training data is used at each step.
RBFs (mid-1980s): off-line.
Method of potential functions in asymptotic learning theory
Separable condition: deterministic setting of the PR (pattern recognition) problem.
Non-separable condition: stochastic setting of the PR problem.
Deterministic Setting
Stochastic Setting
RBF Method
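For concreteness, the form of decision rule usually meant by the RBF method, assuming a radial kernel K_gamma that depends only on the distance to the training vectors x_i; the coefficients a_i and threshold b are estimated off-line from the whole training set.

```latex
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{\ell} a_i \, K_{\gamma}\bigl(\lVert x - x_i \rVert\bigr) - b \right)
```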
Three Theorems of optimization theory
Fermat's theorem (1629): the entire space, without constraints.
Lagrange multipliers rule (1788): the conditional optimization problem.
Kuhn-Tucker theorem (1951): convex optimization.
To find the stationary points of a function of n variables, it is necessary to solve a system of n equations in n unknowns.
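Namely, for a differentiable function f of n variables, the stationary points are the solutions of

```latex
\frac{\partial f(x)}{\partial x_i} = 0, \qquad i = 1, \dots, n
```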
Lagrange Multipliers Rule (1788)
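A sketch of the rule in its standard (non-degenerate) form: to minimize f(x) subject to equality constraints g_k(x) = 0, k = 1, ..., m, form the Lagrangian and look for its stationary points.

```latex
L(x, \lambda) = f(x) + \sum_{k=1}^{m} \lambda_k \, g_k(x),
\qquad
\frac{\partial L}{\partial x_i} = 0, \quad \frac{\partial L}{\partial \lambda_k} = 0
```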
Kuhn-Tucker Theorem (1951)
Convex optimization: minimize a certain type of (convex) objective function under certain (convex) constraints of inequality type.
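A sketch of the corresponding conditions, assuming the problem is to minimize f(x) subject to inequality constraints g_k(x) <= 0 and that a suitable regularity condition holds: at the solution x* there exist multipliers alpha_k such that

```latex
\nabla f(x^*) + \sum_{k=1}^{m} \alpha_k \nabla g_k(x^*) = 0,
\qquad
\alpha_k \ge 0, \quad \alpha_k \, g_k(x^*) = 0, \quad g_k(x^*) \le 0, \quad k = 1, \dots, m
```

For convex problems these conditions are both necessary and sufficient.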
Remark
Neural Networks
A learning machine that nonlinearly maps the input vector x into a feature space U and constructs a linear function in this space.
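In symbols, a sketch of this construction (u denotes the nonlinear mapping into U; w and b are the coefficients of the linear function in that space):

```latex
x \;\mapsto\; u(x) \in U,
\qquad
f(x) = \operatorname{sign}\bigl( (w \cdot u(x)) + b \bigr)
```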
Neural Networks
The Back-Propagation method
The BP algorithm
Neural networks for the regression estimation problem
Remarks on the BP method
The Back-Propagation method
The BP algorithm
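A minimal sketch of the BP algorithm for a network with one hidden sigmoid layer and a linear output unit, trained by gradient descent on the squared error; the architecture, loss, and step size are illustrative assumptions, not the book's exact formulation.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def backprop_step(x, y, W1, W2, lr=0.1):
    """One forward/backward pass for a 1-hidden-layer network (sketch).

    x  : input vector, y : scalar target,
    W1 : (n_hidden, n_in) hidden-layer weights, W2 : (n_hidden,) output weights.
    """
    # Forward pass
    h = sigmoid(W1 @ x)                 # hidden activations
    out = W2 @ h                        # linear output unit
    err = out - y                       # derivative of 0.5 * (out - y)^2

    # Backward pass: propagate the error through the layers
    grad_W2 = err * h
    delta_h = err * W2 * h * (1.0 - h)  # sigmoid derivative is h * (1 - h)
    grad_W1 = np.outer(delta_h, x)

    # Gradient descent update
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1
    return W1, W2
```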
For the regression estimation problem
Remark
The empirical risk functional has many local minima.
The convergence of the gradient-based method is rather slow.
The sigmoid function has a scaling factor that affects the quality of the approximation.
Neural networks are not well-controlled learning machines.
In many practical applications, however, they demonstrate good results.