NN07


Page 1: NN07

2007, Assoc. Prof. Йордан Колев

NEURAL NETWORKS

Page 2: NN07


• Artificial neural networks (ANNs, or NNs) are composed of elements that operate in parallel and resemble biological nerve cells. A NN has inputs and outputs.

• The function of an artificial neural network is determined mainly by the connections between its constituent elements. The set of elements and the connections between them defines the structure of the NN.

• Every connection is assigned a weight coefficient for the signals passed between elements. The set of weight coefficients is also called the parameters of the NN.

Page 3: NN07


Hidden layers (1 and 2); output layer (3)

Page 4: NN07


• The structure of a NN is chosen according to the specific application and the desired function.

• To function as intended, a NN must be trained (with some exceptions). Training is performed by selecting (tuning) the weight coefficients of the NN.

• The goal of training is to reach a mode of operation in which a given set of input signals (an input vector) causes the NN to produce a specific output signal (an output vector).

Page 5: NN07


• Training is performed by comparing the current output state (for a given input) with the desired one (the target) and changing the weight coefficients until the output matches the target.

• The training process uses many such input/output pairs.
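A minimal sketch of this compare-and-adjust loop for a single linear neuron (the data, learning rate, and variable names are illustrative, not from the slides):

```python
import numpy as np

# Hypothetical training pairs: each input vector (column of P) maps to a target.
P = np.array([[0.0, 0.5, 1.0, 1.5],
              [1.0, 0.5, 0.0, 0.5]])   # inputs, one column per example
T = np.array([0.0, 0.5, 1.0, 1.5])     # desired outputs (targets)

w = np.zeros(2)   # weight coefficients (the parameters being tuned)
b = 0.0           # bias
lr = 0.1          # learning rate (assumed value)

for epoch in range(100):
    for q in range(P.shape[1]):
        p, t = P[:, q], T[q]
        a = w @ p + b      # current output for this input
        e = t - a          # compare the output with the target
        w += lr * e * p    # adjust the weights toward the target
        b += lr * e
```

Each pass repeats the comparison over the whole set of input/output pairs; training stops when the error is considered small enough.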

Page 6: NN07


Page 7: NN07


Neural Network Applications

• The 1988 DARPA Neural Network Study [DARP88] lists various neural network applications, beginning in about 1984 with the adaptive channel equalizer. This device, an outstanding commercial success, is a single-neuron network used in long-distance telephone systems to stabilize voice signals. The DARPA report goes on to list other commercial applications, including a small word recognizer, a process monitor, a sonar classifier, and a risk analysis system.

• Neural networks have been applied in many other fields since the DARPA report was written. A list of some applications mentioned in the literature follows:

Page 8: NN07


• Aerospace. High-performance aircraft autopilots, flight path simulation, aircraft control systems, autopilot enhancements, aircraft component simulation, aircraft component fault detection;

• Automotive. Automobile automatic guidance system, warranty activity analysis;

• Banking. Check and other document reading, credit application evaluation;

• Defense. Weapon steering, target tracking, object discrimination, facial recognition, new kinds of sensors, sonar, radar and image signal processing including data compression, feature extraction and noise suppression, signal/image identification;

Page 9: NN07


• Electronics. Code sequence prediction, integrated circuit chip layout, process control, chip failure analysis, machine vision, voice synthesis, nonlinear modeling;

• Entertainment. Animation, special effects, market forecasting;

• Financial. Real estate appraisal, loan advisor, mortgage screening, corporate bond rating, credit line use analysis, portfolio trading program, corporate financial analysis, currency price prediction;

• Insurance. Policy application evaluation, product optimization;

• Oil and Gas. Exploration;

Page 10: NN07


• Manufacturing. Manufacturing process control, product design and analysis, process and machine diagnosis, real-time particle identification, visual quality inspection systems, beer testing, welding quality analysis, paper quality prediction, computer chip quality analysis, analysis of grinding operations, chemical product design analysis, machine maintenance analysis, project bidding, planning and management, dynamic modeling of chemical process systems;

• Medical. Breast cancer cell analysis, EEG and ECG analysis, prosthesis design, optimization of transplant times, hospital expense reduction, hospital quality improvement, emergency room test advisement;

• Robotics. Trajectory control, forklift robot, manipulator controllers, vision systems.

Page 11: NN07


• Speech. Speech recognition, speech compression, vowel classification, text to speech synthesis;

• Securities. Market analysis, automatic bond rating, stock trading advisory systems;

• Telecommunications. Image and data compression, automated information services, real-time translation of spoken language, customer payment processing systems;

• Transportation. Truck brake diagnosis systems, vehicle scheduling, routing systems;

Page 12: NN07


Summary

• The list of additional neural network applications, the money that has been invested in neural network software and hardware, and the depth and breadth of interest in these devices have been growing rapidly. It is hoped that the Matlab NN toolbox will be useful for neural network education and design within a broad field of neural network applications.

Page 13: NN07


Page 14: NN07


Page 15: NN07


Page 16: NN07


Page 17: NN07


Feedforward Network

Page 18: NN07


Hidden layers (1 and 2); output layer (3)

Page 19: NN07


Static network with concurrent input

Page 20: NN07


Dynamic network with sequential input

Page 21: NN07


FIR Adaptive Filter
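A minimal sketch of an FIR adaptive filter, assuming an LMS-style weight update (the tap count, learning rate, and signal names are illustrative):

```python
import numpy as np

def fir_adaptive_filter(x, d, n_taps=4, lr=0.01):
    """Adapt the FIR tap weights so the output y tracks the desired signal d."""
    w = np.zeros(n_taps)       # tap weights
    buf = np.zeros(n_taps)     # tapped delay line holding recent inputs
    y = np.zeros(len(x))
    for k in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[k]          # newest sample enters the delay line
        y[k] = w @ buf         # filter output
        e = d[k] - y[k]        # error against the desired signal
        w += lr * e * buf      # LMS update of the tap weights
    return y, w
```

The tapped delay line is what makes the network dynamic: the output at step k depends on the current and several past input samples.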

Page 22: NN07


Summary

• The inputs to a neuron include its bias and the sum of its weighted inputs (using the inner product). The output of a neuron depends on the neuron's inputs and on its transfer function. There are many useful transfer functions (a few are sketched after this list).

• A single neuron cannot do very much. However, several neurons can be combined into a layer or multiple layers that have great power.

• The architecture of a network consists of a description of how many layers a network has, the number of neurons in each layer, each layer's transfer function, and how the layers are connected to each other. The best architecture to use depends on the type of problem to be represented by the network.
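A minimal sketch of the neuron computation summarized above, with a few common transfer functions (the function names follow Matlab toolbox naming; the input values are illustrative):

```python
import numpy as np

def hardlim(n):  return np.where(n >= 0, 1.0, 0.0)  # hard limit: 0 or 1
def logsig(n):   return 1.0 / (1.0 + np.exp(-n))    # log-sigmoid
def purelin(n):  return n                            # linear

def neuron(p, w, b, f):
    """Output a = f(w . p + b): inner product of weights and inputs, plus bias."""
    return f(w @ p + b)

p = np.array([1.0, -2.0])   # input vector (illustrative)
w = np.array([0.5, 0.25])   # weights
b = 0.1                     # bias
for f in (hardlim, logsig, purelin):
    print(f.__name__, neuron(p, w, b, f))
```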

Page 23: NN07


• Except for purely linear networks, the more neurons in a hidden layer, the more powerful the network.

• If a linear mapping needs to be represented, linear neurons should be used. However, linear networks cannot perform any nonlinear computation. Use of a nonlinear transfer function makes a network capable of storing nonlinear relationships between input and output.

• A very simple problem may be represented by a single layer of neurons. However, single-layer networks cannot solve certain problems. Multiple feed-forward layers give a network greater freedom. For example, any reasonable function can be represented with a two-layer network: a sigmoid layer feeding a linear output layer.
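A minimal sketch of that two-layer arrangement, a sigmoid hidden layer feeding a linear output layer (layer sizes and the random, untrained weights are illustrative):

```python
import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 1)), rng.normal(size=(5, 1))  # hidden layer (sigmoid)
W2, b2 = rng.normal(size=(1, 5)), rng.normal(size=(1, 1))  # output layer (linear)

def net(p):
    a1 = logsig(W1 @ p + b1)   # sigmoid hidden layer
    return W2 @ a1 + b2        # linear output layer

print(net(np.array([[0.3]])))  # one scalar input, one scalar output
```

With enough hidden neurons and trained weights, this shape of network can approximate any reasonable function; training (for example, backpropagation, discussed later) is what sets W1, b1, W2, and b2.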

Page 24: NN07


• Networks with biases can represent relationships between inputs and outputs more easily than networks without biases. (For example, a neuron without a bias will always have a net input to the transfer function of zero when all of its inputs are zero. However, a neuron with a bias can learn to have any net transfer function input under the same conditions by learning an appropriate value for the bias.)

• Feed-forward networks cannot perform temporal computation. More complex networks with internal feedback paths are required for temporal behavior.

• If several input vectors are to be presented to a network, they may be presented sequentially or concurrently.

Page 25: NN07


• Many training methods exist. They differ in how the NN parameters are adjusted, hence in the number of computational operations and, accordingly, in the speed of convergence to the point at which the NN output is accepted as close enough to the target.

• There are also classes of NN, for example linear NNs or Hopfield NNs, which are designed directly and do not need training.

Page 26: NN07


We will define a learning rule as a procedure for modifying the weights and biases of a network. (This procedure may also be referred to as a training algorithm.) The learning rule is applied to train the network to perform some particular task. Learning rules in the Matlab toolbox fall into two broad categories: supervised learning and unsupervised learning.

• In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior:

{p1, t1}, {p2, t2}, ..., {pQ, tQ}

where pq is an input to the network and tq is the corresponding correct (target) output.

Page 27: NN07


As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets.

• In unsupervised learning, the weights and biases are modified in response to network inputs only. There are no target outputs available. Most of these algorithms perform clustering operations. They categorize the input patterns into a finite number of classes. This is especially useful in such applications as vector quantization.
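A minimal sketch of an unsupervised, clustering-style update (simple competitive learning; the number of classes and the learning rate are illustrative assumptions, not a specific toolbox algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.normal(size=(50, 2))   # input patterns only; no targets are given
W = rng.normal(size=(3, 2))    # one weight vector per class (3 classes assumed)
lr = 0.1

for epoch in range(20):
    for p in P:
        i = np.argmin(np.linalg.norm(W - p, axis=1))  # the closest class wins
        W[i] += lr * (p - W[i])   # move the winning weights toward the input

# Each pattern is then categorized by its nearest weight vector.
labels = [int(np.argmin(np.linalg.norm(W - p, axis=1))) for p in P]
```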

Page 28: NN07


Backpropagation Algorithm

There are many variations of the backpropagation algorithm. The simplest implementation of backpropagation learning updates the network weights and biases in the direction in which the performance function decreases most rapidly: the negative of the gradient. One iteration of this algorithm can be written

x_{k+1} = x_k - a_k g_k

where x_k is the vector of current weights and biases, g_k is the current gradient, and a_k is the learning rate.
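A minimal sketch of this update rule on a generic parameter vector (the performance function and its gradient are illustrative stand-ins for a real network's):

```python
import numpy as np

def perf(x):          # illustrative performance function (quadratic bowl)
    return 0.5 * np.sum((x - 1.0) ** 2)

def grad(x):          # its gradient
    return x - 1.0

x = np.zeros(3)       # x_k: vector of current weights and biases
a = 0.5               # a_k: learning rate (held constant here)
for k in range(20):
    g = grad(x)       # g_k: current gradient
    x = x - a * g     # x_{k+1} = x_k - a_k * g_k

print(perf(x))        # approaches the minimum at x = 1
```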

Page 29: NN07


There are two different ways in which this gradient descent algorithm can be implemented: incremental mode and batch mode. In the incremental mode, the gradient is computed and the weights are updated after each input is applied to the network. In the batch mode, all of the inputs are applied to the network before the weights are updated.
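A minimal sketch contrasting the two modes for a single linear neuron (data, learning rate, and the squared-error gradient are illustrative):

```python
import numpy as np

P = np.array([0.0, 0.5, 1.0])   # three 1-D input examples
T = np.array([0.0, 1.0, 2.0])   # targets
lr = 0.2

# Incremental mode: compute the gradient and update after each input.
w = 0.0
for epoch in range(10):
    for p, t in zip(P, T):
        e = t - w * p
        w += lr * e * p          # one update per example

# Batch mode: apply all inputs, then update once per pass.
wb = 0.0
for epoch in range(10):
    e = T - wb * P               # errors for the whole batch
    wb += lr * np.sum(e * P)     # single update from the summed gradient
```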

Page 30: NN07


Perceptron
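A minimal sketch of a perceptron trained with the classic perceptron rule on a linearly separable problem (the AND function; all values are illustrative):

```python
import numpy as np

def hardlim(n):
    return 1.0 if n >= 0 else 0.0   # hard limit transfer function

P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)   # AND: linearly separable

w = np.zeros(2)
b = 0.0
for epoch in range(20):
    for p, t in zip(P, T):
        a = hardlim(w @ p + b)   # output is 0 or 1
        e = t - a
        w += e * p               # perceptron rule: w <- w + e*p
        b += e                   # b <- b + e
```

Because the classes are linearly separable, this loop is guaranteed to converge in a finite number of steps.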

Page 31: NN07


Page 32: NN07


Page 33: NN07


Summary

• Perceptrons are useful as classifiers. They can classify linearly separable input vectors very well. Convergence is guaranteed in a finite number of steps, provided the perceptron can solve the problem.

• Single-layer perceptrons can solve problems only when the data are linearly separable. This is seldom the case. One solution to this difficulty is to use a pre-processing method that results in linearly separable vectors. Or you might use multiple perceptrons in multiple layers. Alternatively, you can use other kinds of networks, such as linear networks or backpropagation networks, which can classify nonlinearly separable input vectors.

Page 34: NN07


Perceptron networks have several limitations.

• First, the output values of a perceptron can take on only one of two values (0 or 1) due to the hard limit transfer function.

• Second, perceptrons can only classify linearly separable sets of vectors. If a straight line or a plane can be drawn to separate the input vectors into their correct categories, the input vectors are linearly separable. If the vectors are not linearly separable, learning will never reach a point where all vectors are classified properly.

• Note, however, that it has been proven that if the vectors are linearly separable, perceptrons trained adaptively will always find a solution in finite time.

Page 35: NN07


ADALINE (Adaptive Linear Neuron) networks

Page 36: NN07


MADALINE (Multiple Neuron Adaptive Filters)

Page 37: NN07


Summary

• ADALINEs may only learn linear relationships between input and output vectors. Thus ADALINEs cannot find solutions to some problems. However, even if a perfect solution does not exist, the ADALINE will minimize the sum of squared errors if the learning rate lr is sufficiently small. The network will find as close a solution as is possible given the linear nature of the network’s architecture. This property holds because the error surface of a linear network is a multi-dimensional parabola. Since parabolas have only one minimum, a gradient descent algorithm (such as the LMS rule) must produce a solution at that minimum.

• Multiple layers in a linear network do not result in a more powerful network so the single layer is not a limitation. However, linear networks can solve only linear problems.
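A minimal sketch illustrating that claim: LMS on a linear neuron settles at essentially the same minimum that a direct least-squares solution gives, since the parabolic error surface has a single minimum (data and learning rate are illustrative):

```python
import numpy as np

P = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]])  # inputs (rows)
T = np.array([0.1, 0.9, 1.2, 0.6])                              # targets

# LMS: gradient descent on the sum of squared errors.
w = np.zeros(2)
lr = 0.05                          # sufficiently small learning rate
for epoch in range(2000):
    for p, t in zip(P, T):
        e = t - w @ p
        w += lr * e * p

# Closed-form least-squares minimum of the same error surface.
w_star, *_ = np.linalg.lstsq(P, T, rcond=None)
print(w, w_star)                   # the two should nearly agree
```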

Page 38: NN07
