CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy Systems. Khurshid Ahmad, Professor of Computer Science, Department of Computer Science, Trinity College, Dublin-2, IRELAND. 21st November 2012. https://www.cs.tcd.ie/Khurshid.Ahmad/Teaching.html


Page 1: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy Systems

Khurshid Ahmad, Professor of Computer Science,
Department of Computer Science, Trinity College,
Dublin-2, IRELAND. 21st November 2012.

https://www.cs.tcd.ie/Khurshid.Ahmad/Teaching.html

Page 2: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

A fuzzy inference system can be shown to be functionally equivalent to a class of adaptive networks.

The burden of specifying the parameters of the fuzzy inference system can be transferred to an algorithm that attempts to learn the values of those parameters.

Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)

Page 3: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

For complex control systems, there is a wealth of observational a priori knowledge related to the behaviour of inputs and output(s).

The input space may be partitioned between, say, normal-behaviour-inducing and abnormal-behaviour-inducing stimuli. The corresponding output can be noted.

Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)

Page 4: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

Learn from the input-output data:

• Data mining;
• Machine Learning;
• Neural Networks;
• Genetic Algorithms;
• Hybrids ⇒ Neuro-Fuzzy systems

Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)

Soft Computing

Page 5: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

Learn from the input-output data:

• If a soft computing system is able to compute the input-output relationships, then it will LEARN to compute the relationships.

Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)

Page 6: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Learn from the input-output data

The key notion in learning is that of THRESHOLD

Threshold is an Old English word meaning the piece of timber or stone which lies below the bottom of a door, and has to be crossed in entering a house; the sill of a doorway; hence, the entrance to a house or building.

More technically, in contexts of wages and taxation, in which wage or tax increases become due or obligatory when some predetermined conditions are fulfilled (esp. above a specified point on a graduated scale). [..]

Page 7: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Learn from the input-output data

The key notion in learning is that of THRESHOLD

Threshold in many specialist domains refers to a lower limit.

(i) Psychology: esp. in phrase threshold of consciousness.
(ii) In Physiology and more widely:
(a) the limit below which a stimulus is not perceptible;
(b) the magnitude or intensity that must be exceeded for a certain reaction or phenomenon to occur.

Page 8: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Learn from the input-output data

The key notion in learning is that of THRESHOLD

Threshold in many specialist domains refers to a lower limit.

(iii) In Electronics:
(a) threshold device, element, etc.: a circuit element having one output and a number of inputs, each of which accepts a binary signal and multiplies it by some factor; the output is 0 or 1 depending on whether or not the sum of the resulting quantities is less than a certain threshold value;
(b) threshold function, a Boolean function that can be realized by such an element; threshold logic, switching (based on such elements).

Page 9: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Learn from the input-output data

The key notion in learning is that of THRESHOLD

Threshold in many specialist domains refers to a lower limit.

(iv) In Fuzzy Logic and Fuzzy Knowledge Bases, rules are fired if the aggregation of the antecedents' membership functions is non-zero. The threshold value here is any number greater than zero.

Page 10: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Learn from the input-output data

The key notion in learning is that of THRESHOLD

Threshold functions, defined on the induced local field $v = w_1 x + w_2 y$:

(1) Threshold function:
$\varphi(v) = 1$ if $v \ge 0$; $\varphi(v) = 0$ if $v < 0$

(2) Piecewise-linear function:
$\varphi(v) = 1$ if $v \ge +\frac{1}{2}$; $\varphi(v) = v$ if $+\frac{1}{2} > v > -\frac{1}{2}$; $\varphi(v) = 0$ if $v \le -\frac{1}{2}$

(3) Sigmoid function:
$\varphi(v) = \dfrac{1}{1 + \exp(-av)}$
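The three threshold functions above can be written out directly; a minimal Python sketch (the sigmoid slope parameter a is taken as an argument):

```python
import math

def threshold(v):
    # (1) Threshold (Heaviside) function: 1 if v >= 0, else 0
    return 1 if v >= 0 else 0

def piecewise_linear(v):
    # (2) Piecewise-linear function, saturating at +/- 1/2
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v

def sigmoid(v, a=1.0):
    # (3) Sigmoid function with slope parameter a
    return 1.0 / (1.0 + math.exp(-a * v))
```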

Page 11: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

There are two fuzzy rules:

R1: IF x is A1 and y is B1 THEN f1 = p1x + q1y + r1

R2: IF x is A2 and y is B2 THEN f2 = p2x + q2y + r2
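A minimal sketch of how the two rule consequents are computed and combined; the weighted-average combination f = (w1 f1 + w2 f2)/(w1 + w2) appears later in the case study, and the firing strengths w1, w2 are assumed to have been computed already:

```python
def rule_consequent(p, q, r, x, y):
    # First-order Sugeno consequent: f = p*x + q*y + r
    return p * x + q * y + r

def sugeno_output(w1, w2, f1, f2):
    # Weighted average of the two rule consequents:
    # f = (w1*f1 + w2*f2) / (w1 + w2)
    return (w1 * f1 + w2 * f2) / (w1 + w2)
```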

Page 12: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

[Figure: the five-layer adaptive (ANFIS) network. Layer 1: membership-function nodes A1, A2 (on input x) and B1, B2 (on input y); Layer 2: product nodes outputting the firing strengths w1, w2; Layer 3: normalization nodes (N) outputting w̄1, w̄2; Layer 4: consequent nodes outputting w̄1 f1, w̄2 f2; Layer 5: a summation node outputting the overall output f.]

Page 13: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

[Figure: plots of the membership functions A1 (over input x) and B1 (over input y); membership grade from 0 to 1 on the vertical axis, input values from 0 to 12 on the horizontal axis, with the firing strength w1 marked.]

f1 = p1x + q1y + r1

f = (w1 f1 + w2 f2) / (w1 + w2)

Page 14: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

The operation of a fuzzy system depends on the execution of FOUR major tasks:

Fuzzification,
Inference,
Composition,
(Defuzzification).

The different layers in an adaptive network perform one or more of these tasks.

Page 15: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

[Figure: the five-layer ANFIS network, as on Page 12: Layer 1 membership nodes A1, A2, B1, B2; Layer 2 product nodes; Layer 3 normalization nodes; Layer 4 consequent nodes; Layer 5 summation node.]

Page 16: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

One can argue that the first layer, which receives input from the external world, actually performs fuzzification.

Recall that fuzzification involves the choice of variables (fuzzy input and output variables and defuzzified output variable(s)), the definition of membership functions for the input variables, and the description of fuzzy rules. The membership functions defined on the input variables are applied to their actual values to determine the degree of truth for each rule premise. The degree of truth for a rule's premise is sometimes referred to as its α (alpha) value. If a rule's premise has a non-zero degree of truth, that is, if the rule applies at all, then the rule is said to fire.

Page 17: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

LAYER 1: Every node i in this layer is an adaptive node with a node function

$O_{1,i} = \mu_{A_i}(x)$ for $i = 1, 2$

$O_{1,i} = \mu_{B_{i-2}}(y)$ for $i = 3, 4$

where, for example, the generalized bell membership function is

$\mu_{A_i}(x) = \dfrac{1}{1 + \left[\left(\dfrac{x - c_i}{a_i}\right)^2\right]^{b_i}}$

PREMISE PARAMETER SET := {a_i, b_i, c_i}
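The generalized bell function above can be sketched directly; the parameter values used in checking it are illustrative only:

```python
def bell_mf(x, a, b, c):
    # Layer 1 generalized bell membership function:
    # mu(x) = 1 / (1 + ((x - c)/a)^2)^b
    # a: width, b: slope, c: centre (the premise parameters)
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)
```

At the centre x = c the membership grade is 1; at x = c ± a it is 0.5, whatever the value of b.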

Page 18: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

The operation of the 2nd and 3rd layers in an adaptive network may be construed as equivalent to performing inference.

In lectures on knowledge representation we defined inference as follows: the truth-value for the premise of each rule is computed and the conclusion applied to each part of the rule. This results in one fuzzy subset assigned to each output variable for each rule. MIN and PRODUCT are two inference methods. In MIN inferencing the output membership function is clipped off at a height corresponding to the computed degree of truth of a rule's premise. This corresponds to the traditional interpretation of the fuzzy logic AND operation. In PRODUCT inferencing the output membership function is scaled by the premise's computed degree of truth.
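MIN and PRODUCT inferencing can be illustrated on an output membership function sampled on a grid; a minimal sketch (the function names are illustrative, and alpha is the premise's computed degree of truth):

```python
def min_inference(alpha, output_mf):
    # MIN inferencing: clip the output membership function
    # at the premise's degree of truth alpha
    return [min(alpha, mu) for mu in output_mf]

def product_inference(alpha, output_mf):
    # PRODUCT inferencing: scale the output membership function by alpha
    return [alpha * mu for mu in output_mf]
```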

Page 19: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

LAYER 2: Every node in this layer is a fixed node (∏); the node outputs the product of all incoming signals:

$O_{2,i} = w_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y)$ for $i = 1, 2$

Each node in this layer represents the firing strength of a rule; any fuzzy AND operator can be used.

Page 20: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

LAYER 3: Every node in this layer is a fixed node (N); the ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:

$O_{3,i} = \bar{w}_i = \dfrac{w_i}{w_1 + w_2}$ for $i = 1, 2$

Outputs of layer 3 are called NORMALIZED FIRING STRENGTHS.

Page 21: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

The operation of the (3rd &) 4th layer(s) involves composition. You may remember our definition of composition: all the fuzzy subsets assigned to each output variable are combined together to form a single fuzzy subset for each output variable. MAX and SUM are two composition rules. In MAX composition, the combined fuzzy subset is constructed by taking the pointwise maximum over all the fuzzy subsets assigned to the output variable by the inference rule. In SUM composition, the combined output fuzzy subset is constructed by taking the pointwise sum over all the fuzzy subsets assigned to the output variable by their inference rule. (Note that this can result in truth values greater than 1.)

Page 22: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

LAYER 4: Every node in this layer is an adaptive node with a node function:

$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$

where $\bar{w}_i$ is the normalized firing strength from layer 3. The parameter set {p_i, q_i, r_i} is the so-called CONSEQUENT PARAMETER SET.

Page 23: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

And, finally, the output layer of an adaptive network performs the equivalent of defuzzification.

Defuzzification was defined as the process whereby the value from the composition stage is converted to a single number, or crisp value. Two popular defuzzification techniques are the CENTROID and MAXIMUM techniques. The CENTROID technique relies on using the centre of gravity of the membership function to calculate the crisp value of the output variable. The MAXIMUM techniques, and there are a number of them, broadly speaking, use one of the variable values at which the fuzzy subset has its maximum truth value to compute the crisp value.
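Both techniques can be sketched on a membership function sampled at points xs with grades mus (a minimal illustration; the function names are not from the slides):

```python
def centroid_defuzzify(xs, mus):
    # CENTROID technique: centre of gravity of the sampled
    # membership function
    num = sum(x * mu for x, mu in zip(xs, mus))
    den = sum(mus)
    return num / den

def max_defuzzify(xs, mus):
    # One MAXIMUM technique: the (first) value of the variable at which
    # the fuzzy subset attains its maximum truth value
    return xs[mus.index(max(mus))]
```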

Page 24: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).

LAYER 5: The single node in this layer is a fixed node labelled ∑, which computes the overall output as the summation of all incoming signals:

$O_{5,1} = \sum_i \bar{w}_i f_i = \dfrac{\sum_i w_i f_i}{\sum_i w_i}$
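The five layers can be wired together into one forward pass; a minimal sketch assuming generalized bell membership functions in Layer 1 and the product T-norm in Layer 2 (the parameter values supplied by the caller are illustrative):

```python
def bell(x, a, b, c):
    # Layer 1: generalized bell membership function
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)

def anfis_forward(x, y, premise, consequent):
    # premise: [(a, b, c) for A1, A2, B1, B2]
    # consequent: [(p, q, r) for rule 1, rule 2]
    A1, A2, B1, B2 = premise
    # Layer 1: membership grades
    muA = [bell(x, *A1), bell(x, *A2)]
    muB = [bell(y, *B1), bell(y, *B2)]
    # Layer 2: firing strengths (product T-norm)
    w = [muA[0] * muB[0], muA[1] * muB[1]]
    # Layer 3: normalized firing strengths
    total = sum(w)
    wbar = [wi / total for wi in w]
    # Layer 4: weighted consequents w̄i * (p*x + q*y + r)
    f = [p * x + q * y + r for (p, q, r) in consequent]
    # Layer 5: overall output
    return sum(wb * fi for wb, fi in zip(wbar, f))
```

Because Layer 5 computes a normalized weighted average, the output always lies between the smallest and largest rule consequent.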

Page 25: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: A case study

The network below is an adaptive network that is functionally equivalent to a Takagi-Sugeno model.

[Figure: the five-layer ANFIS network, as on Page 12.]

Page 26: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models


Page 27: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

The goal of a number of statistical investigations is to predict the variation of a dependent variable on one or more independent variables using a mathematical equation. The dependence can be linear or non-linear.

Consider the linear dependence of a variable y on an independent variable x, sometimes with the proviso that the independent variable can be observed without any (observational) error. The dependent variable y may have different values for the SAME x.

One can argue that y is essentially a random variable and its distribution depends on x; typically, the quest is to find the relationship between the independent variable and the MEAN of the dependent variable y – the regression curve of y on x.

Page 28: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

Assume that the dependence of y on x is linear ⇒ for any given x, the MEAN of the distribution of y is given as

$\hat{y} = \alpha + \beta x$

Statisticians remind us that an observed y will differ from the mean ŷ by the value of a random variable, say, ε:

$y = \alpha + \beta x + \varepsilon$


Page 30: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

Assume that the dependence of y on x is linear ⇒ for any given x, the MEAN of the distribution of y is given as

$\hat{y} = \alpha + \beta x$

Statisticians remind us that an observed y will differ from the mean ŷ by the value of a random variable, say, ε ⇒ the value may be related to the possible error of measurement and to other variables that may have an influence on y:

$y = \alpha + \beta x + \varepsilon$

Page 31: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

What we have to do now is to use an OBSERVED data set containing the tuples {xi, yi} for a number of observations, i = 1, …, N, to estimate the values of α and β.

Given that we have assumed that the relation between x and y is linear, we have to find a straight line that may provide a fit. There could be many straight lines that can be fitted to the data set, and we have to choose the best one. We begin by predicting the value of y using estimates of α and β, which we will refer to as a and b:

$\hat{y} = a + bx$

Page 32: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

$\hat{y} = a + bx$

The error in predicting the value of y, given a corresponding value of x, will be denoted as the error $e_i$:

$e_i = y_i - \hat{y}_i$

Page 33: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

$e_i = y_i - \hat{y}_i$

Typically, instead of computing each $e_i$, a difficult task, we could try to reduce the sum of the errors associated with the N observations to zero. This is rather unsuitable, as one can find totally unsuitable lines that achieve it; instead, one tends to minimize the value of the sum of the squares of $e_i$:

$\sum_i e_i^2 = \sum_i \left[ y_i - (a + b x_i) \right]^2$

Page 34: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

$\sum_i e_i^2 = \sum_i \left[ y_i - (a + b x_i) \right]^2$

Essentially, we equate the (partial) derivatives of the above expression with respect to a and b to zero:

$\sum_i 2\,[y_i - (a + b x_i)]\,(-1) = 0$

$\sum_i 2\,[y_i - (a + b x_i)]\,(-x_i) = 0$

leading to the normal equations

$\sum_i y_i = n a + b \sum_i x_i$

$\sum_i x_i y_i = a \sum_i x_i + b \sum_i x_i^2$
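The normal equations can be solved directly for the intercept a and the slope b; a minimal sketch:

```python
def fit_line(xs, ys):
    # Solve the normal equations
    #   sum(y)   = n*a + b*sum(x)
    #   sum(x*y) = a*sum(x) + b*sum(x^2)
    # for the least-squares intercept a and slope b
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b
```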

Page 35: CS4001 Neuro-Fuzzy Systems Lect6

Neuro Fuzzy Models

• 'Statisticians generally have good mathematical backgrounds with which to analyse decision-making algorithms theoretically. […] However, they often pay little or no attention to the applicability of their own theoretical results' (Raudys 2001:xi).

• Neural network researchers 'advocate that one should not make assumptions concerning the multivariate densities assumed for pattern classes'. Rather, they argue that 'one should assume only the structure of decision making rules', and hence the emphasis on the minimization of classification errors, for instance.

Raudys, Šarûunas. (2001). Statistical and Neural Classifiers: An integrated approach to design. London: Springer-Verlag

Page 36: CS4001 Neuro-Fuzzy Systems Lect6

Neuro Fuzzy Models

• In neural networks there are algorithms that have a theoretical justification, and some have no theoretical elucidation.

• Given that there are strengths and weaknesses in both statistical and other soft computing algorithms (e.g. neural nets, fuzzy logic), one should integrate the two classifier design strategies (ibid).

Raudys, Šarûunas. (2001). Statistical and Neural Classifiers: An integrated approach to design. London: Springer-Verlag

Page 37: CS4001 Neuro-Fuzzy Systems Lect6

Neuro Fuzzy Models

• The 'remarkable qualities' of neural networks: the dynamics of a single layer perceptron progresses from the simplest algorithms to the most complex algorithms:

• Initial Training ⇒ each pattern class is characterized by its sample mean vector ⇒ the neuron behaves like a Euclidean distance classifier (EDC);
• Further Training ⇒ the neuron begins to evaluate correlations and variances of features ⇒ the neuron behaves like the standard linear Fisher classifier;
• More Training ⇒ the neuron minimizes the number of incorrectly identified training patterns ⇒ the neuron behaves like a support vector classifier.

• Statisticians and engineers usually design decision-making algorithms from experimental data by progressing from simple algorithms to more complex ones.

Raudys, Šarûunas. (2001). Statistical and Neural Classifiers: An integrated approach to design. London: Springer-Verlag

Page 38: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models

Adaptive Networks

A network typically comprises a set of nodes connected by directed links.

Each node performs a static node function on its incoming signals to generate a single node output.

Each link specifies the direction of signal flow from one node to another.

An adaptive network is a network structure whose overall input-output behaviour is determined by a collection of modifiable parameters.

Page 39: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Directed graphs

• The nodes of a directed graph are called processing elements.
• The links of the graph are called connections.
• Each connection functions as an instantaneous unidirectional signal-conduction path.
• Each processing element can receive any number of incoming connections, sometimes called input connections.
• Each processing element can have any number of outgoing connections, but the signals in all of these must be the same.

Page 40: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Directed graphs

[Figure: a processing unit with input connections, a single output connection, and a fan-out of the output signal.]

• The nodes of a directed graph are called processing elements.
• The links of the graph are called connections.
• Each connection functions as an instantaneous unidirectional signal-conduction path.
• Each processing element can receive any number of incoming connections, sometimes called input connections.
• Each processing element can have any number of outgoing connections, but the signals in all of these must be the same.

Page 41: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Directed graphs

In effect, each processing element has a single output connection that can branch or fan out into copies to form multiple output connections (sometimes called collaterals), each of which carries the same identical signal (the processing element's output signal).

Processing elements can have local memory.

Each processing element possesses a transfer function which can use (and alter) local memory, can use input signals, and which produces the processing element's output signal.

Page 42: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Directed graphs

The only inputs allowed to the transfer function are the values stored in the processing element's local memory and the current values of the input signals in the connections received by the processing element.

The only outputs allowed from the transfer function are values to be stored in the processing element's local memory and the processing element's output signal.

Page 43: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Directed graphs

Transfer functions can operate continuously or episodically.

If they operate episodically, there must be an input called "activate" that causes the processing element's transfer function to operate on the current input signals and local memory values and to produce an updated output signal (and possibly to modify local memory values). The "activate" input arrives via a connection from a scheduling processing element that is part of the network.

Continuous processing elements are always operating.

Page 44: CS4001 Neuro-Fuzzy Systems Lect6

Neuro-fuzzy models: Directed graphs

[Figure: a processing unit with input connections, a single output connection, and a fan-out.]

Page 45: CS4001 Neuro-Fuzzy Systems Lect6

Real Neuroscience

Brains compute. This means that they process information, creating abstract representations of physical entities and performing operations on this information in order to execute tasks. One of the main goals of computational neuroscience is to describe these transformations as a sequence of simple elementary steps organized in an algorithmic way. The mechanistic substrate for these computations has long been debated. Traditionally, relatively simple computational properties have been attributed to the individual neuron, with the complex computations that are the hallmark of brains being performed by the network of these simple elements.

London, Michael and Michael Häusser (2005). Dendritic Computation. Annual Review of Neuroscience. Vol. 28, pp 503–32

Page 46: CS4001 Neuro-Fuzzy Systems Lect6

DEFINITIONS: Artificial Neural Networks

Artificial neural networks emulate threshold behaviour, simulate co-operative phenomena by a network of 'simple' switches, and are used in a variety of applications, like banking, currency trading, robotics, and experimental and animal psychology studies.

These information systems (neural networks, or neuro-computing systems, as they are popularly known) can be simulated by solving first-order difference or differential equations.

Page 47: CS4001 Neuro-Fuzzy Systems Lect6

What can computers do? Artificial Neural Networks

In a restricted sense, artificial neurons are simple emulations of biological neurons: the artificial neuron can, in principle, receive its input from all other artificial neurons in the ANN; simple operations are performed on the input data; and the recipient neuron can, in principle, pass its output on to all other neurons.

Intelligent behaviour can be simulated through computation in massively parallel networks of simple processors that store all their long-term knowledge in the connection strengths.

Page 48: CS4001 Neuro-Fuzzy Systems Lect6

DEFINITIONS: Neurons & Appendages

A neuron is a cell with appendages; every cell has a nucleus, and one set of appendages brings in inputs – the dendrites – and another set helps to output signals generated by the cell.

[Figure: 'The Real McCoy' – a biological neuron with the axon, cell body, nucleus and dendrites labelled.]

Page 49: CS4001 Neuro-Fuzzy Systems Lect6

DEFINITIONS: Neurons & Appendages

The human brain is mainly composed of neurons: specialized cells that exist to transfer information rapidly from one part of an animal's body to another.

[Figure: a neuron with the soma, nucleus, dendrites and axon terminals labelled.]

SOURCE: http://en.wikipedia.org/wiki/Neurons

Page 50: CS4001 Neuro-Fuzzy Systems Lect6

DEFINITIONS: Neurons & Appendages

This communication is achieved by the transmission (and reception) of electrical impulses (and chemicals) from neurons and other cells of the animal. Like other cells, neurons have a cell body that contains a nucleus enshrouded in a membrane, which has a double-layered ultrastructure with numerous pores.

[Figure: a neuron with the soma, nucleus, dendrites and axon terminals labelled.]

SOURCE: http://en.wikipedia.org/wiki/Neurons

Page 51: CS4001 Neuro-Fuzzy Systems Lect6

DEFINITIONS: Neurons & Appendages

Neurons have a variety of appendages, referred to as cytoplasmic processes, known as neurites, which end in close apposition to other cells. In higher animals, neurites are of two varieties: axons are processes generally of uniform diameter that conduct impulses away from the cell body; dendrites are short-branched processes that conduct impulses towards the cell body.

The ends of the neurites, i.e. axons and dendrites, are called synaptic terminals, and the cell-to-cell contacts they make are known as synapses.

[Figure: a neuron with the soma, nucleus, dendrites and axon terminals labelled.]

SOURCE: http://en.wikipedia.org/wiki/Neurons

Page 52: CS4001 Neuro-Fuzzy Systems Lect6

DEFINITIONS: The fan-ins and fan-outs

[Figure: a schematic neuron with a fan-in of 10^4 inputs, summation at the cell body, an asynchronous firing rate of c. 200 per sec, signal conduction at 1–100 metres per sec, and a fan-out of 10^4.]

10^10 neurons with 10^4 connections each and an average of 10 spikes per second = 10^15 adds/sec. This is a lower bound on the equivalent computational power of the brain.

Page 53: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks

Input signals to a neural network from outside the network arrive via connections that originate in the outside world.

Outputs from the network to the outside world are connections that leave the network.

Page 54: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks

• Artificial Neural Networks (ANN) are computational systems, either hardware or software, which mimic animate neural systems comprising biological (real) neurons.
• An ANN is architecturally similar to a biological system in that the ANN also uses a number of simple, interconnected artificial neurons.

Page 55: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks

[Figure: Neural Networks & Neurosciences – observed biological processes (data) and theory (statistical learning theory & information theory) both inform biologically plausible mechanisms for neural processing & learning (biological neural network models).]

http://en.wikipedia.org/wiki/Neural_network#Neural_networks_and_neuroscience

Page 56: CS4001 Neuro-Fuzzy Systems Lect6

Brain – The Processor!

http://www.cs.duke.edu/brd/Teaching/Previous/AI/pix/noteasy1.gif


Page 58: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks

Neural Networks 'learn' by adapting in accordance with a training regimen: The network is subjected to particular information environments on a particular schedule to achieve the desired end-result.


There are three major types of training regimens or learning paradigms:

SUPERVISED;
UN-SUPERVISED;
REINFORCEMENT or GRADED.

Page 59: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Neurons & Appendages

A neuron is a cell with appendages; every cell has a nucleus, and one set of appendages brings in inputs – the dendrites – and another set helps to output signals generated by the cell.

[Figure: a biological neuron with the axon, cell body, nucleus and dendrites labelled.]

Page 60: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: The fan-ins and fan-outs

[Figure: a schematic neuron with a fan-in of 10^4, summation, an asynchronous firing rate of c. 200 per sec, signal conduction at 1–100 metres per sec, and a fan-out of 10^4.]

10^10 neurons with 10^4 connections each and an average of 10 spikes per second = 10^15 adds/sec. This is a lower bound on the equivalent computational power of the brain.

Page 61: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Biological and Artificial NN's

Entity           | Biological Neural Networks                 | Artificial Neural Networks
Processing Units | Neurons                                    | Network Nodes
Input            | Dendrites                                  | Network Arcs
Output           | Axons                                      | Network Arcs
Inter-linkage    | Synaptic Contact (Chemical and Electrical) | Node to Node via Arcs
Connectivity     | Plastic Connections                        | Weighted Connections Matrix

Page 62: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Biological and Artificial NN's

http://brainmaps.org/index.php?p=brain-connectivity-maps-imagemap

Page 63: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Biological and Artificial NN's

http://brainmaps.org/index.php?p=brain-connectivity-maps-imagemap

413 major areas in the animal brain; the areas are connected to each other, some more connected than others.

Page 64: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: An operational view of Artificial NN's

[Figure: a schematic for an 'electronic' neuron – input signals x1…x4 are weighted by wk1…wk4 and, together with a bias bk, combined at a summing junction Σ; an activation function then produces the output signal yk.]

Page 65: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: An operational view of Artificial NN's

A neural network comprises:

A set of processing units
A state of activation
An output function for each unit
A pattern of connectivity among units
A propagation rule for propagating patterns of activities through the network
An activation rule for combining the inputs impinging on a unit with the current state of that unit to produce a new level of activation for the unit
A learning rule whereby patterns of connectivity are modified by experience
An environment within which the system must operate

Page 66: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt’s Perceptron

Logic Gate: A digital circuit that implements an elementary logical operation. It has one or more inputs but ONLY one output. The conditions applied to the input(s) determine the voltage levels at the output. The output, typically, has two values: '0' or '1'.

Digital Circuit: A circuit that responds to discrete values of input (voltage) and produces discrete values of output (voltage).

Binary Logic Circuits: Extensively used in computers to carry out instructions and arithmetical processes. Any logical procedure may be effected by a suitable combination of the gates. Binary circuits are typically formed from discrete components like integrated circuits.

Page 67: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt’s Perceptron

Logic Circuits: Designed to perform a particular logical function based on AND, OR (either), and NOR (neither). Those circuits that operate between two discrete (input) voltage levels, high & low, are described as binary logic circuits.

Logic Element: A small part of a logic circuit, typically a logic gate, that may be represented by the mathematical operators in symbolic logic.

Page 68: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

Gate | Input(s)      | Output
AND  | Two (or more) | High if and only if both (or all) inputs are high
NOT  | One           | High if input low and vice versa
OR   | Two (or more) | High if any one (or more) inputs are high

Page 69: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

The operation of an AND gate:

Input 1 | Input 2 | Output
   0    |    0    |   0
   0    |    1    |   0
   1    |    0    |   0
   1    |    1    |   1

AND(x, y) = minimum_value(x, y);
AND(1, 0) = minimum_value(1, 0) = 0;
AND(1, 1) = minimum_value(1, 1) = 1
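The identification of AND with the minimum operator is what later lets fuzzy logic generalise the gate from crisp {0, 1} values to real-valued truth degrees. A minimal sketch in Python (the function name is illustrative, not from the lecture):

```python
def AND(x, y):
    """Crisp AND as the minimum of the two truth values."""
    return min(x, y)

# Exhaustive check against the truth table above.
for x in (0, 1):
    for y in (0, 1):
        print(x, y, AND(x, y))

# The same operator applied to fuzzy truth degrees in [0, 1]:
print(AND(0.7, 0.4))  # 0.4, the min t-norm used as fuzzy AND
```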

Page 70: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

A single-layer perceptron can perform a number of logical operations which are performed by a number of computational devices. The perceptron below performs the AND operation. It is hard-wired because the weights are predetermined and not learnt.

A hard-wired perceptron:
  inputs x1, x2; weights w1 = +1, w2 = +1; bias θ = -1.5
  Σ = w1·x1 + w2·x2 + θ
  y = 1 if Σ ≥ 0; y = 0 if Σ < 0
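The hard-wired AND unit described above can be rendered directly in code. This is an illustrative sketch, not part of the original lecture; the fixed weights and bias are taken from the slide:

```python
def hardwired_and(x1, x2, w1=1.0, w2=1.0, theta=-1.5):
    """Hard-wired perceptron: fixed weights, threshold activation."""
    s = w1 * x1 + w2 * x2 + theta   # weighted sum plus bias
    return 1 if s >= 0 else 0       # y = 1 if s >= 0, else y = 0

# Check the unit against the AND truth table.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, hardwired_and(x1, x2))
```

Only the input (1, 1) pushes the sum (1 + 1 - 1.5 = 0.5) over the threshold, so the unit outputs 1 exactly where the AND truth table does.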

Page 71: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

A single-layer perceptron can perform a number of logical operations which are performed by a number of computational devices. The learning perceptron below performs the AND operation.

A learning perceptron. An algorithm: train the network for a number of epochs:
(1) Set the initial weights w1 and w2 and the bias θ to random numbers;
(2) Compute the weighted sum: x1*w1 + x2*w2 + θ;
(3) Calculate the output using a delta function: y(i) = delta(x1*w1 + x2*w2 + θ), where delta(x) = 1 if x is greater than zero and delta(x) = 0 if x is less than or equal to zero;
(4) Compute the difference between the desired output and the actual output: e(i) = ydesired - y(i);
(5) If the errors during a training epoch are all zero then stop; otherwise update the weights: wj(i+1) = wj(i) + α*xj*e(i), j = 1, 2.
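Steps (1) to (5) can be sketched in Python. This is a minimal sketch using the settings of the worked example on the following slides (α = 0.1, θ = -0.1, initial weights 0.3 and -0.1); the symbols follow the slides, but the code itself is not from the lecture:

```python
def delta(x):
    """Step (3): threshold activation, 1 if x > 0 else 0."""
    return 1 if x > 0 else 0

def train_and(w=(0.3, -0.1), theta=-0.1, alpha=0.1, max_epochs=100):
    """Perceptron learning of AND, following steps (1)-(5) above."""
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w1, w2 = w                                   # step (1): fixed here, not random
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for (x1, x2), y_desired in data:
            y = delta(x1 * w1 + x2 * w2 + theta)  # steps (2)-(3)
            e = y_desired - y                     # step (4)
            if e != 0:                            # step (5): update on error
                w1 += alpha * x1 * e
                w2 += alpha * x2 * e
                errors += 1
        if errors == 0:                           # all errors zero: stop
            return epoch, w1, w2
    return max_epochs, w1, w2

epoch, w1, w2 = train_and()
# Reproduces the worked tables: stops in epoch 5 with w1 ≈ 0.1, w2 ≈ 0.1.
print(epoch, round(w1, 3), round(w2, 3))
```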

Page 72: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

Training the perceptron with α = 0.1 and θ = -0.1:

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
  1   | 0  | 0  |    0     |    0.3     |   -0.1     |       0       |   0   |   0.3    |  -0.1
      | 0  | 1  |    0     |    0.3     |   -0.1     |       0       |   0   |   0.3    |  -0.1
      | 1  | 0  |    0     |    0.3     |   -0.1     |       1       |  -1   |   0.2    |  -0.1
      | 1  | 1  |    1     |    0.2     |   -0.1     |       0       |   1   |   0.3    |   0.0

Page 73: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
  2   | 0  | 0  |    0     |    0.3     |    0.0     |       0       |   0   |   0.3    |   0.0
      | 0  | 1  |    0     |    0.3     |    0.0     |       0       |   0   |   0.3    |   0.0
      | 1  | 0  |    0     |    0.3     |    0.0     |       1       |  -1   |   0.2    |   0.0
      | 1  | 1  |    1     |    0.2     |    0.0     |       1       |   0   |   0.2    |   0.0

Page 74: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
  3   | 0  | 0  |    0     |    0.2     |    0.0     |       0       |   0   |   0.2    |   0.0
      | 0  | 1  |    0     |    0.2     |    0.0     |       0       |   0   |   0.2    |   0.0
      | 1  | 0  |    0     |    0.2     |    0.0     |       1       |  -1   |   0.1    |   0.0
      | 1  | 1  |    1     |    0.1     |    0.0     |       0       |   1   |   0.2    |   0.1

Page 75: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
  4   | 0  | 0  |    0     |    0.2     |    0.1     |       0       |   0   |   0.2    |   0.1
      | 0  | 1  |    0     |    0.2     |    0.1     |       0       |   0   |   0.2    |   0.1
      | 1  | 0  |    0     |    0.2     |    0.1     |       1       |  -1   |   0.1    |   0.1
      | 1  | 1  |    1     |    0.1     |    0.1     |       1       |   0   |   0.1    |   0.1

Page 76: CS4001 Neuro-Fuzzy Systems Lect6

Notes on Artificial Neural Networks: Rosenblatt's Perceptron

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
  5   | 0  | 0  |    0     |    0.1     |    0.1     |       0       |   0   |   0.1    |   0.1
      | 0  | 1  |    0     |    0.1     |    0.1     |       0       |   0   |   0.1    |   0.1
      | 1  | 0  |    0     |    0.1     |    0.1     |       0       |   0   |   0.1    |   0.1
      | 1  | 1  |    1     |    0.1     |    0.1     |       1       |   0   |   0.1    |   0.1
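In epoch 5 every error is zero, so training stops. A quick check (illustrative code, not from the lecture) that the learnt weights w1 = 0.1, w2 = 0.1 with θ = -0.1 reproduce the AND truth table under the delta activation of step (3):

```python
def learnt_and(x1, x2, w1=0.1, w2=0.1, theta=-0.1):
    """Perceptron with the weights learnt in epochs 1-5."""
    s = x1 * w1 + x2 * w2 + theta
    return 1 if s > 0 else 0   # delta activation: 1 if s > 0, else 0

# Only (1, 1) gives a positive sum (0.1 + 0.1 - 0.1 = 0.1), so
# only that input produces output 1, as the AND gate requires.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, learnt_and(x1, x2))
```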

Page 77: CS4001 Neuro-Fuzzy Systems Lect6

Neuro Fuzzy Models

• The 'remarkable qualities' of neural networks: the dynamics of a single-layer perceptron progresses from the simplest algorithms to the most complex:
  • Initial training → each pattern class is characterized by its sample mean vector → the neuron behaves like a Euclidean distance classifier (EDC);
  • Further training → the neuron begins to evaluate correlations and variances of features → the neuron behaves like a standard linear Fisher classifier;
  • More training → the neuron minimizes the number of incorrectly identified training patterns → the neuron behaves like a support vector classifier.
• Statisticians and engineers usually design decision-making algorithms from experimental data by progressing from simple algorithms to more complex ones.

Raudys, Šarūnas. (2001). Statistical and Neural Classifiers: An Integrated Approach to Design. London: Springer-Verlag