Wissenschaftliches Arbeiten - Auswertung
Slide 1
Artificial Neural Networks
Uwe Lämmel
Wismar Business School
www.wi.hs-wismar.de/~laemmel
Uwe.Laemmel@hs-wismar.de
Slide 3
Literature & Software
– Robert Callan: The Essence of Neural Networks, Pearson Education, 2002.
– JavaNNS, based on SNNS (Stuttgarter Neuronale Netze Simulator): http://www.ra.cs.uni-tuebingen.de/software/JavaNNS/
Slide 4
Prerequisites
– NO algorithmic solution available, or the algorithmic solution is too time-consuming
– NO knowledge-based solution, but LOTS of experience (data)
→ Try a neural network
Slide 5
Content
– Idea
– An artificial Neuron – Neural Network
– Supervised Learning – feed-forward networks
– Competitive Learning – Self-Organising Map
– Applications
Slide 6
Two different types of knowledge processing

Logic conclusion: sequential, we are aware of it, precise – symbol processing, rule processing, engineering, "traditional" AI
Perception, recognition: parallel, we are not aware of it, fuzzy – cognitively oriented, Neural Networks, Connectionism
Slide 7
Idea

A human being learns by example, "learning by doing":
– seeing (perception), walking, speaking, ...
Can a machine do the same?

A human being uses his brain. A brain consists of millions of single cells. A cell is connected with ten thousands of other cells.
Is it possible to simulate a similar structure on a computer?
Slide 8
Idea

Artificial Neural Network:
– information processing similar to the processes in a mammal brain
– highly parallel systems
– able to learn
– great number of simple cells

Is it useful to copy nature? – wheel, aeroplane, ...
Slide 9
Idea

We need:
– software neurons
– software connections between neurons
– software learning algorithms

An artificial neural network functions in a similar way to a natural neural network.
Slide 10
A biological Neuron

[diagram: cell body with cell nucleus, axon (neurite), dendrites, synapses]

• Dendrites (input): receive the activations of other cells
• Axon (output): forwards the activation (from 1 mm up to 1 m long)
• Synapse: transfers the activation, e.g. to the dendrites of other neurons; a cell has about 1,000 to 10,000 connections to other cells
• Cell nucleus (processing): evaluation of the activation
Slide 11
Abstraction

– Dendrites: weighted connections; weight: real number
– Axon: output: real number
– Synapse: none (identity: the output is forwarded directly)
– Cell nucleus: a unit containing simple functions
  input = (many) real numbers
  processing = evaluation of the activation
  output = real number (~ activation)
Slide 12
An artificial Neuron

net   : input from the network
w     : weight of a connection
act   : activation
f_act : activation function
θ     : bias/threshold
f_out : output function (mostly the identity)
o     : output

net_i = Σ_j o_j·w_ji
act_i = f_act(net_i, θ_i)
o_i = f_out(act_i)

[diagram: incoming weights w_1i, w_2i, ..., w_ji; output o_i]
Slide 13
A simple switch

Set the parameters according to the desired function:
– input neurons 1, 2: a1, a2 – input pattern; here o_i = a_i
– weights of the edges: w1, w2
– bias θ
Given values for w1, w2, θ, we can evaluate the output o:

  net = o1·w1 + o2·w2
  a = 1 if net ≥ θ, 0 otherwise
  o = a

[diagram: a1 = __, a2 = __, w1 = __, w2 = __, o = __]
Slide 14
Questions

Find values for the parameters so that a logic function is simulated:
– logical AND
– logical OR
– logical exclusive OR (XOR)
– identity

We want to process more than 2 inputs. Find appropriate parameter values for:
– logical AND with 3 (4) inputs
– OR; XOR; output 1 iff 2 out of 4 inputs are 1
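For AND and OR, one possible parameter assignment can be checked with a small sketch of the threshold neuron from the previous slide (the concrete weights and thresholds here are one solution to the exercise, not values given on the slides):

```python
def neuron(inputs, weights, theta):
    # threshold neuron: o = 1 if the weighted net input reaches the bias theta
    net = sum(o * w for o, w in zip(inputs, weights))
    return 1 if net >= theta else 0

# Logical AND: both inputs must fire  -> unit weights, theta = 2
AND = lambda a1, a2: neuron([a1, a2], [1, 1], 2)
# Logical OR: one firing input is enough -> unit weights, theta = 1
OR = lambda a1, a2: neuron([a1, a2], [1, 1], 1)
# 3-input AND: theta = number of inputs
AND3 = lambda a1, a2, a3: neuron([a1, a2, a3], [1, 1, 1], 3)
```

No choice of w1, w2, θ lets a single neuron compute XOR: the function is not linearly separable, which is why slide 21 solves it with a hidden neuron.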
Slide 15
Mathematics in a Cell

Propagation function: net_i(t) = Σ_j o_j·w_ji = w_1i·o_1 + w_2i·o_2 + ...
Activation: a_i(t) – activation at time t
Activation function f_act: a_i(t+1) = f_act(a_i(t), net_i(t), θ_i), where θ_i is the bias
Output function f_out: o_i = f_out(a_i)
Slide 16
Activation Functions

[plots omitted: the bias (threshold) function, a step from 0 to 1 at the threshold, and the identity function]

Activation functions are sigmoid functions.
Slide 17
Activation Functions

y = tanh(c·x)            [plots for c = 1, 2, 3]

Logistic function:
y = 1/(1 + exp(−c·x))    [plots for c = 1, 3, 10]

Activation functions are sigmoid functions.
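Both families of activation functions can be written down directly; a minimal sketch using only the standard library:

```python
import math

def logistic(x, c=1.0):
    # logistic function 1/(1 + e^(-c*x)); a larger c makes the curve steeper
    return 1.0 / (1.0 + math.exp(-c * x))

def tanh_act(x, c=1.0):
    # hyperbolic tangent, squashing the net input into (-1, 1)
    return math.tanh(c * x)
```

Both are S-shaped; the logistic function maps into (0, 1), tanh into (−1, 1), and they are related by tanh(x) = 2·logistic(x, c=2) − 1.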
Slide 18
Structure of a network

Layers:
– input layer – contains the input neurons
– output layer – contains the output neurons
– hidden layers – contain the hidden neurons

An n-layer network has:
– n layers of connections which can be trained
– n+1 neuron layers
– n−1 hidden layers
Slide 19
Neural Network – Definition

A neural network is characterized by many (a lot of) simple units (neurons), connected with each other, which exchange signals via these connections.

A neural network is a connected, directed graph with weighted edges, in which each node (neuron, unit) contains a value (its activation).
Slide 20
Elements of a NN

– Connections/links: a directed, weighted graph
  – weight w_ij (from cell i to cell j)
  – weight matrix
– Propagation function: the network input of a neuron is calculated as net_i = Σ_j o_j·w_ji
– Learning algorithm
Slide 21
Example: XOR Network

[diagram: input neurons 1 and 2; hidden neuron 3 with bias 1.5; output neuron 4 with bias 0.5; weights w13 = w23 = 1, w14 = w24 = 1, w34 = −2; the output neuron signals TRUE]

weight matrix w_ij:
  i\j   1   2   3   4
   1    0   0   1   1
   2    0   0   1   1
   3    0   0   0  −2
   4    0   0   0   0
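The weight matrix and the biases θ3 = 1.5, θ4 = 0.5 can be verified directly with threshold units (a sketch of the network on this slide):

```python
def step(net, theta):
    # binary threshold unit: fires when the net input reaches the bias
    return 1 if net >= theta else 0

def xor_net(o1, o2):
    # hidden neuron 3 (theta = 1.5) fires only for the input (1, 1)
    o3 = step(1 * o1 + 1 * o2, 1.5)
    # output neuron 4 (theta = 0.5): OR of the inputs, suppressed by w34 = -2
    o4 = step(1 * o1 + 1 * o2 - 2 * o3, 0.5)
    return o4
```

Evaluating all four input patterns reproduces the XOR truth table.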
Slide 22
Supervised Learning – feed-forward networks

– Idea
– An artificial Neuron – Neural Network
– Supervised Learning – feed-forward networks
  – Architecture
  – Backpropagation Learning
– Competitive Learning – Self-Organising Map
– Applications
Slide 25
Evaluation of the net output

[diagram: a training pattern p is presented to the input layer, o_i = p_i; each hidden neuron N_j computes net_j and o_j = act_j; each output neuron N_k computes net_k and o_k = act_k]

input layer – hidden layer(s) – output layer
Slide 26
Backpropagation Learning Algorithm

– supervised learning
– the error is a function of the weights: E(W) = E(w1, w2, ..., wn)
– we are looking for a minimal error
– a minimal error is a hollow in the error surface
– backpropagation uses the gradient for the weight adaptation
Slide 28
Problem

– error in the output layer: the difference between output and teaching output
– but what is the error in a hidden layer?

[diagram: input layer, hidden layer, output compared to the teaching output]
Slide 29
Mathematics

– modify the weights according to the gradient of the error function:

  ΔW = −η·∇E(W)

– ∇E(W) is the gradient
– η is a factor, called the learning parameter

[plot of an error surface omitted]
Slide 30
Mathematics

Here: modification of the weights:

  ΔW = −η·∇E(W)

– ∇E(W): the gradient
– η: proportionality factor for the weight vector W, the learning factor
– E(W_j) = E(w_1j, w_2j, ..., w_nj)
Slide 31
Error Function

– Error function: the quadratic distance between the real and the teaching output, summed over all patterns p:

  E = Σ_p E_p                      (1)
  E_p = ½·Σ_j (t_j − o_j)²         (2)

  – t_j – teaching output
  – o_j – real output
– From now on: the error for one pattern only (omitting the pattern index p).
– Modification of a weight:

  Δw_ij = −η·∂E/∂w_ij
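Formulas (1) and (2) transcribed directly into code:

```python
def pattern_error(teaching, output):
    # E_p = 1/2 * sum_j (t_j - o_j)^2  -- formula (2)
    return 0.5 * sum((t - o) ** 2 for t, o in zip(teaching, output))

def total_error(pairs):
    # E = sum_p E_p  -- formula (1); pairs = [(teaching, output), ...]
    return sum(pattern_error(t, o) for t, o in pairs)
```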
Slide 32
Backpropagation Rule

– multi-layer networks
– semi-linear activation function (monotone, differentiable, e.g. the logistic function)
– Problem: there are no teaching outputs for the hidden neurons
Slide 33
Backpropagation Learning Rule

Start:

  Δw_ij = −η·∂E/∂w_ij                                       (6.1)

(6.1) in more detail, using the chain rule:

  ∂E/∂w_ij = (∂E/∂o_j) · (∂o_j/∂net_j) · (∂net_j/∂w_ij)     (6.2)

Dependencies:

  o_j = f_out(f_act(net_j)),  f_out = Id
  net_j = Σ_i o_i·w_ij                                      (6.3)
Slide 34
The 3rd and 2nd Factor

  ∂E/∂w_ij = (∂E/∂o_j) · (∂o_j/∂net_j) · (∂net_j/∂w_ij)

3rd factor – dependency of the net input on the weights:

  ∂net_j/∂w_ij = ∂/∂w_ij Σ_k o_k·w_kj = o_i                 (6.7)

2nd factor – the derivative of the activation function:

  ∂o_j/∂net_j = f'_act(net_j)                               (6.4)

For the logistic function:

  f'_Logistic(net_j) = f_Logistic(net_j)·(1 − f_Logistic(net_j)) = o_j·(1 − o_j)   (6.5)
Slide 35
The 1st Factor

  ∂E/∂w_ij = (∂E/∂o_j) · (∂o_j/∂net_j) · (∂net_j/∂w_ij)

1st factor – dependency of the error on the output.

– Error signal of an output neuron j:

  E = ½·Σ_k (t_k − o_k)²
  ∂E/∂o_j = ∂/∂o_j ½·Σ_k (t_k − o_k)² = −(t_j − o_j)        (6.8)

– Error signal of a hidden neuron j:

  ∂E/∂o_j = Σ_k (∂E/∂net_k)·(∂net_k/∂o_j)                   (6.9)
          = Σ_k (∂E/∂net_k)·(∂/∂o_j Σ_i o_i·w_ik)
          = −Σ_k δ_k·w_jk                                   (6.10)

δ_j : the error signal of neuron j
Slide 36
Error Signal

  δ_j = −∂E/∂net_j = −(∂E/∂o_j)·(∂o_j/∂net_j)

Output neuron j:
  δ_j = f'_act(net_j)·(t_j − o_j)                           (6.11)

Hidden neuron j:
  δ_j = f'_act(net_j)·Σ_k δ_k·w_jk                          (6.12)
Slide 37
Standard Backpropagation Rule

For the logistic activation function:

  f'_act(net_j) = f_act(net_j)·(1 − f_act(net_j)) = o_j·(1 − o_j)

Therefore:

  δ_j = o_j·(1 − o_j)·Σ_k δ_k·w_jk     if j is a hidden neuron
  δ_j = o_j·(1 − o_j)·(t_j − o_j)      if j is an output neuron

and:

  Δw_ij = η·δ_j·o_i
  w'_ij = w_ij + η·δ_j·o_i
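The delta rules above can be sketched as one online training step on a tiny 2-2-1 network with logistic units (a minimal illustration; the network size, the omitted biases and the learning rate η = 0.5 are my own simplifications, not taken from the slides):

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(w_hidden, w_out, x, t, eta=0.5):
    """One backpropagation step; w_hidden[j][i] connects input i to hidden j,
    w_out[j] connects hidden j to the single output neuron."""
    # forward pass
    o_h = [logistic(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    o = logistic(sum(w * oh for w, oh in zip(w_out, o_h)))
    # error signals: delta = o(1-o)(t-o) for the output neuron,
    # delta_j = o_j(1-o_j) * sum_k delta_k w_jk for hidden neurons
    d_out = o * (1 - o) * (t - o)
    d_h = [oh * (1 - oh) * d_out * w for oh, w in zip(o_h, w_out)]
    # weight update: w'_ij = w_ij + eta * delta_j * o_i
    new_out = [w + eta * d_out * oh for w, oh in zip(w_out, o_h)]
    new_hidden = [[w + eta * dj * xi for w, xi in zip(ws, x)]
                  for ws, dj in zip(w_hidden, d_h)]
    return new_hidden, new_out, o
```

Repeating the step on the same pattern moves the network output towards the teaching output t.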
Slide 38
Error signal for f_act = tanh

For the activation function tanh:

  f'_act(net_j) = 1 − f²_act(net_j) = 1 − tanh²(net_j) = 1 − o_j²

Therefore:

  δ_j = (1 − o_j²)·Σ_k δ_k·w_jk        if j is a hidden neuron
  δ_j = (1 − o_j²)·(t_j − o_j)         if j is an output neuron
Slide 40
Backpropagation Problems

– A: flat plateau
  – backpropagation proceeds very slowly
  – finding a minimum takes a lot of time
– B: oscillation in a narrow gorge
  – the algorithm jumps from one side to the other and back
– C: leaving a minimum
  – if the modification in one training step is too high, the minimum can be missed
Slide 41
Solutions: looking at the values

– change the parameter c of the logistic function in order to get other values
– the modification of a weight depends on the output: if o_i = 0, no modification takes place
– if we use binary input we probably have a lot of zero values: change [0,1] into [−½, ½] or [−1,1]
– use another activation function, e.g. tanh, and use values in [−1,1]
Slide 42
Solution: Quickprop

– assumption: the error curve is a quadratic function
– calculate the vertex of the parabola

Slope of the error curve:

  S(t) = ∂E/∂w_ij (t)

Weight change:

  Δw_ij(t) = S(t) / (S(t−1) − S(t)) · Δw_ij(t−1)
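A sketch of a single Quickprop step; for a truly quadratic error curve it lands exactly in the vertex:

```python
def quickprop_step(dw_prev, s_prev, s_now):
    # jump to the vertex of the parabola fitted through the last
    # two slopes: delta_w(t) = S(t) / (S(t-1) - S(t)) * delta_w(t-1)
    return s_now / (s_prev - s_now) * dw_prev

# E(w) = (w - 3)^2 has the slope S(w) = 2(w - 3);
# suppose the previous step went from w = 0 to w = 1:
dw = quickprop_step(dw_prev=1.0, s_prev=2 * (0 - 3), s_now=2 * (1 - 3))
w_new = 1.0 + dw   # -> 3.0, the minimum of E
```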
Slide 43
Resilient Propagation (RPROP)

– sign and size of the weight modification are calculated separately; b_ij(t) is the size of the modification:

  b_ij(t) = b_ij(t−1)·η⁺   if S(t−1)·S(t) > 0
            b_ij(t−1)·η⁻   if S(t−1)·S(t) < 0
            b_ij(t−1)      otherwise

  η⁺ > 1: both slopes have the same sign → "big" step
  0 < η⁻ < 1: the slopes differ → "smaller" step

  Δw_ij(t) = −b_ij(t)             if S(t−1) > 0 and S(t) > 0
             +b_ij(t)             if S(t−1) < 0 and S(t) < 0
             −Δw_ij(t−1)          if S(t−1)·S(t) < 0   (*)
             −sgn(S(t))·b_ij(t)   otherwise

  (*) S(t) is then set to 0, S(t) := 0; at time t+1 the 4th case applies.
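The case analysis above for a single weight as code (a sketch; η⁺ = 1.2 and η⁻ = 0.5 are commonly used values, the slide leaves them open):

```python
def rprop_step(b_prev, dw_prev, s_prev, s_now, eta_plus=1.2, eta_minus=0.5):
    """One RPROP update; returns (b(t), delta_w(t), S(t) after a possible reset)."""
    prod = s_prev * s_now
    if prod > 0:                      # slopes agree: enlarge the step size
        b = b_prev * eta_plus
    elif prod < 0:                    # sign change: we jumped over a minimum
        b = b_prev * eta_minus
    else:
        b = b_prev
    if prod < 0:
        return b, -dw_prev, 0.0       # revert the last step, reset S(t)  (*)
    if s_prev > 0 and s_now > 0:
        return b, -b, s_now
    if s_prev < 0 and s_now < 0:
        return b, +b, s_now
    sgn = (s_now > 0) - (s_now < 0)   # 4th case: step against the current slope
    return b, -sgn * b, s_now
```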
Slide 44
Limits of the Learning Algorithm

– backpropagation is not a model of biological learning
– we have no teaching output in a natural learning process
– in a natural neural network there are no such feedbacks (at least none have been discovered yet)
– training an artificial neural network is rather time-consuming
Slide 45
Development of a NN application

[flow chart:]
1. build a network architecture
2. input a training pattern, calculate the network output and compare it to the teaching output
   – error too high → modify the weights, continue training
3. use the test-set data: evaluate the output, compare it to the teaching output
   – error too high → change the parameters and train again
   – quality good enough → done
Slide 46
Possible Changes

– architecture of the NN:
  – size of the network
  – shortcut connections
  – partially connected layers
  – remove/add links
  – receptive areas
– find the right parameter values:
  – learning parameter
  – size of the layers
  – using genetic algorithms
Slide 47
Memory Capacity – an Experiment

– the output layer is a copy of the input layer
– the training set consists of n random patterns
– error = 0: the network can store more than n patterns
– error >> 0: the network cannot store n patterns
– memory capacity n: error > 0 for n patterns, error = 0 for n−1 patterns, and error >> 0 for n+1 patterns
Slide 48
Summary

– Backpropagation is a "backpropagation of error" algorithm
  – works like gradient descent
  – activation functions: logistic, tanh
  – meaning of the learning parameter
– Modifications: RPROP, Backprop with momentum, Quickprop
– Finding an appropriate architecture:
  – memory size of a network
  – modifications of the layer connections
– Applications
Slide 49
Binary Coding of nominal values I

– no order relation, n values
– n neurons, each neuron represents one and only one value
– example: red, blue, yellow, white, black
  1,0,0,0,0  0,1,0,0,0  0,0,1,0,0  ...
– disadvantage: n neurons are necessary, but only one of them is activated – lots of zeros in the input
Slide 50
Binary Coding of nominal values II

– no order relation, n values
– m neurons, of which k neurons are switched on for one single value
– requirement: C(m, k) ≥ n
– example: red, blue, yellow, white, black
  1,1,0,0  1,0,1,0  1,0,0,1  0,1,1,0  0,1,0,1
  4 neurons, 2 of them switched on; C(4, 2) = 6 > 5
– advantages: fewer neurons, a balanced ratio of 0s and 1s
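The k-of-m codes can be generated mechanically (a sketch; which code is assigned to which value is arbitrary, as on the slide):

```python
from itertools import combinations

def k_of_m_codes(values, m, k):
    # assign each nominal value a code of length m with exactly k ones;
    # this requires C(m, k) >= number of values
    codes = list(combinations(range(m), k))
    if len(codes) < len(values):
        raise ValueError("C(m, k) is too small for the number of values")
    return {v: tuple(1 if i in on else 0 for i in range(m))
            for v, on in zip(values, codes)}

colours = k_of_m_codes(["red", "blue", "yellow", "white", "black"], m=4, k=2)
```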
Slide 51
Example: Credit Scoring

A1: credit history, A2: debt, A3: collateral, A4: income

• the network architecture depends on the coding of input and output
• how can we code values like good, bad, 1, 2, 3, ...?
Slide 53
Competitive Learning – Self-Organising Map

– Idea
– An artificial Neuron – Neural Network
– Supervised Learning – feed-forward networks
– Competitive Learning – Self-Organising Map
  – Architecture
  – Learning
  – Visualisation
– Applications
Slide 54
Self-Organising Maps (SOM)

– a natural brain can organise itself
– now we also look at the position of a neuron and its neighbourhood

Kohonen Feature Map: a two-layer pattern associator
– the input layer is fully connected with the map layer
– the neurons of the map layer are fully connected to each other (virtually)
Slide 55
Clustering

– objective: all inputs of one class are mapped onto one and the same neuron

[diagram: a function f maps the input set A onto the output set B]

– problem: the classification of the input space is unknown
– the network performs a clustering
Slide 56
Winner Neuron

[diagram: input layer fully connected to the Kohonen layer; the most strongly activated map neuron is the winner neuron]
Slide 57
Learning in a SOM

1. Choose an input k at random.
2. Detect the neuron z with the maximal activity.
3. Adapt the weights in the neighbourhood of z: every neuron i within a radius r of z.
4. Stop if a certain number of learning steps has been performed;
   otherwise decrease the learning rate and the radius and continue with step 1.
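The four steps can be sketched for a one-dimensional Kohonen layer and two-dimensional inputs (a minimal sketch; the Gaussian neighbourhood follows the later slides, while the concrete schedules for the learning rate and the radius are my own choices):

```python
import math, random

def train_som(patterns, n_map=10, steps=500, seed=1):
    rnd = random.Random(seed)
    # one 2-D weight vector per map neuron, randomly initialised
    w = [[rnd.random(), rnd.random()] for _ in range(n_map)]
    for t in range(steps):
        eta = 0.5 * (1 - t / steps) + 0.01            # decreasing learning rate
        radius = 1 + (n_map / 2) * (1 - t / steps)    # decreasing radius
        x = rnd.choice(patterns)                      # 1. choose an input at random
        # 2. winner z: the neuron whose weight vector is closest to x
        z = min(range(n_map),
                key=lambda j: sum((wi - xi) ** 2 for wi, xi in zip(w[j], x)))
        # 3. adapt the weights in the neighbourhood of z
        for j in range(n_map):
            h = math.exp(-((j - z) ** 2) / (2 * radius ** 2))
            w[j] = [wi + eta * h * (xi - wi) for wi, xi in zip(w[j], x)]
    return w                                          # 4. stop after `steps`
```

After training, inputs from well-separated clusters should win different map neurons.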
Slide 58
A Map Neuron

Look at a single neuron (without feedback):

– net input: net_j = Σ_i o_i·w_ij
– activation: a_j = f_act(net_j) = 1 / (1 + e^(−net_j))
– output: f_out = Id
Slide 59
Centre of Activation

– idea: highly activated neurons push down the activation of the neurons in their neighbourhood
– problem: finding the centre of activation z:
  – the neuron with the maximal net input:  Σ_i o_i·w_iz = max_j Σ_i o_i·w_ij
  – or the neuron j whose weight vector w_j is most similar to the input vector (Euclidean distance):
    z: ‖x − w_z‖ = min_j ‖x − w_j‖
Slide 60
Changing Weights

– the weights to neurons within a radius around z are increased:

  w_j(t+1) = w_j(t) + h_jz·(x(t) − w_j(t))   if j lies in the neighbourhood of z (x – input)
  w_j(t+1) = w_j(t)                          otherwise

– the amount of change depends on the distance to the centre of activation
– Kohonen uses the function:

  h_jz = e^(−(j−z)² / (2σ_z²))

– σ_z determines the shape of the curve: σ_z small → high and sharp; σ_z high → wide and flat
Slide 61
Changing weights

– simulation by a Gaussian curve (Mexican-hat approach)
– the weights are changed by a learning rate η(t) which goes down to zero
– weight change:

  w_j(t+1) = w_j(t) + η(t)·h_jz·(x(t) − w_j(t))   if j lies in the neighbourhood of z
  w_j(t+1) = w_j(t)                               otherwise

– requirements: the patterns are input at random; η(t) and σ_z(t) are monotonically decreasing functions of t

[plot of the Mexican-hat neighbourhood function omitted]
Slide 62
SOM Training

• find the winner neuron z for an input pattern m_p (minimal Euclidean distance):

  ‖m_p − W_z‖ = min_j ‖m_p − W_j‖

• adapt the weights of the connections
  • winner neuron – input neurons
  • neighbours – input neurons:

  w_ij(t+1) = w_ij + η·h_jz·(m_i − w_ij)   if dist(j, z) ≤ r
  w_ij(t+1) = w_ij                         otherwise

  with  h_jz = e^(−dist(j, z)² / (2r²))
Slide 63
Example: Credit Scoring

A1: credit history, A2: debts, A3: collateral, A4: income

• we do not look at the classification
• the SOM performs a clustering
Slide 64
Credit Scoring
– good = {5,6,9,10,12}– average = {3, 8, 13}– bad = {1,2,4,7,11,14}
Slide 65
Credit Scoring
– Pascal tool box (1991)– 10x10 neurons– 32,000 training steps
Slide 66
Visualisation of a SOM

• colour reflects the Euclidean distance to the input – NetDemo, TSPDemo
• weights used as the coordinates of a neuron
• colour reflects the cluster – ColorDemo
Slide 67
Example: TSP

– Travelling Salesman Problem: a salesman has to visit certain cities and then return home. Find an optimal route!
– the problem has exponential complexity: (n−1)! routes

Experiment: Pascal program, 1998; 31/32 states in Mexico?
Slide 68
Nearest Neighbour: Example

– some cities in Northern Germany: Kiel, Rostock, Berlin, Hamburg, Hannover, Frankfurt, Essen, Schwerin
– the initial city is Hamburg

Exercise:
• Put in the coordinates of the capitals of all 31 Mexican states + Mexico City.
• Find a solution for the TSP using a SOM!
Slide 69
SOM solves the TSP

[diagram: a two-dimensional input (X, Y) is fully connected to a ring-shaped Kohonen layer; w_1i = s_ix, w_2i = s_iy]

Draw neuron i at the position (x, y) = (w_1i, w_2i).
Slide 70
SOM solves the TSP

– initialisation of the weights: the weights to the input (x, y) are calculated so that all neurons form a circle
– the initial circle is expanded towards a round trip
– solutions for problems with several hundred towns are possible
– the solution may not be optimal!
Slide 71
Applications

– Data Mining – Clustering: customer data, weblogs, ...
– you have a lot of data, but no teaching data available – unsupervised learning
– you have at least an idea about the result
– can be applied as a first approach to obtain training data for supervised learning
Slide 72
Applications

– pattern recognition (text, numbers, faces): number plates, access control at cash machines, ...
– similarities between molecules
– checking the quality of a surface
– control of autonomous vehicles
– monitoring of credit card accounts
– Data Mining
Slide 73
Applications

– speech recognition
– control of artificial limbs
– classification of galaxies
– product orders (supermarket)
– forecast of energy consumption
– stock value forecast
Slide 74
Application – Summary

– classification, clustering, forecasting, pattern recognition
– learning by examples, generalisation
– recognition of unknown structures in large data sets
Slide 75
Application

– Data Mining: customer data, weblogs
– control of ...
– pattern recognition: quality of surfaces
– possible if you have training data ...