
Page 1:

Supervised Learning

Uwe Lämmel
Business School, Institute of Business Informatics
www.wi.hs-wismar.de/~laemmel
U.laemmel@wi.hs-wismar.de

Page 2:

Neural Networks

– Idea

– Artificial Neuron & Network

– Supervised Learning

– Unsupervised Learning

– Data Mining – other Techniques

Page 3:

Supervised Learning

Feed-Forward Networks
– Perceptron – Adaline – LTU
– Multi-layer networks
– Backpropagation algorithm
– Pattern recognition
– Data preparation

Examples
– Bank Customer
– Customer Relationship

Page 4:

Connections

– Feed-forward
  – Input layer
  – Hidden layer
  – Output layer

Hopfield network
– Feed-back / auto-associative
– From the (output) layer back to the previous (hidden/input) layer
– All neurons fully connected to each other

Page 5:

Perceptron – Adaline – TLU

– One layer of trainable links only
– Adaptive linear element (Adaline)
– Threshold linear unit (TLU)

– a class of neural networks with a special architecture:

...

Page 6:

Minsky, Papert and the Perceptron – History

"Once upon a time two daughter sciences were born to the new science of cybernetics. One sister was natural, with features inherited from the study of the brain, from the way nature does things.

The other was artificial, related from the beginning to the use of computers. …

But Snow White was not dead.

What Minsky and Papert had shown the world as proof was not the heart of the princess; it was the heart of a pig."

Seymour Papert, 1988

Page 7:

Perception

Perception: the first step of recognition –
becoming aware of something via the senses.

[Figure: picture → mapping layer, connected by fixed 1-1 links → trainable, fully connected output layer]

Page 8:

Perceptron

– Input layer: binary input, passed through; no trainable links

– Propagation function: netj = Σi oi·wij

– Activation function:
  oj = aj = 1 if netj ≥ θj, 0 otherwise

A perceptron can learn every function it can represent, in finite time.

(perceptron convergence theorem, F. Rosenblatt)

Page 9:

Linearly separable

Neuron j should output 0 iff both neurons 1 and 2 have the same value (o1 = o2), otherwise 1:

netj = o1·w1j + o2·w2j

0·w1j + 0·w2j < θj
0·w1j + 1·w2j ≥ θj
1·w1j + 0·w2j ≥ θj
1·w1j + 1·w2j < θj

The first inequality gives θj > 0; adding the middle two gives w1j + w2j ≥ 2θj > θj, contradicting the last one – no such weights exist.

[Figure: neuron j with threshold θj and weights w1j, w2j on the links from neurons 1 and 2]

Page 10:

Linearly separable

– netj = o1·w1j + o2·w2j = θj describes a line in a 2-dimensional space
– no line divides the plane so that (0,1) and (1,0) lie on one side and (0,0) and (1,1) on the other:
  the network cannot solve the problem
– a perceptron can represent only some functions;
  a neural network representing the XOR function needs hidden neurons

[Figure: the (o1, o2) plane with the points (0,0), (0,1), (1,0), (1,1) and the line o1·w1j + o2·w2j = θj]

Page 11:

Learning is easy

repeat
  while patterns left do begin
    next input pattern;
    calculate output;
    for each j in OutputNeurons do
      if oj ≠ tj then
        if oj = 0 then  { output = 0, but 1 expected }
          for each i in InputNeurons do wij := wij + oi
        else if oj = 1 then  { output = 1, but 0 expected }
          for each i in InputNeurons do wij := wij − oi;
  end
until desired behaviour
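As a sketch, the same rule in Java (the data layout and names such as patterns, targets, and theta are assumptions for illustration, not from the slides):

  // Perceptron learning rule: binary inputs/outputs, one layer of trainable links.
  // patterns[p][i]: input oi of pattern p; targets[p][j]: teaching output tj.
  class PerceptronLearning {
      static void train(int[][] patterns, int[][] targets, double[][] w, double[] theta) {
          boolean changed;
          do {
              changed = false;
              for (int p = 0; p < patterns.length; p++)
                  for (int j = 0; j < theta.length; j++) {
                      double net = 0;                        // netj = sum_i oi*wij
                      for (int i = 0; i < patterns[p].length; i++)
                          net += patterns[p][i] * w[i][j];
                      int o = net >= theta[j] ? 1 : 0;       // threshold activation
                      if (o != targets[p][j]) {              // oj <> tj
                          changed = true;
                          int sign = (o == 0) ? +1 : -1;     // 0 but 1 expected: add oi; else subtract
                          for (int i = 0; i < patterns[p].length; i++)
                              w[i][j] += sign * patterns[p][i];
                      }
                  }
          } while (changed);                                 // until desired behaviour
      }
  }

Note that the loop terminates only for representable (linearly separable) tasks, as the convergence theorem states.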

Page 12:

Exercise

– Decoding
  – input: binary code of a digit
  – output: unary representation –
    as many 1s as the digit represents, e.g. 5 : 1 1 1 1 1

– architecture:

Page 13:

Exercise

– Decoding
  – input: binary code of a digit
  – output: classification –
    0 ~ 1st neuron, 1 ~ 2nd neuron, ..., 5 ~ 6th neuron, ...

– architecture:

Page 14:

Exercises

1. Look at the EXCEL-file of the decoding problem

2. Implement (in PASCAL/Java) a 4-10-Perceptron which transforms a binary representation of a digit (0..9) into a decimal number. Implement the learning algorithm and train the network.

3. Which task can be learned faster? (Unary representation or classification)

Page 15:

Exercises

5. Develop a perceptron for the recognition of digits 0..9 (pixel representation).
   Input layer: 3x7 input neurons. Use the SNNS or JavaNNS.

6. Can we recognize numbers greater than 9 as well?

7. Develop a perceptron for the recognition of capital letters. (input layer 5x7)

Page 16:

Multi-layer perceptron

– several trainable layers
– a two-layer perceptron can classify convex polygons
– a three-layer perceptron can classify any sets

overcomes the limits of a (single-layer) perceptron

multi-layer perceptron = feed-forward network = backpropagation network

Page 17:

Multi-layer feed-forward network

Page 18:

Feed-Forward Network

Page 19:

Evaluation of the net output in a feed-forward network

[Figure: a training pattern p is applied to the input layer (oi = pi); each hidden neuron Nj computes netj and oj = actj; each output neuron Nk computes netk and ok = actk]

input layer – hidden layer(s) – output layer
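A minimal Java sketch of this evaluation for one fully connected layer (the logistic activation anticipates the backpropagation slides; all names are illustrative):

  // One layer of the forward pass: netj = sum_i oi*wij, oj = fact(netj).
  class ForwardPass {
      static double logistic(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

      // o: outputs of the previous layer; w[i][j]: weight of link i -> j
      static double[] layer(double[] o, double[][] w) {
          double[] out = new double[w[0].length];
          for (int j = 0; j < out.length; j++) {
              double net = 0;
              for (int i = 0; i < o.length; i++) net += o[i] * w[i][j];
              out[j] = logistic(net);
          }
          return out;
      }
  }

For a net with one hidden layer, the evaluation is then layer(layer(p, wInHid), wHidOut).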

Page 20:

Backpropagation-Learning Algorithm

– supervised learning

– the error is a function of the weights wi:
  E(W) = E(w1, w2, ..., wn)

– we are looking for a minimal error
  – a minimal error is a hollow in the error surface
  – backpropagation uses the gradient for weight adaptation
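Written out (the slides leave E implicit; the sum of squared errors below is the standard choice that backpropagation minimises), with teaching output t and network output o over all patterns p and output neurons j:

  E(W) = ½ · Σp Σj ( tj(p) − oj(p) )²

and gradient descent adapts each weight against the gradient: Δwij = −η · ∂E/∂wij.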

Page 21:

Error curve

[Figure: error surface plotted over weight1 and weight2]

Page 22:

Problem

– error in the output layer:
  difference between output and teaching output
– error in a hidden layer?

[Figure: input layer – hidden layer – output, compared with the teaching output]

Page 23:

Gradient descent

– Gradient:
  vector orthogonal to a surface, pointing in the direction of the steepest slope
– the derivative of a function in a certain direction is the projection of the gradient onto this direction

[Figure: example of an error curve of a weight wi]

Page 24:

Example: Newton-Approximation

– calculation of the square root: f(x) = x² − 5
  – x = 2
  – x' = ½(x + 5/x) = 2.25
  – x'' = ½(x' + 5/x') ≈ 2.2361

In general, for f(x) = x² − a:
tan α = f'(x) = 2x and tan α = f(x) / (x − x'), hence x' = ½(x + a/x)
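The same iteration as a small Java sketch (class and method names are illustrative):

  // Newton approximation of the root of f(x) = x^2 - a, i.e. of sqrt(a): x' = (x + a/x)/2
  class NewtonSqrt {
      static double sqrt(double a, double x) {
          for (int i = 0; i < 20; i++) x = 0.5 * (x + a / x);  // converges quadratically
          return x;
      }
      public static void main(String[] args) {
          System.out.println(sqrt(5, 2));  // 2.2360679..., as in the example above
      }
  }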

Page 25:

Backpropagation - Learning

– gradient-descent algorithm
– supervised learning:
  an error signal is used for weight adaptation
– error signal δ:
  – teaching output − calculated output, if output neuron
  – weighted sum of the error signals of the successors, if hidden neuron
– weight adaptation: w'ij = wij + η·δj·oi
  (η: learning rate, δ: error signal)

Page 26:

Standard Backpropagation Rule

– gradient descent: derivative of a function
– logistic activation function:
  fLogistic(x) = 1 / (1 + e^(−x))
  f'act(netj) = fact(netj)·(1 − fact(netj)) = oj·(1 − oj)

– the error signal δj is therefore:
  δj = oj·(1 − oj)·(tj − oj)     if j is an output neuron
  δj = oj·(1 − oj)·Σk δk·wjk     if j is a hidden neuron

– weight adaptation: w'ij = wij + η·δj·oi
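These formulas translate almost literally into code; a Java sketch for a net with one hidden layer (the array layout is assumed for illustration):

  // Error signals and weight adaptation for logistic units, following the rule above.
  class BackpropRule {
      // output neurons: delta_j = oj(1-oj)(tj-oj)
      static double[] outputDeltas(double[] o, double[] t) {
          double[] d = new double[o.length];
          for (int j = 0; j < o.length; j++)
              d[j] = o[j] * (1 - o[j]) * (t[j] - o[j]);
          return d;
      }
      // hidden neurons: delta_j = oj(1-oj) * sum_k delta_k * wjk
      static double[] hiddenDeltas(double[] o, double[][] w, double[] deltaNext) {
          double[] d = new double[o.length];
          for (int j = 0; j < o.length; j++) {
              double sum = 0;
              for (int k = 0; k < deltaNext.length; k++) sum += deltaNext[k] * w[j][k];
              d[j] = o[j] * (1 - o[j]) * sum;
          }
          return d;
      }
      // w'ij = wij + eta * delta_j * oi
      static void adapt(double[][] w, double[] oPrev, double[] delta, double eta) {
          for (int i = 0; i < oPrev.length; i++)
              for (int j = 0; j < delta.length; j++)
                  w[i][j] += eta * delta[j] * oPrev[i];
      }
  }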

Page 27:

Backpropagation

– Examples:
  – XOR (Excel)
  – Bank Customer

Page 28:

Backpropagation - Problems

[Figure: error curve with the problem regions A, B, and C]

Page 29:

Backpropagation – Problems

– A: flat plateau
  – weight adaptation is slow
  – finding a minimum takes a lot of time
– B: oscillation in a narrow gorge
  – the weight jumps from one side to the other and back
– C: leaving a minimum
  – if the modification in one training step is too high, the minimum can be lost

Page 30:

Solutions: looking at the values

– change the parameter of the logistic function in order to get other values
– the modification of a weight depends on the output:
  if oi = 0, no modification takes place
– with binary input we probably have a lot of zero values:
  change [0,1] into [−½, +½] or [−1,+1]
– use another activation function, e.g. tanh, with values in [−1,+1]

Page 31:

Solution: Quickprop

assumption: the error curve is (locally) a quadratic function; calculate the vertex of the parabola

slope of the error curve:
  S(t) = ∂E/∂wij(t)

weight change:
  Δwij(t) = ( S(t) / (S(t−1) − S(t)) ) · Δwij(t−1)

[Figure: parabola fitted to the error curve]
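As a sketch, the resulting update step in Java (without the safeguards, such as a maximum growth factor, that practical implementations add):

  // Quickprop step: jump towards the vertex of the parabola through the last two slopes.
  class Quickprop {
      // s = S(t), sPrev = S(t-1): slopes dE/dwij; dwPrev: previous weight change
      static double step(double s, double sPrev, double dwPrev) {
          return s / (sPrev - s) * dwPrev;
      }
  }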

Page 32:

Resilient Propagation (RPROP)

– sign and size of the weight modification are calculated separately:
  bij(t) – size of the modification

           bij(t−1)·η⁺   if S(t−1)·S(t) > 0
  bij(t) = bij(t−1)·η⁻   if S(t−1)·S(t) < 0
           bij(t−1)      otherwise

  η⁺ > 1: both slopes point the same way – "big" step
  0 < η⁻ < 1: the slopes differ – "smaller" step

            −bij(t)            if S(t−1) > 0 and S(t) > 0
  Δwij(t) = +bij(t)            if S(t−1) < 0 and S(t) < 0
            −Δwij(t−1)         if S(t−1)·S(t) < 0 (*)
            −sgn(S(t))·bij(t)  otherwise

  (*) S(t) is set to 0 (S(t) := 0); at time t+1 the 4th case will be applied.
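A Java sketch of the step-size adaptation (the first case distinction above; etaPlus and etaMinus stand for η⁺ and η⁻):

  // RPROP: adapt the step size bij from the signs of two successive slopes S.
  class Rprop {
      // precondition: etaPlus > 1, 0 < etaMinus < 1
      static double stepSize(double b, double sPrev, double s,
                             double etaPlus, double etaMinus) {
          if (sPrev * s > 0) return b * etaPlus;   // same direction: larger step
          if (sPrev * s < 0) return b * etaMinus;  // direction changed: smaller step
          return b;                                // otherwise: unchanged
      }
  }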

Page 33:

Limits of the Learning Algorithm

– it is not a model for biological learning

– no teaching output in natural learning

– there are no feedbacks in a natural neural network (at least none has been discovered yet)

– training an ANN is rather time-consuming

Page 34:

Exercise - JavaNNS

– Implement a feed-forward network consisting of 2 input neurons, 2 hidden neurons, and one output neuron. Train the network so that it simulates the XOR function.

– Implement a 4-2-4 network which works like the identity function (encoder–decoder network). Try other versions: 4-3-4, 8-4-8, ... What can you say about the training effort?

Page 35:

Pattern Recognition

[Figure: input layer – 1st hidden layer – 2nd hidden layer – output layer]

Page 36:

Example: Pattern Recognition

JavaNNS example: Font

Page 37:

The "font" Example

– input = 24x24 pixel-array

– output layer: 75 neurons, one neuron for each character:
  – digits
  – letters (lower case, capital)
  – separators and operator characters

– two hidden layers of 4x6 neurons each

– all neurons of a row of the input layer are linked to one neuron of the first hidden layer

– all neurons of a column of the input layer are linked to one neuron of the second hidden layer

Page 38:

Exercise

– load the network "font_untrained"
– train the network, use various learning algorithms
  (look at the SNNS documentation for the parameters and their meaning):
  – Backpropagation: η = 2.0
  – Backpropagation with momentum: η = 0.8, μ = 0.6, c = 0.1
  – Quickprop: η = 0.1, μ (max. growth) = 2.0, ν = 0.0001
  – Rprop: η = 0.6
– use various values for learning parameter, momentum, and noise:
  – learning parameter: 0.2, 0.3, 0.5, 1.0
  – momentum: 0.9, 0.7, 0.5, 0.0
  – noise: 0.0, 0.1, 0.2

Page 39:

Example: Bank Customer

A1: credit history
A2: debt
A3: collateral
A4: income

• the network architecture depends on the coding of input and output
• how can we code values like good, bad, 1, 2, 3, ...?

Page 40:

Data Pre-processing

– objectives
  – prospects of better results
  – adaptation to algorithms
  – data reduction
  – trouble shooting

– methods
  – selection and integration
  – completion
  – transformation
    – normalization
    – coding
    – filter

Page 41:

Selection and Integration

– unification of data (different origins)
– selection of attributes/features
– reduction: omit obviously non-relevant data
  – all values are equal
  – key values
  – meaning not relevant
  – data protection

Page 42:

Completion / Cleaning

– missing values
  – ignore / omit the attribute
  – add values
    – manually
    – global constant ("missing value")
    – average
    – highly probable value
  – remove the data set

– noised data
– inconsistent data

Page 43:

Transformation

– Normalization– Coding– Filter

Page 44:

Normalization of values

– normalization – equally distributed

– in the range [0,1], e.g. for the logistic function:
  act = (x − minValue) / (maxValue − minValue)

– in the range [−1,+1], e.g. for the activation function tanh:
  act = (x − minValue) / (maxValue − minValue) · 2 − 1

– logarithmic normalization:
  act = (ln(x) − ln(minValue)) / (ln(maxValue) − ln(minValue))
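The three formulas as Java helpers (a direct transcription of the slide; the method names are assumptions):

  // Normalization of a value x with known minValue/maxValue, as in the formulas above.
  class Normalize {
      static double to01(double x, double min, double max) {        // range [0,1]
          return (x - min) / (max - min);
      }
      static double to11(double x, double min, double max) {        // range [-1,+1]
          return to01(x, min, max) * 2 - 1;
      }
      static double logarithmic(double x, double min, double max) { // requires x, min, max > 0
          return (Math.log(x) - Math.log(min)) / (Math.log(max) - Math.log(min));
      }
  }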

Page 45:

Binary Coding of nominal values I

– no order relation, n values
– n neurons, each neuron represents one and only one value:
  – example: red, blue, yellow, white, black
    1,0,0,0,0   0,1,0,0,0   0,0,1,0,0   ...
– disadvantage:
  n neurons necessary, lots of zeros in the input
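A sketch of this 1-of-n coding in Java (a hypothetical helper, not from the slides):

  // 1-of-n coding of a nominal value: one neuron per value, exactly one set to 1.
  class NominalCoding {
      static double[] oneHot(int valueIndex, int n) {
          double[] code = new double[n];  // all zeros
          code[valueIndex] = 1.0;
          return code;
      }
  }

With the colours above, oneHot(1, 5) yields 0,1,0,0,0 for "blue".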

Page 46:

Bank Customer

Are these customers good ones?

     credit history | debt | collateral | income
  1: bad            | high | adequate   | 3
  2: good           | low  | adequate   | 2

Page 47:

The Problem: A Mailing Action

– mailing action of a company: a special offer
– estimated annual income per customer:

                  will cancel | will not cancel
  gets an offer   43.80 €     | 66.30 €
  gets no offer    0.00 €     | 72.00 €

– given:
  – 10,000 sets of customer data containing 1,000 cancellers (training)
– problem:
  – a test set containing 10,000 customer data sets
  – who will cancel? Whom to send an offer?

Data Mining Cup 2002

Page 48:

Mailing Action – Aim?

– no mailing action:
  9,000 × 72.00 = 648,000
– everybody gets an offer:
  1,000 × 43.80 + 9,000 × 66.30 = 640,500
– maximum (100% correct classification):
  1,000 × 43.80 + 9,000 × 72.00 = 691,800

                  will cancel | will not cancel
  gets an offer   43.80 €     | 66.30 €
  gets no offer    0.00 €     | 72.00 €

Page 49:

Goal Function: Lift

basis: no mailing action: 9,000 · 72.00
goal = extra income:

  liftM = 43.80·cM + 66.30·nkM − 72.00·nkM = 43.80·cM − 5.70·nkM

(cM: cancellers who get an offer; nkM: non-cancellers who get an offer, who would have paid 72.00 without one)

                  will cancel | will not cancel
  gets an offer   43.80 €     | 66.30 €
  gets no offer    0.00 €     | 72.00 €
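As a cross-check, the goal function as a small Java sketch (variable names as in the formula):

  // Extra income of a mailing action relative to "no mailing" (72.00 per loyal customer).
  class Lift {
      static double lift(int cM, int nkM) {
          return 43.80 * cM + (66.30 - 72.00) * nkM;
      }
      public static void main(String[] args) {
          // offer to everybody: 640,500 - 648,000 = -7,500, as on the previous slide
          System.out.println(lift(1000, 9000));
      }
  }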

Page 50:

Data

[Figure: screenshot of the data set – 32 input attributes, a results column, important attributes marked, missing values visible]

Page 51:

Feed-Forward Network – What to do?

– train the net with the training set (10,000 customers)
– test the net using the test set (another 10,000)
– classify all 10,000 test customers as cancellers or loyal
– evaluate the additional income

Page 52:

Results

data mining cup 2002

neural network project 2004

– gain: additional income from the mailing action, if the target group was chosen according to the analysis

Page 53:

Review: Students' Project

– a copy of the data mining cup:
  – real data
  – known results
  – contest

motivation – enthusiasm – better results

• wishes:
  – an engineering approach to data mining
  – real data for teaching purposes

Page 54:

Data Mining Cup 2007

– started on April 10
– check-out couponing: who will get a rebate coupon?
– 50,000 data sets for training

Page 55:

Data

Page 56:

DMC2007

– ~75% of the outputs are N(o),
  so a useful classification has to be more than 75% correct!
– first experiments: no success
– deadline: May 31st

Page 57:

Optimization of Neural Networks

objectives

– good results in an application:
  better generalisation (improve correctness)
– faster processing of patterns
  (improve efficiency)
– good presentation of the results
  (improve comprehension)

Page 58:

Ability to generalize

– network too large:
  – all training patterns are memorised
  – no ability to generalize
– network too small:
  – the rules of pattern recognition cannot be learned
    (simple example: perceptron and XOR)

• a trained net can classify data (out of the same class as the learning data) that it has never seen before
  – the aim of every ANN development

Page 59:

Development of an NN-application

[Flowchart:
1. build a network architecture
2. input of a training pattern
3. calculate the network output and compare it to the teaching output
   – error is too high: modify the weights, continue with 2
   – quality is good enough: continue with 4
4. use the test set data: evaluate the output, compare it to the teaching output
   – error is too high: change parameters, continue with 1
   – quality is good enough: done]

Page 60:

Possible Changes

– architecture of the NN:
  – size of the network
  – shortcut connections
  – partially connected layers
  – remove/add links
  – receptive areas

– find the right parameter values:
  – learning parameter
  – size of layers
  – using genetic algorithms

Page 61:

Memory Capacity

– figure out the memory capacity:
  – change the output layer: output layer := copy of the input layer
  – train the network with an increasing number of random patterns
  – error becomes small: the network stores all patterns
  – error remains high: the network cannot store all patterns
  – in between: the memory capacity

memory capacity: the number of patterns a network can store without generalisation

Page 62:

Memory Capacity - Experiment

– the output layer is a copy of the input layer
– training set consisting of n random patterns
– error:
  – error = 0:
    the network can store n or more patterns
  – error >> 0:
    the network cannot store n patterns
– memory capacity: the n with error > 0, while the error = 0 for n−1 patterns and error >> 0 for n+1 patterns

Page 63:

Layers Not fully Connected

– partially connected (e.g. 75%)
– remove links whose weight has been close to 0 for several training steps
– build new connections (by chance)

[Figure: layer connections – remaining, removed, and new links]

Page 64:

Summary

– Feed-forward networks
  – the perceptron (has its limits)
  – learning is maths

– Backpropagation is a "backpropagation of error" algorithm
  – works like gradient descent
  – activation functions: logistic, tanh

– Applications in data mining and pattern recognition
  – data preparation is important
  – finding an appropriate architecture