Data Mining Lecture 8: Artificial Neural Networks
Post on 30-Dec-2015
8
Conventional computers vs. neural nets
In a von Neumann machine:
- fetch-decode-execute-…
In a neural net:
- a different style of processing (signal processor)
- massively parallel processing
- information is stored in a set of weights (memory)
- robust to noise and hardware failure
- …

[Figure: von Neumann architecture - CPU and memory exchanging instructions and data]
9
Biological information
Statistics about the human body:
- The average power consumed by the brain is roughly constant, about 20 watts.
- The brain is about 2% of total body weight (1.3 kg) but consumes about 20% of the body's oxygen.
- The brain (cerebral cortex) is a large sheet of neurons, 2 to 3 millimeters thick with an area of about 2200 cm², containing roughly 10^11 neurons (~ # of stars in the galaxy), each with 10^3 to 10^4 connections. This sheet is tightly folded to fit into the limited volume of the skull.
- Neurons are the oldest and longest-lived cells in the body. Like other cells, some die and are replaced, but some are never replaced.
- Some neurons are very long, up to a few feet (about 1 meter), running from the stimulus site to the spinal cord.
11
Artificial Neural Networks (ANN)
X1 X2 X3 Y
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  1
0  0  1  0
0  1  0  0
0  1  1  1
0  0  0  0
[Figure: black box with inputs X1, X2, X3 and output Y]
Output Y is 1 if at least two of the three inputs are equal to 1.
12
Artificial Neural Networks (ANN)
X1 X2 X3 Y
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  1
0  0  1  0
0  1  0  0
0  1  1  1
0  0  0  0
[Figure: black box with input nodes X1, X2, X3 (weight 0.3 each) feeding an output node with threshold t = 0.4]

Y = I(0.3 X1 + 0.3 X2 + 0.3 X3 - 0.4 > 0), where I(z) = 1 if z is true and 0 otherwise.
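The thresholded unit on this slide can be checked directly; a minimal sketch in Python (the function names are ours, not from the slides):

```python
from itertools import product

def I(z):
    """Indicator: 1 if the condition z is true, 0 otherwise."""
    return 1 if z else 0

def unit(x1, x2, x3):
    # Y = I(0.3*X1 + 0.3*X2 + 0.3*X3 - 0.4 > 0), as on the slide
    return I(0.3 * x1 + 0.3 * x2 + 0.3 * x3 - 0.4 > 0)

# Enumerate all 8 input combinations and print the unit's output
for x in product([0, 1], repeat=3):
    print(x, unit(*x))
```

The unit outputs 1 exactly when at least two of the three inputs are 1, reproducing the truth table above.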
13
Artificial Neural Networks (ANN)
The model is an assembly of interconnected nodes and weighted links.
The output node sums its input values, weighted by the corresponding links.
The weighted sum is compared against a threshold t.
[Figure: black box with input nodes X1, X2, X3 (weights w1, w2, w3) feeding an output node with threshold t]

Perceptron model:

Y = I(Σ_i w_i X_i - t > 0)   or   Y = sign(Σ_i w_i X_i - t)
14
Definition
The basic model: an artificial neuron
Warren McCulloch & Walter Pitts (1943)
[Figure: artificial neuron - inputs x1, x2, x3, …, xn with weights w1, w2, w3, …, wn and a constant input 1 with weight w0 feed a summation unit Σ; the weighted sum w^T x is compared against the threshold b]

Activation: H(w^T x - b) = 1 if w^T x ≥ b, and 0 otherwise.
16
Task
Learn a binary classification f: ℝⁿ → {0,1} given examples (x, y) in ℝⁿ × {0,1} (positive/negative examples). Evaluation: mean number of misclassifications on a test set.
17
Linear Classification
The equation below describes a hyperplane in the input space. This hyperplane is used to separate the two classes C1 and C2
Σ_{i=1..m} w_i x_i + b = 0

[Figure: two classes C1 and C2 in the (x1, x2) plane. The decision boundary is the line w1x1 + w2x2 + b = 0; the decision region for C1 is w1x1 + w2x2 + b > 0 and the decision region for C2 is w1x1 + w2x2 + b <= 0]
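The decision rule above can be sketched as follows; the weight vector (1, 1) and bias -1 are example values we chose for illustration, not from the slide:

```python
def classify(x, w, b):
    """Assign x to C1 if w·x + b > 0, else to C2 (boundary: w·x + b = 0)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "C1" if s > 0 else "C2"

# Hypothetical boundary x1 + x2 - 1 = 0, i.e. w = (1, 1), b = -1:
w, b = (1.0, 1.0), -1.0
print(classify((2.0, 2.0), w, b))  # point on the positive side of the line
print(classify((0.0, 0.0), w, b))  # point on the non-positive side
```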
18
Artificial Neuron
Using an activation function and a threshold, the neuron can implement a simple logic function:
Example: AND function
x1 x2 y
0  0  0
0  1  0
1  0  0
1  1  1

[Figure: neuron computing AND - inputs X1 and X2 with weights 1 and 1, threshold = 1.5]
19
Artificial Neuron
Example 2: OR function
x1 x2 y
0  0  0
0  1  1
1  0  1
1  1  1

[Figure: neuron computing OR - inputs X1 and X2 with weights 2 and 2, threshold = 1.5]
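Both gates can be verified with the weights and thresholds given on these two slides:

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (output 1) when the weighted input sum exceeds the threshold."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > threshold else 0

def AND(x1, x2):
    return threshold_unit((x1, x2), (1, 1), 1.5)   # only (1, 1) sums above 1.5

def OR(x1, x2):
    return threshold_unit((x1, x2), (2, 2), 1.5)   # any single 1 sums above 1.5

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2))
```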
20
Perceptron
Rosenblatt (1962)
– Adapted from visual perception in humans
– Pattern classification

[Figure: perceptron - weighted inputs (w1, w2, w3, …, wn) summed by Σ and compared with threshold θ; bias B]
21
Minsky & Papert (1969) offered a solution to the XOR problem by combining perceptron unit responses using a second layer of units.
[Figure: solution to the XOR problem - units 1 and 2 in the first layer feed unit 3 in the second layer]
22
XOR problem
[Figure: XOR in the (x1, x2) plane - the points (+1, -1) and (-1, +1) form one class and (+1, +1) and (-1, -1) the other, separated by two lines]
In this graph of the XOR function, input pairs giving output 1 and -1 are depicted as green and red points. These two classes cannot be separated by a single line; two lines are needed. The following NN with two hidden nodes realizes this non-linear separation: each hidden node is a perceptron and describes one of the two blue lines.
This NN uses the sign activation function. The two green arrows indicate the directions of the weight vectors of the two hidden nodes, (1, -1) and (-1, 1); they point toward the regions where the network output will be 1. The output node combines the outputs of the two hidden nodes.
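A sketch of this two-layer sign network, using the ±1 input encoding of the graph. The hidden weight vectors (1, -1) and (-1, 1) come from the slide; the bias values (-1 for each hidden node, +1 at the output) are assumptions we chose to place the two separating lines between the classes:

```python
def sign(z):
    return 1 if z > 0 else -1

def xor_net(x1, x2):
    """Two hidden sign perceptrons feeding one output sign unit."""
    h1 = sign(1 * x1 - 1 * x2 - 1)    # fires (+1) only for (+1, -1)
    h2 = sign(-1 * x1 + 1 * x2 - 1)   # fires (+1) only for (-1, +1)
    return sign(h1 + h2 + 1)          # +1 if either hidden node fired

for x in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print(x, xor_net(*x))
```

The output unit acts as an OR over the two hidden responses: the sum h1 + h2 + 1 is positive exactly when at least one hidden node fires.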
23
Types of decision regions
A single node implements a linear decision boundary: the input space is split by the line

w0 + w1 x1 + w2 x2 = 0,

with w0 + w1 x1 + w2 x2 ≥ 0 on one side and w0 + w1 x1 + w2 x2 < 0 on the other.

[Figures: (left) network with a single node - inputs x1 and x2 with weights w1 and w2, and a constant input 1 with weight w0; (right) a convex region bounded by lines L1, L2, L3, L4, realized with output threshold -3.5]
One-hidden-layer network that realizes the convex region: each hidden node realizes one of the lines bounding the convex region.
[Figure: one-hidden-layer network over inputs x1 and x2 with all hidden-to-output weights equal to 1; convex regions labeled P1, P2, P3; output threshold -0.5]
Two-hidden-layer network that realizes the union of three convex regions: each box represents a one-hidden-layer network realizing one convex region.
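The construction above can be sketched in a few lines. The output thresholds follow the slides (-3.5 for the AND of four bounding lines, -0.5 for the OR of regions); the unit-square region and its four lines are invented here for illustration:

```python
def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def half_plane(x, w, b):
    """Hidden node realizing one bounding line: fires where w·x + b > 0."""
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

def convex_region(x, lines):
    """AND of four bounding half-planes: hidden-to-output weights are 1,
    and the output threshold -3.5 fires only when all four hidden nodes fire."""
    return step(sum(half_plane(x, w, b) for w, b in lines) - 3.5)

def union(x, regions):
    """OR of region outputs: weights 1 and output threshold -0.5 fire
    when at least one region network fires."""
    return step(sum(convex_region(x, r) for r in regions) - 0.5)

# Hypothetical convex region: the unit square 0 < x1 < 1, 0 < x2 < 1,
# written as four half-planes (w, b) with w·x + b > 0 inside.
square = [((1, 0), 0), ((-1, 0), 1), ((0, 1), 0), ((0, -1), 1)]
print(convex_region((0.5, 0.5), square), convex_region((2.0, 0.5), square))
```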
25
Algorithm for learning ANN
Initialize the weights (w0, w1, …, wk)
Adjust the weights so that the output of the ANN is consistent with the class labels of the training examples.
– Objective function: E = Σ_i [Y_i - f(w, X_i)]²
– Find the weights w_i that minimize the objective function, e.g., with the backpropagation algorithm (see lecture notes).
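The squared-error objective can be minimized by gradient descent on the weights; a minimal sketch, assuming a linear unit f(w, x) = w·x (a simplification of the slide's f) and a small hypothetical dataset:

```python
def f(w, x):
    """Linear unit f(w, x) = w·x (an assumed simplification)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train(examples, n_features, lr=0.1, epochs=100):
    """Gradient descent on E = sum_i (Y_i - f(w, X_i))^2, one example at a time."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in examples:
            err = y - f(w, x)
            # dE/dw_j = -2 * err * x_j, so step each weight against the gradient
            w = [wj + lr * err * xj for wj, xj in zip(w, x)]
    return w

# Hypothetical data sampled from y = 2*x1 - x2, bias folded in as x0 = 1:
data = [((1, 1, 0), 2), ((1, 0, 1), -1), ((1, 1, 1), 1), ((1, 2, 1), 3)]
w = train(data, 3)
print([round(v, 2) for v in w])
```

Each update moves the weights in the direction that reduces the squared error on the current example, which is the core idea backpropagation generalizes to multi-layer networks.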
26
Training: Backprop algorithm
The BP algorithm searches for weight values that minimize the total error of the network over the training set.
• BP consists of the repeated application of the following two passes:
– Forward pass: the network is activated on one example and the error of (each neuron of) the output layer is computed.
– Backward pass: the network error is used for updating the weights (credit assignment problem). Starting at the output layer, the error is propagated backwards through the network, layer by layer.
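The two passes can be sketched as a minimal pure-Python backprop for a one-hidden-layer sigmoid network. The hidden size, learning rate, epoch count, and XOR training set are illustrative choices of ours, not from the slides:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
H = 3                                      # hidden-layer size (illustrative)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR

def forward(x):
    """Forward pass: activate the network on one example."""
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(2)) + b1[j]) for j in range(H)]
    out = sigmoid(sum(W2[j] * h[j] for j in range(H)) + b2)
    return h, out

def total_error():
    return sum((y - forward(x)[1]) ** 2 for x, y in data)

before = total_error()
lr = 0.5
for _ in range(10000):
    for x, y in data:
        h, out = forward(x)
        # Backward pass: propagate the error layer by layer and update weights
        d_out = (out - y) * out * (1 - out)           # dE/d(net) at the output
        for j in range(H):
            d_h = d_out * W2[j] * h[j] * (1 - h[j])   # dE/d(net) at hidden j
            W2[j] -= lr * d_out * h[j]
            for i in range(2):
                W1[j][i] -= lr * d_h * x[i]
            b1[j] -= lr * d_h
        b2 -= lr * d_out

print(round(before, 3), round(total_error(), 3))
```

After training, the total squared error over the four XOR examples is far below its initial value, showing that the repeated forward/backward passes do descend the error surface.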