classification: perceptron -...

21
Classification: Perceptron Industrial AI Lab. Prof. Seungchul Lee

Upload: vuhuong

Post on 24-Jul-2019

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Classification: Perceptron

Industrial AI Lab.

Prof. Seungchul Lee

Page 2: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Classification

• Where 𝑦 is a discrete value– Develop the classification algorithm to determine which class a new input should fall into

• Start with a binary class problem– Later look at multiclass classification problem, although this is just an extension of binary

classification

• We could use linear regression– Then, threshold the classifier output (i.e. anything over some value is yes, else no)

– linear regression with thresholding seems to work

2

Page 3: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Classification

• We will learn– Perceptron

– Support vector machine (SVM)

– Logistic regression

• To find

a classification boundary

3

Page 4: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron

• For input 𝑥 =

𝑥1⋮𝑥𝑑

'attributes of a customer’

• Weights 𝜔 =

𝜔1

⋮𝜔𝑑

4

Page 5: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron

• Introduce an artificial coordinate 𝑥0 = 1 :

• In a vector form, the perceptron implements

5

Page 6: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron

• Works for linearly separable data

• Hyperplane– Separates a D-dimensional space into two half-spaces

– Defined by an outward pointing normal vector

– 𝜔 is orthogonal to any vector lying on the hyperplane

– Assume the hyperplane passes through origin, 𝜔𝑇𝑥 = 0 with 𝑥0 = 1

6

Page 7: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Sign

• Sign with respect to a line

7

Page 8: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

How to Find 𝝎

• All data in class 1 (𝑦 = 1)– 𝑔 𝑥 > 0

• All data in class 0 (𝑦 = −1)– 𝑔 𝑥 < 0

8

Page 9: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm

• The perceptron implements

• Given the training set

1) pick a misclassified point

2) and update the weight vector

9

Page 10: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm

• Why perceptron updates work ?

• Let's look at a misclassified positive example 𝑦𝑛 = +1

– Perceptron (wrongly) thinks 𝜔𝑜𝑙𝑑𝑇 𝑥𝑛 < 0

– Updates would be

– Thus 𝜔𝑛𝑒𝑤𝑇 𝑥𝑛 is less negative than 𝜔𝑜𝑙𝑑

𝑇 𝑥𝑛

10

Page 11: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Iterations of Perceptron

1. Randomly assign 𝜔

2. One iteration of the PLA (perceptron learning algorithm)

where (𝑥, 𝑦) is a misclassified training point

3. At iteration 𝑖 = 1, 2, 3,⋯ , pick a misclassified point from

4. And run a PLA iteration on it

5. That's it!

11

Page 12: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Diagram of Perceptron

12

Page 13: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Loss Function

• Loss = 0 on examples where perceptron is correct,

i.e., 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 > 0

• Loss > 0 on examples where perceptron is misclassified,

i.e., 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 < 0

• Note:

– sign 𝜔𝑇𝑥𝑛 ≠ 𝑦𝑛 is equivalent to 𝑦𝑛 ∙ 𝜔𝑇𝑥𝑛 < 0

13

Page 14: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm in Python

14

Page 15: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm in Python

• Unknown parameters 𝜔

15

Page 16: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm in Python

16

where 𝑥, 𝑦 is a misclassified training point

Page 17: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm in Python

17

Page 18: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm in Python

18

Page 19: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Perceptron Algorithm in Python

19

Page 20: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

Scikit-Learn for Perceptron

20

Page 21: Classification: Perceptron - i-systems.github.ioi-systems.github.io/.../iAI/ML/topics/05_Classification/09_Classification_Perceptron.pdf · •Perceptron finds one of the many possible

The Best Hyperplane Separator?

• Perceptron finds one of the many possible hyperplanes separating the data if one exists

• Of the many possible choices, which one is the best?

• Utilize distance information

• Intuitively we want the hyperplane having the maximum margin

• Large margin leads to good generalization on the test data

– we will see this formally when we discuss Support Vector Machine (SVM)

• Utilize distance information from all data samples

– We will see this formally when we discuss the logistic regression

• Perceptron will be shown to be a basic unit for neural networks and deep learning later

21