chapter 6. classification and prediction - rizal setya · pdf filechapter 6. classification...


Upload: lequynh

Post on 07-Feb-2018


Chapter 6. Classification and Prediction

Eager Learners: when given a set of training tuples, will construct a generalization model before receiving new tuples to classify

Classification by decision tree induction

Rule-based classification

Classification by back propagation

Support Vector Machines (SVM)

Associative classification


Lazy vs. Eager Learning


Lazy learning (e.g., instance-based learning): Simply stores training data (or only minor processing) and waits until it is given a test tuple

Eager learning (the methods discussed above): given a training set, constructs a classification model before receiving new (e.g., test) data to classify

Lazy: less time in training but more time in predicting


Lazy Learner: Instance-Based Methods

Typical approaches

k-nearest neighbor approach

Instances represented as points in a Euclidean space.


The k-Nearest Neighbor Algorithm

All instances correspond to points in the n-D space

The nearest neighbors are defined in terms of Euclidean distance, dist(X1, X2)

Target function could be discrete- or real-valued

For a discrete-valued target function, k-NN returns the most common value among the k training examples nearest to the query point xq

[Figure: positive (+) and negative (−) training examples scattered around the query point xq]
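The majority-vote rule for a discrete-valued target can be sketched as follows. This is a minimal illustration, not the slides' own implementation: the function names `euclidean` and `knn_classify` and the 2-D toy data are assumptions made for the example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, query, k):
    """Return the most common label among the k training points nearest to query.

    train is a list of (point, label) pairs.
    """
    neighbors = sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two "+" points and three "-" points.
train = [
    ((1.0, 1.0), "+"),
    ((1.5, 2.0), "+"),
    ((5.0, 5.0), "-"),
    ((6.0, 5.5), "-"),
    ((5.5, 6.5), "-"),
]

print(knn_classify(train, (1.2, 1.4), k=3))  # prints "+"
```

With k = 3 the two nearby "+" points outvote the single "-" point that completes the neighborhood.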


The k-Nearest Neighbor Algorithm

k-NN for real-valued prediction for a given unknown tuple

Returns the mean value of the k nearest neighbors

Distance-weighted nearest neighbor algorithm

Weight the contribution of each of the k neighbors according to their distance to the query xq

Give greater weight to closer neighbors

Robust to noisy data by averaging k-nearest neighbors
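Both the plain mean and the distance-weighted variant for real-valued prediction can be sketched in a few lines. The slides only say that closer neighbors get greater weight; the specific weight 1/d² used here is an assumption (a common choice), and the function name `knn_regress` and the 1-D toy data are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_regress(train, query, k, weighted=False):
    """Predict a real value from the k nearest (point, value) pairs in train."""
    neighbors = sorted(
        ((euclidean(p, query), y) for p, y in train), key=lambda t: t[0]
    )[:k]
    if not weighted:
        # Plain k-NN: mean of the neighbors' values.
        return sum(y for _, y in neighbors) / len(neighbors)
    # A neighbor at distance 0 would get infinite weight; return its value.
    for d, y in neighbors:
        if d == 0:
            return y
    # Assumed weighting scheme: w = 1 / d^2, so closer neighbors count more.
    weights = [1.0 / d ** 2 for d, _ in neighbors]
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / sum(weights)

# Hypothetical 1-D training data: (point, target value).
train = [((1.0,), 10.0), ((2.0,), 20.0), ((4.0,), 40.0)]

print(knn_regress(train, (1.5,), k=3))                 # plain mean of all three
print(knn_regress(train, (1.5,), k=3, weighted=True))  # pulled toward the two close points
```

The weighted prediction sits closer to the values of the two nearby points than the plain mean does, because the distant third neighbor contributes very little.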


The k-Nearest Neighbor Algorithm

How can I determine the value of k, the number of neighbors?

In general, the larger the number of training tuples, the larger the value of k will be

Nearest-neighbor classifiers can be extremely slow when classifying test tuples: with n stored tuples, each classification requires O(n) comparisons

By presorting and arranging the stored tuples into a search tree, the number of comparisons can be reduced to O(log n)
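The effect of presorting is easiest to see in one dimension, where a sorted list plus binary search plays the role of the search tree. The sketch below is only an illustration of that idea under this 1-D assumption; the slides do not name a specific tree structure, and the helper `nearest_sorted` is hypothetical.

```python
from bisect import bisect_left

def nearest_sorted(sorted_values, q):
    """Return the stored value closest to q using O(log n) comparisons.

    sorted_values must already be sorted in ascending order.
    """
    i = bisect_left(sorted_values, q)
    # The nearest value is either just before or just at the insertion point.
    candidates = sorted_values[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda v: abs(v - q))

# Presort once at "training" time, then answer each query by binary search.
values = sorted([7.0, 1.0, 4.0, 9.5, 3.0])

print(nearest_sorted(values, 4.4))  # prints 4.0
```

The sort is paid once up front; each later query inspects at most two candidates after the binary search instead of scanning all n stored values.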


The k-Nearest Neighbor Algorithm

Example:

K=5


Distance Measure
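Assuming the measure meant here is the Euclidean distance dist(X1, X2) named earlier for k-NN, it can be written for two points X1 = (x11, ..., x1n) and X2 = (x21, ..., x2n) in n-D space as:

```latex
\operatorname{dist}(X_1, X_2) = \sqrt{\sum_{i=1}^{n} \left( x_{1i} - x_{2i} \right)^2}
```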


Distance Measure Example

WINE    CHERRY  CHEWY TANNINS  BEAUTY
WINE1   1       1              1
WINE2   0       0              1
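As a worked check, the distance between the two wines can be computed directly, assuming the Euclidean distance mentioned earlier is the intended measure and treating the three attribute values of each wine as a vector.

```python
import math

# Attribute vectors copied from the example table (three values per wine).
wine1 = (1, 1, 1)
wine2 = (0, 0, 1)

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

d = euclidean(wine1, wine2)
print(round(d, 3))  # prints 1.414, i.e. sqrt(1 + 1 + 0)
```

The two wines differ in two of the three attributes, so the squared differences sum to 2.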


What is the prediction if k = 1? k = 3? k = 5?

WINE     BLUEBERRY  BLACKBERRY  CHERRY  CHEWY TANNINS  SCORE
Wine1    1          0           0       1              90+
Wine2    1          1           1       0              90+
Wine3    0          1           0       1              90+
Wine4    0          1           1       1              90-
Wine5    0          0           1       0              90-
Wine6    0          0           1       1              90-
Unknown  1          1           0       1              ?
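One way to answer the question is to run k-NN over the table. The sketch below assumes Euclidean distance on the four binary attributes and breaks distance ties by table order; both choices are assumptions, since the slide does not state them. The function name `predict` is hypothetical.

```python
import math
from collections import Counter

# Rows copied from the table: (name, four attribute values, score).
wines = [
    ("Wine1", (1, 0, 0, 1), "90+"),
    ("Wine2", (1, 1, 1, 0), "90+"),
    ("Wine3", (0, 1, 0, 1), "90+"),
    ("Wine4", (0, 1, 1, 1), "90-"),
    ("Wine5", (0, 0, 1, 0), "90-"),
    ("Wine6", (0, 0, 1, 1), "90-"),
]
unknown = (1, 1, 0, 1)

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict(k):
    """Majority score among the k wines nearest to the unknown wine."""
    # sorted() is stable, so wines at equal distance keep their table order.
    ranked = sorted(wines, key=lambda w: euclidean(w[1], unknown))
    votes = Counter(score for _, _, score in ranked[:k])
    return votes.most_common(1)[0][0]

for k in (1, 3, 5):
    print(k, predict(k))  # prints "90+" for each k
```

The squared distances to the unknown wine are 1 (Wine1, Wine3), 2 (Wine2, Wine4), 3 (Wine6) and 4 (Wine5). Under these assumptions the prediction is 90+ for k = 1, 3 and 5, and the ties at equal distance do not change the outcome because the 90+ votes already form a majority in each case.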