Machine Learning
Lecturer: A. Rabiee ([email protected]...)
References
Main Reference:
- Mitchell, T. M. (1997). Machine Learning. WCB.

Other References:
- Haykin, S. S. (2009). Neural Networks and Learning Machines (Vol. 3). Upper Saddle River: Pearson Education.
- Mitchell, T. M. (1999). Machine learning and data mining. Communications of the ACM, 42(11), 30-36.
- Anderson, J. R. (1986). Machine Learning: An Artificial Intelligence Approach (Vol. 2). R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.). Morgan Kaufmann.
- Mitchell, M. (1998). An Introduction to Genetic Algorithms. MIT Press.
- Witten, I. H., & Frank, E. (2011). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.
- Hagan, M. T., Demuth, H. B., & Beale, M. H. (1996). Neural Network Design. Boston: PWS Publishing.
- Kecman, V. (2001). Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models. MIT Press.
- …
Course Outline
• Chapter 1: Introduction
• Chapter 3: Decision Tree Learning
• Chapter 4: Artificial Neural Networks
• Chapter 9: Genetic Algorithms
• Chapter 13: Reinforcement Learning
Course Evaluation
Final Exam: 50
Mini Projects (2 to 4): 20
Final Project + Presentation: 30
Paper (optional): +15
Chapter 1:
Introduction to Machine Learning
Table of Contents
• Definition & Examples
• Applications
• Why ML?
• ML Problems
Definition (Mitchell 1997)
• Machine Learning
– Learns from past experience
– Improves the performance of intelligent programs
• Definition
– A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Examples
• Text Classification (e.g., spam classification)
– Task T
• Assigning texts to a set of predefined categories
– Performance measure P
• Precision for each category
– Training experience E (dataset)
• A dataset of texts with their corresponding categories

• How about Disease Diagnosis?
• How about Chess Playing?
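The T/P/E recipe above can be made concrete with a tiny classifier. The sketch below is a minimal multinomial Naive Bayes text classifier; the toy training texts and the two categories ("trade", "ship") are made-up illustration data, not part of the course material.

```python
from collections import Counter, defaultdict
import math

# Training experience E (assumed toy data): texts with their categories.
train = [
    ("wheat corn exports rose sharply", "trade"),
    ("tariff deal boosts grain trade", "trade"),
    ("tanker docked at the port today", "ship"),
    ("cargo vessel sailed from the harbour", "ship"),
]

# Count word frequencies per category.
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    """Task T: assign a text to the most probable predefined category."""
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # log P(label) + sum of log P(word | label), with add-one smoothing
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict("grain exports and tariff news"))  # → trade
print(predict("vessel left the port"))           # → ship
```

Performance measure P would then be computed by comparing such predictions against held-out labeled texts.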
Two phases
• A learning process has two phases:
– Train
– Test
Example: Classification of texts based on content
Phase 1 (train): a text classifier is learned from a training set of classified text files:
Text file 1 → trade
Text file 2 → ship
… → …
Phase 2 (test): the trained classifier assigns a class to a new text file.
Example: Heart disease diagnosis
Phase 1 (train): a disease classifier is learned from a database of medical records:
Patient 1's data → Absence
Patient 2's data → Presence
… → …
Phase 2 (test): the trained classifier predicts presence or absence of the disease from a new patient's data.
Example: Chess Playing
Phase 1 (train): a strategy for searching and evaluating moves is learned from games played:
Game 1's move list → Win
Game 2's move list → Lose
… → …
Phase 2 (test): given a new matrix representing the current board, the learned strategy outputs the best move.
Machine Learning Problems
• Clustering: grouping similar instances
• Dimension Reduction: e.g., image compression
• Regression: e.g., tuning the angle of a robot arm
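For the regression problem, the simplest case is fitting a line by least squares. The sketch below fits angle ≈ a·position + b in pure Python; the five data points are made up for illustration, not taken from any real robot arm.

```python
# Toy regression sketch: fit angle = a * position + b by least squares.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]   # e.g., target positions (assumed data)
ys = [0.1, 2.1, 3.9, 6.2, 7.9]   # e.g., measured arm angles (assumed data)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares solution for slope a and intercept b.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(round(a, 2), round(b, 2))  # → 1.97 0.1
```

Unlike classification, the output here is a continuous value rather than a discrete class label.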
Application: Image Categorization (two phases)
Training phase: training images and their training labels pass through feature extraction; the image features and labels feed classifier training, which produces a trained classifier.
Testing phase: a test image passes through the same feature extraction, and the trained classifier predicts its label (e.g., "Outdoor").
Example: Boundary Detection
• Is this a boundary?
Training algorithm: the same pipeline applies — training images and labels → image features → classifier training → trained classifier.
The main aim of this course
Classifier Training
Example: A 2-class classifier
• Given some set of features with corresponding labels, learn a function to predict the labels from the features
• Example: Credit scoring
Discriminant (model): IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
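The discriminant above is a function from features to labels and can be written directly as code. In this sketch the threshold values for θ1 and θ2 are purely hypothetical; in practice they would be learned from labeled data.

```python
# Hypothetical thresholds (theta1, theta2) -- not values from the course.
THETA1 = 30_000   # income threshold (assumed)
THETA2 = 5_000    # savings threshold (assumed)

def classify(income, savings):
    """IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"

print(classify(45_000, 8_000))   # → low-risk
print(classify(45_000, 2_000))   # → high-risk
```

Learning, in this setting, means searching for the θ1 and θ2 that optimize the performance measure P on the training data.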
Different Learning Algorithms
• Decision Tree Learning
• Neural Networks
• Naïve Bayes
• Genetic Algorithms
• K-Nearest Neighbor (instance-based)
• Reinforcement Learning
• Support Vector Machines (SVM)
• …
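To give a feel for one of the simplest methods in the list, here is a minimal k-nearest-neighbor classifier (k = 3). The 2-D points and labels are made-up illustration data.

```python
import math

# Toy labeled points in 2-D (assumed data): class "A" near (1.5, 1.5),
# class "B" near (6.5, 6.5).
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
         ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B")]

def predict(point, k=3):
    # Sort the training points by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(point, item[0]))[:k]
    labels = [label for _, label in nearest]
    # Majority vote among the k nearest neighbors.
    return max(set(labels), key=labels.count)

print(predict((2.0, 2.0)))  # → A
print(predict((6.0, 7.0)))  # → B
```

Note that k-NN has no separate training phase: it simply stores the training experience E and defers all computation to test time.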
Note
The decision to use machine learning is more important than the choice of a particular learning method.
Why Is Machine Learning Possible?
• Mass storage
– More data available
• Higher-performance computers
– Larger memory for handling the data
– Greater computational power, enabling even online learning
Advantages
• Alleviates the knowledge acquisition bottleneck
– Does not require knowledge engineers
– Scalable for constructing knowledge bases
• Adaptive
– Adapts to changing conditions
– Easy to migrate to new domains
Success of Machine Learning
• Almost all learning algorithms
– Text classification (Dumais et al. 1998)
– Gene or protein classification, optionally with feature engineering (Bhaskar et al. 2006)
• Reinforcement learning
– Backgammon (Tesauro 1995)
• Learning of sequence labeling
– Speech recognition (Lee 1989)
– Part-of-speech tagging (Church 1988)
Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• UCI KDD Archive: http://kdd.ics.uci.edu/summary.data.application.html
• Statlib: http://lib.stat.cmu.edu/
• Delve: http://www.cs.utoronto.ca/~delve/
• US government free data: data.gov
• US government free data (California): data.ca.gov
• Similar sites exist for other states, and for the UK: data.gov.uk
• Stock market software
• Weather forecasting websites
• Reuters: dataset for text classification
• …
What I will Talk about
• Machine learning methods
– Simple methods
– Effective methods (state of the art)
• Method details
– Ideas
– Assumptions
– Intuitive interpretations
What I won’t Talk about
• Machine learning methods
– Classical, but complex and not effective methods (e.g., complex neural networks)
– Methods not widely used
• Method details
– Theoretical justification
– Theorem proving
What You will Learn
• Machine learning basics
– Methods
– Data
– Assumptions
– Ideas
• Others
– Problem-solving techniques
– Extensive knowledge of modern techniques