Machine Learning
Lecturer: A. Rabiee ([email protected]...)
References
Main Reference:
- Mitchell, T. M. (1997). Machine Learning. WCB.

Other References:
- Haykin, S. S. (2009). Neural Networks and Learning Machines (Vol. 3). Upper Saddle River: Pearson Education.
- Mitchell, T. M. (1999). Machine learning and data mining. Communications of the ACM, 42(11), 30-36.
- Anderson, J. R. (1986). Machine Learning: An Artificial Intelligence Approach (Vol. 2). R. S. Michalski, J. G. Carbonell, & T. M. Mitchell (Eds.). Morgan Kaufmann.
- Mitchell, M. (1998). An Introduction to Genetic Algorithms. MIT Press.
- Witten, I. H., & Frank, E. (2011). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.
- Hagan, M. T., Demuth, H. B., & Beale, M. H. (1996). Neural Network Design. Boston: PWS Publishing.
- Kecman, V. (2001). Learning and Soft Computing: Support Vector Machines, Neural Networks, and Fuzzy Logic Models. MIT Press.
- …
Course Outline
• Chapter 1: Introduction
• Chapter 3: Decision Tree Learning
• Chapter 4: Artificial Neural Networks
• Chapter 9: Genetic Algorithms
• Chapter 13: Reinforcement Learning
Course Evaluation
Final Exam: 50
Mini Projects (2 to 4): 20
Final Project + Presentation: 30
Paper (optional): +15
Chapter 1:
Introduction to Machine Learning
Table of Contents
• Definition & Examples
• Applications
• Why ML?
• ML Problems
Definition (Mitchell 1997)
• Machine Learning
– Learns from past experience
– Improves the performance of intelligent programs
• Definition
– A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Examples
• Text Classification (e.g., spam classification)
– Task T
• Assigning texts to a set of predefined categories
– Performance measure P
• Precision for each category
– Training experience E (dataset)
• A dataset of texts with their corresponding categories

• How about Disease Diagnosis?
• How about Chess Playing?
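The T/P/E recipe above can be made concrete with a tiny classifier. The sketch below is a minimal multinomial Naive Bayes text classifier; the toy training texts and the two categories ("trade", "ship") are made-up illustration data, not part of the course material.

```python
from collections import Counter, defaultdict
import math

# Training experience E (assumed toy data): texts with their categories.
train = [
    ("wheat corn exports rose sharply", "trade"),
    ("tariff deal boosts grain trade", "trade"),
    ("tanker docked at the port today", "ship"),
    ("cargo vessel sailed from the harbour", "ship"),
]

# Count word frequencies per category.
word_counts = defaultdict(Counter)
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    """Task T: assign a text to the most probable predefined category."""
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # log P(label) + sum of log P(word | label), with add-one smoothing
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict("grain exports and tariff news"))  # → trade
print(predict("vessel left the port"))           # → ship
```

Performance measure P would then be computed by comparing such predictions against held-out labeled texts.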
Two phases
• A learning process has two phases:
– Train
– Test
Example: Classification of texts based on content
Phase 1 (train): a text classifier is learned from a training set of classified text files:
Text file 1 → trade
Text file 2 → ship
… → …
Phase 2 (test): the trained classifier assigns a class to a new text file.
Example: Heart disease diagnosis
Phase 1 (train): a disease classifier is learned from a database of medical records:
Patient 1's data → Absence
Patient 2's data → Presence
… → …
Phase 2 (test): the trained classifier predicts presence or absence of the disease from a new patient's data.
Example: Chess Playing
Phase 1 (train): a strategy for searching and evaluating moves is learned from games played:
Game 1's move list → Win
Game 2's move list → Lose
… → …
Phase 2 (test): given a new matrix representing the current board, the learned strategy outputs the best move.
Machine Learning Problems
• Clustering: grouping similar instances
• Dimension Reduction: e.g., image compression
• Regression: e.g., tuning the angle of a robot arm
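For the regression problem, the simplest case is fitting a line by least squares. The sketch below fits angle ≈ a·position + b in pure Python; the five data points are made up for illustration, not taken from any real robot arm.

```python
# Toy regression sketch: fit angle = a * position + b by least squares.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]   # e.g., target positions (assumed data)
ys = [0.1, 2.1, 3.9, 6.2, 7.9]   # e.g., measured arm angles (assumed data)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares solution for slope a and intercept b.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(round(a, 2), round(b, 2))  # → 1.97 0.1
```

Unlike classification, the output here is a continuous value rather than a discrete class label.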
Application: Image Categorization (two phases)
Training phase: training images and their training labels pass through feature extraction; the image features and labels feed classifier training, which produces a trained classifier.
Testing phase: a test image passes through the same feature extraction, and the trained classifier predicts its label (e.g., "Outdoor").
Example: Boundary Detection
• Is this a boundary?
Training algorithm: the same pipeline applies — training images and labels → image features → classifier training → trained classifier.
The main aim of this course
Classifier Training
Example: A 2-class classifier
• Given some set of features with corresponding labels, learn a function to predict the labels from the features
• Example: Credit scoring
Discriminant (model): IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
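The discriminant above is a function from features to labels and can be written directly as code. In this sketch the threshold values for θ1 and θ2 are purely hypothetical; in practice they would be learned from labeled data.

```python
# Hypothetical thresholds (theta1, theta2) -- not values from the course.
THETA1 = 30_000   # income threshold (assumed)
THETA2 = 5_000    # savings threshold (assumed)

def classify(income, savings):
    """IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"

print(classify(45_000, 8_000))   # → low-risk
print(classify(45_000, 2_000))   # → high-risk
```

Learning, in this setting, means searching for the θ1 and θ2 that optimize the performance measure P on the training data.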
Different Learning Algorithms
• Decision Tree Learning
• Neural Networks
• Naïve Bayes
• Genetic Algorithms
• K-Nearest Neighbor (instance-based)
• Reinforcement Learning
• Support Vector Machines (SVM)
• …
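To give a feel for one of the simplest methods in the list, here is a minimal k-nearest-neighbor classifier (k = 3). The 2-D points and labels are made-up illustration data.

```python
import math

# Toy labeled points in 2-D (assumed data): class "A" near (1.5, 1.5),
# class "B" near (6.5, 6.5).
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
         ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B")]

def predict(point, k=3):
    # Sort the training points by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(point, item[0]))[:k]
    labels = [label for _, label in nearest]
    # Majority vote among the k nearest neighbors.
    return max(set(labels), key=labels.count)

print(predict((2.0, 2.0)))  # → A
print(predict((6.0, 7.0)))  # → B
```

Note that k-NN has no separate training phase: it simply stores the training experience E and defers all computation to test time.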
Note
The decision to use machine learning is more important than the choice of a particular learning method.
Why Is Machine Learning Possible?
• Mass storage
– More data available
• Higher-performance computers
– Larger memory for handling the data
– Greater computational power, enabling even online learning
Advantages
• Alleviates the knowledge acquisition bottleneck
– Does not require knowledge engineers
– Scalable for constructing knowledge bases
• Adaptive
– Adapts to changing conditions
– Easy to migrate to new domains
Success of Machine Learning
• Almost all learning algorithms
– Text classification (Dumais et al. 1998)
– Gene or protein classification, optionally with feature engineering (Bhaskar et al. 2006)
• Reinforcement learning
– Backgammon (Tesauro 1995)
• Learning of sequence labeling
– Speech recognition (Lee 1989)
– Part-of-speech tagging (Church 1988)
Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• UCI KDD Archive: http://kdd.ics.uci.edu/summary.data.application.html
• Statlib: http://lib.stat.cmu.edu/
• Delve: http://www.cs.utoronto.ca/~delve/
• US government free data: data.gov
• US government free data (California): data.ca.gov
• Similar sites exist for other states, and for the UK: data.gov.uk
• Stock market software
• Weather forecasting websites
• Reuters: dataset for text classification
• …
What I will Talk about
• Machine learning methods
– Simple methods
– Effective methods (state of the art)
• Method details
– Ideas
– Assumptions
– Intuitive interpretations
What I won’t Talk about
• Machine learning methods
– Classical, but complex and not effective methods (e.g., complex neural networks)
– Methods not widely used
• Method details
– Theoretical justification
– Theorem proving
What You will Learn
• Machine learning basics
– Methods
– Data
– Assumptions
– Ideas
• Others
– Problem-solving techniques
– Extensive knowledge of modern techniques