Lecture 2: Introduction to Machine Learning
Machine Learning Definition
• Field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959)
• Study of algorithms that improve their performance P at some task T with experience E (Tom Mitchell, 1998)
Well-defined learning task: <P, T, E>
T: Playing checkers
P: Percentage of games won
E: Playing against itself
Well Defined Learning Task
• Handwriting Recognition
– Task T: recognizing and classifying handwritten words within images
– Performance P: percent of words correctly classified
– Training experience E: a database of written words with given classifications
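As a rough illustration of the performance measure P in the handwriting example, the sketch below computes the percentage of correctly classified words. The function name `percent_correct` and the word lists are hypothetical and only serve to make the definition concrete.

```python
def percent_correct(predicted, actual):
    """Return the percentage of items whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Hypothetical data: true words in the images and the words a recognizer produced.
true_words = ["cat", "dog", "tree", "house"]
guessed_words = ["cat", "dog", "free", "house"]

performance = percent_correct(guessed_words, true_words)
print(f"P = {performance}% of words correctly classified")
```

Here the task T is the recognizer that produced `guessed_words`, and the experience E would be the labelled database it was trained on.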
Question
• Suppose your email program watches which emails you do and do not mark as spam and, based on that, learns how to filter spam better. What is the task T in this setting?
– Classifying emails as spam or not spam
– The number of emails correctly classified as spam/not spam
– Watching you label emails as spam/not spam
– None of the above: this is not a machine learning problem
Machine Learning Algorithms
• Supervised Learning Algorithms
• Unsupervised Learning Algorithms
Supervised Learning
• The right answers (labels) are given for the training inputs
• Regression refers to predicting a continuous-valued output (e.g. price)
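A minimal sketch of regression, assuming a single input feature and a straight-line model fitted by ordinary least squares. The lecture does not specify a model, so the line fit, the function name `fit_line`, and the size/price numbers are all illustrative assumptions.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: the "right answers" (prices) are given for each input (size).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(sizes, prices)

# Predict a continuous-valued output for an unseen input.
predicted_price = slope * 90.0 + intercept
print(predicted_price)
```

The output is a real number rather than a category, which is what makes this a regression problem.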
Supervised Learning
• Classification refers to predicting a discrete-valued output (e.g. 0 or 1)
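A minimal sketch of classification, assuming a 1-nearest-neighbour rule: a new input receives the label of the closest labelled training example. The lecture does not name a classifier, so this choice, the function name `nearest_neighbour_label`, and the data points are assumptions made for illustration.

```python
def nearest_neighbour_label(train_points, train_labels, query):
    """Return the label of the training point closest to `query` (squared Euclidean distance)."""
    def squared_distance(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    best_index = min(range(len(train_points)),
                     key=lambda i: squared_distance(train_points[i]))
    return train_labels[best_index]


# Hypothetical training data: two features per example, labels are 0 or 1.
train_points = [(1.0, 1.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
train_labels = [0, 0, 1, 1]

label_a = nearest_neighbour_label(train_points, train_labels, (1.5, 1.2))
label_b = nearest_neighbour_label(train_points, train_labels, (8.5, 8.5))
print(label_a, label_b)
```

Unlike regression, the prediction is always one of a fixed set of discrete values.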
Supervised Learning
• More sophisticated features are:
– Uniformity of cell size
– Uniformity of cell shape, etc.
Question
Suppose you are running a company and want to develop a learning algorithm to address each of two problems:
• Problem 1: You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.
• Problem 2: You would like your program to examine individual customer accounts and, for each account, decide whether it has been hacked or not.
• Should you treat these as classification or regression problems?
– Treat both as classification problems
– Treat problem 1 as a classification problem and problem 2 as a regression problem
– Treat both as regression problems
– Treat problem 1 as a regression problem and problem 2 as a classification problem
Unsupervised Learning
Unsupervised Learning Application
Figure: DNA microarray data of individuals
Unsupervised Learning Application
Figure: Average power consumption [W] versus window number [#], with regions labelled Fridge; Fridge and computer; Fridge, computer and dishwasher. Window size = 2 minutes.
Unsupervised Learning Application
Figure: Average power consumption [W] versus window number [#], with each window assigned to one of seven states (S1 to S7) and regions labelled Fridge; Fridge and computer; Fridge, computer and dishwasher. Three zoomed panels show:
– State sequence of fridge: S1 S4 S3
– State sequence of fridge and computer: S2 S5 S2 S5
– State sequence of dishwasher: S6 S7 S6 S7
Window size = 2 minutes.
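The power-consumption figures group unlabelled windows into states. The slides do not say which algorithm produced those states; as one possible illustration, the sketch below clusters per-window average power values with a simple one-dimensional k-means. The function name `kmeans_1d`, the readings, and the choice of k = 3 are assumptions, not values taken from the figures.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values into k groups; return (centroids, label per value)."""
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly through the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]

    def nearest(value):
        return min(range(k), key=lambda j: abs(value - centroids[j]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest(v)].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]

    labels = [nearest(v) for v in values]
    return centroids, labels


# Hypothetical average power per 2-minute window, in watts. No labels are given.
power = [100.0, 105.0, 98.0, 180.0, 185.0, 178.0, 2000.0, 2050.0, 1990.0]

centroids, labels = kmeans_1d(power, k=3)
print(centroids)
print(labels)
```

No right answers are supplied here: the algorithm only sees the readings and discovers the groups itself, which is the defining property of unsupervised learning.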
Unsupervised Learning Applications
Unsupervised Learning Application: Cocktail Party Problem
Unsupervised Learning Application: Cocktail Party Algorithm
Question
• Of the following examples, which would you address using an unsupervised learning algorithm?
– Given emails labelled as spam/not spam, learn a spam filter.
– Given a set of news articles on the web, group them into sets of articles about the same story.
– Given a database of customer data, automatically discover market segments and group customers into different market segments.
– Given a database of patients diagnosed as either having diabetes or not, learn to classify a new patient as either having diabetes or not.
Ungraded Assignment
• Install Octave (open-source software)
• Practice with:
– Elementary operations: addition, subtraction, multiplication, power, division, etc.
– Comparison operations: equal, not equal, greater than, greater than or equal to, etc.
– Logical operations: AND, OR, XOR, etc.
– Variable assignment
– Vectors and matrices: defining vectors and matrices, ones, zeros, rand, eye
– The doc and help commands