Download - L1. State of the Art in Machine Learning
State of the Art in Machine Learning
Poul Petersen
BigML
2
What is ML?
“a field of study that gives computers the ability to learn without being explicitly
programmed”
Professor Arthur Samuel, 1959
•What “ability to learn” do computers have?
•What does “explicitly programmed” mean?
Square Feet
Price
Sacramento Real Estate Prices by sq_ft
3
Ability to Learn
4
Ability to Learn• Provide computer with examples of the relationship between
square footage X and price Y.
• The computer learns the equation of a line F() which fits the examples.
• The computer can now predict the price of any home knowing only the square footage: F( x ) = y
• This model is known as Linear Regression. There are other types of models.
• You may have noticed there was some points that did not fit. This is important!
5
Learning Problems (fit)
Under-fitting Over-fitting
• Model does not fit well enough
• Does not capture the underlying trend of the data
• Change algorithm or features
• Model fits too well does not “generalize”
• Captures the noise or outliers of the data
• Change algorithm or filter outliers
6
Learning Problems (missing)
• Missing values at training/prediction time
• Some algorithms can handle missing values, some no
• Missing data is sometimes important
• Replace missing values
• Predict missing values
7
Learning Problems (missing)
Missing@ Decision Trees KNN Logistic
RegressionNaive Bayes
Neural Networks
Training Yes No No Yes Yes*
Prediction Yes No No Yes No
8
Not Explicitly Programmed
Control System
“Customers go on vacation… We need a program that predicts if the light will come on when the control
system flips the switch.”
9
Not Explicitly Programmed
@iLoveRuby
Switch Light?
on TRUE
off FALSE
• @iLoveRuby reasoned the rules by experience
• Then programmed the rules explicitly
10
Not Explicitly ProgrammedSwitch Light?
on TRUE
off FALSE
• The ML System reasoned the rules from the data and created a Model
• Functionally the Model is the same as @iLoveRuby’s explicit model
@iLoveML
question
answer
ML System Model
11
Not Explicitly Programmed
• how long has the bulb has been in service
• reliability of brand
• rated power: higher power shorter life
• duty cycle
• room conditions: temperature, humidity
Goal is to predict if the bulb will come onbut the switch is not the important variable:
Even worse: None of these conditions is absolute.
12
Not Explicitly Programmedbrand power age duty temp humidity FAIL?koala 45 338 1 16 0.03 FALSEotter 15 140 1 27 0.27 FALSEkoala 15 315 1 19 0.37 TRUEotter 45 338 1 29 0.27 TRUEkoala 45 211 1 23 0.85 TRUEotter 15 328 1 17 0.56 FALSEkoala 15 318 2 22 0.45 TRUEkoala 15 273 1 27 0.18 FALSEkoala 45 102 1 21 0.48 FALSEkoala 15 110 2 15 0.99 TRUEotter 45 355 2 15 0.01 FALSEotter 15 69 1 24 0.70 FALSEkoala 15 69 1 24 0.70 FALSEkoala 15 337 2 27 0.83 TRUE
13
Not Explicitly Programmed
@iLoveRuby
• Multi-dimensional data is much harder to find rules
• Explicit program requires modification
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
@iLoveML ML System Model
14
Not Explicitly Programmed
• ML System easily re-trains on new data
question
answer
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
@iLoveML ML System Model
15
Terminology
input
prediction
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
datasource
training
“putting intoproduction”
16
Supervised Learning
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
features
instances
label
Labeled Data
17
Supervised Learning
animal state … proximity actiontiger hungry … close run
elephant happy … far take picture
Classification
animal state … proximity min_kmhtiger hungry … close 70hippo angry … far 10
Regression
label
animal state … proximity action1 action2tiger hungry … close run look untasty
elephant happy … far take picture call friends
Multi-Label Classification
18
Unsupervised Learning
date customer account auth class zip amount
Mon Bob 3421 pin clothes 46140 135
Tue Bob 3421 sign food 46140 401
Tue Alice 2456 pin food 12222 234
Wed Sally 6788 pin gas 26339 94
Wed Bob 3421 pin tech 21350 2459
Wed Bob 3421 pin gas 46140 83
The Sally 6788 sign food 26339 51
features
instances
Unlabeled Data
19
Unsupervised Learning
date customer account auth class zip amountMon Bob 3421 pin clothes 46140 135Tue Bob 3421 sign food 46140 401Tue Alice 2456 pin food 12222 234Wed Sally 6788 pin gas 26339 94Wed Bob 3421 pin tech 21350 2459Wed Bob 3421 pin gas 46140 83The Sally 6788 sign food 26339 51
Clustering
date customer account auth class zip amountMon Bob 3421 pin clothes 46140 135Tue Bob 3421 sign food 46140 401Tue Alice 2456 pin food 12222 234Wed Sally 6788 pin gas 26339 94Wed Bob 3421 pin tech 21350 2459Wed Bob 3421 pin gas 46140 83The Sally 6788 sign food 26339 51
Anomaly Detection
similar
unusual
20
Semi-Supervised Learning
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27
koala 15 315 1 19 0.37
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56
koala 15 318 2 22 0.45 TRUE
label
Labeled and Unlabeled Data
} infer
21
Data Types
numeric
1 2 3
1, 2.0, 3, -5.4 categoricaltrue, yes, red, mammal categoricalcategorical
A B C
DATE-TIME2013-09-25 10:02
DATE-TIME
YEAR
MONTH
DAY-OF-MONTH
YYYY-MM-DD
DAY-OF-WEEK
HOUR
MINUTE
YYYY-MM-DD
YYYY-MM-DD
M-T-W-T-F-S-D
HH:MM:SS
HH:MM:SS
2013
September
25
Wednesday
10
02
textBe not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon 'em.
text
“great”“afraid”“born”“some”
appears 2 timesappears 1 timeappears 1 timeappears 2 times
22
Text Analysis
Be not afraid of greatness: some are born great, some achieve greatness, and some have greatnessthrust upon 'em.
22
Text Analysis
Be not afraid of greatness: some are born great, some achieve greatness, and some have greatnessthrust upon 'em.
great: appears 4 times
23
Text Analysisgreat afraid born achieve
4 1 1 1
… … … …
Be not afraid of greatness: some are born great, some achieve greatness, and some have greatnessthrust upon ‘em.
Model
The token “great” does not occur
The token “afraid” occurs more than once
24
Topic Modeling
25
Brief History of ML
1950 1960 1970 1980 1990 2000 2010
PerceptronNeural
Networks
Ensembles
Support Vector Machines
Boosting
Inte
rpre
tabi
lity
Rosenblatt, 1957
Quinlan, 1979 (ID3),
Minsky, 1969
Vapnik, 1963 Corina & Vapnik, 1995
Schapire, 1989 (Boosting) Schapire, 1995 (Adaboost)
Breiman, 2001 (Random Forests)Breiman, 1994 (Bagging)
Deep LearningHinton, 2006Fukushima, 1989 (ANN)
Breiman, 1984 (CART)
2020
+
-
Decision Trees
26
Why ML Now?
• Decreasing cost of data
• Abundant computing power, especially cloud
• Machine Learning APIs
• Abundance of APIs + internet to combine easily
27
Composability
28
The Stages of a ML App
State the problem
Data Wrangling
Feature EngineeringLearning
Deploying
Predicting
Measuring Impact