Slide 1
Support Vector Machines
Joachim Mathiesen, Niels Bohr Institute
Slide 2
(Over)simplified history
1960-1970s
Predominantly linear decision
boundaries/classifiers
1980s
Boom in neural networks and decision
trees
1990-2000s
Kernel machines/methods
outperformed neural networks
2010s
Revival of neural networks and boosted decision trees
Slide 3
Classification
Slide 4
Generalized Linear Model (Logistic Regression) 1st Order Terms
Slide 5
Generalized Linear Model (Logistic Regression) 4th Order Terms
Slide 6
Random Forest
Slide 7
Support Vector Machines
Slide 8
Support Vector Machines
Efficient separation of non-linear regions based on kernel methods – we only have to know the dot product between data points.
No problems with convergence and no trapping in local minima – “simple” quadratic optimization
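The "we only have to know the dot product" point can be sketched with scikit-learn's SVC, which accepts a precomputed Gram (dot-product) matrix instead of the raw features. The data set here is synthetic and purely illustrative:

```python
# Sketch: training an SVM purely from dot products between data points,
# via SVC(kernel="precomputed"). Data are made up for illustration.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Linear kernel expressed only through dot products between points.
gram = X @ X.T

clf = SVC(kernel="precomputed")
clf.fit(gram, y)

# Predicting new points also needs only their dot products with the
# training points, never the coordinates themselves.
X_new = rng.normal(size=(5, 2))
pred = clf.predict(X_new @ X.T)
print(pred)
```

Replacing the Gram matrix with any valid kernel matrix gives a non-linear machine at no extra cost.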
Slide 9
Classification SVM
Slide 10
Cover’s theorem
“A complex pattern-classification problem, cast in a high-dimensional space nonlinearly, is more likely to be linearly separable than in a low-dimensional space, provided that the space is not densely populated.”
— Cover, T.M., “Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition” (1965)
Slide 11
wikipedia.org
By mapping the points to the vertices of a simplex, it is apparent that every partition of the samples into two sets is separable by a linear separator.
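Cover's idea can be illustrated with the classic XOR problem: the four XOR-labelled points are not linearly separable in 2D, but appending the non-linear feature x₁·x₂ lifts them into 3D where a plane separates them. The code below is a small sketch using scikit-learn:

```python
# Illustration of Cover's theorem: XOR is not linearly separable in 2D,
# but becomes separable after a non-linear lift to 3D.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0, 1, 1, 0])  # XOR labels

# In the original 2D space a linear classifier cannot fit XOR perfectly.
low = SVC(kernel="linear", C=1e4).fit(X, y)
print(low.score(X, y))  # at most 0.75

# Lift non-linearly to 3D by appending the product feature x1*x2.
X3 = np.hstack([X, (X[:, 0] * X[:, 1])[:, None]])
high = SVC(kernel="linear", C=1e4).fit(X3, y)
print(high.score(X3, y))  # 1.0 -- linearly separable after the lift
```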
Slide 12
Basic SVM
The support vector machine finds an optimal separation of the points, whereas for a generic linear classifier y = a + bx there are infinitely many parameter pairs (a, b) that give a working decision boundary.
The SVM maximizes the margin between the two classes.
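A minimal sketch of the maximum-margin property, on a made-up separable toy set: the fitted line is determined by only a few points, the support vectors, and the margin width is 2/‖w‖:

```python
# Sketch: among the many separating lines, SVC(kernel="linear") picks the
# one maximizing the margin; the points on the margin are the support
# vectors. Toy data, made up for illustration.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1., 1.], [2., 1.], [1., 2.],   # class 0
              [4., 4.], [5., 4.], [4., 5.]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e3).fit(X, y)
print(clf.support_vectors_)        # only the points touching the margin
print(clf.coef_, clf.intercept_)   # the maximum-margin line w.x + b = 0

# Margin width = 2 / ||w||
w = clf.coef_[0]
print(2 / np.linalg.norm(w))
```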
Slide 13
Basic SVM Example: Radial Kernel
Beware of overfitting!
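The overfitting warning can be made concrete: with an RBF (radial) kernel, a very large gamma lets the model memorize the training set while generalizing poorly. A sketch on synthetic data (parameter values are illustrative only):

```python
# Sketch of overfitting with the radial (RBF) kernel: large gamma drives
# the training score toward 1 while the test score drops. Synthetic data.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

results = {}
for gamma in (0.5, 500.0):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_tr, y_tr)
    results[gamma] = (clf.score(X_tr, y_tr), clf.score(X_te, y_te))
    print(gamma, results[gamma])  # (train score, test score)
```

With gamma = 500 each training point gets its own tiny bump of influence: near-perfect training accuracy, poor test accuracy.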
Slide 14
SVM
For a data set $\{x_1, x_2, \ldots, x_N\}$ with target values $\{y_1, y_2, \ldots, y_N\}$, we aim to minimize

$$\frac{1}{2}\|w\|^2$$

subject to $y_i \left( w \cdot x_i + b \right) \ge 1$.

Not optimal if you have overlapping points belonging to different classes near the decision boundary.
Slide 15
Slack variables in SVM
In order not to be too sensitive to fuzziness close to the separation boundary, we introduce slack variables $\xi_i$ that allow for some misclassification. For a data set $\{x_1, x_2, \ldots, x_N\}$ with target values $\{y_1, y_2, \ldots, y_N\}$, we now aim to minimize

$$\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$$

subject to $y_i \left( w \cdot x_i + b \right) \ge 1 - \xi_i$ and $\xi_i \ge 0$.

The cost $C$ is the penalty paid for points that are not classified correctly.
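The role of the cost C can be sketched with scikit-learn on synthetic overlapping classes (all parameter values below are illustrative, not from the slides):

```python
# Sketch of the soft-margin cost C: small C tolerates margin violations
# (large total slack), so more points end up inside the margin and become
# support vectors; large C penalizes violations. Synthetic data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
               rng.normal(1.5, 1.0, (50, 2))])  # two overlapping classes
y = np.array([0] * 50 + [1] * 50)

counts = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    counts[C] = len(clf.support_)
print(counts)  # smaller C -> wider margin -> more support vectors
```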
Slide 16
ε-regression
Slide 17
ε-regression
Slide 18
ε-regression
The ε-insensitive loss function: predictions have to lie within a distance ε of the true value to contribute zero loss.
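The loss is easy to write down directly; a minimal numpy sketch (the value of ε here is arbitrary):

```python
# The epsilon-insensitive loss: errors smaller than epsilon cost nothing,
# larger errors grow linearly: max(|v - f(x)| - epsilon, 0).
import numpy as np

def eps_insensitive_loss(v_true, v_pred, eps=0.5):
    return np.maximum(np.abs(v_true - v_pred) - eps, 0.0)

errors = np.array([0.1, 0.4, 0.6, 2.0])
loss = eps_insensitive_loss(np.zeros(4), errors)
print(loss)  # approximately [0, 0, 0.1, 1.5]
```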
Slide 19
ε-regression
Regression then works similarly to classification with slack variables. For a data set $\{x_1, x_2, \ldots, x_N\}$ with target values $\{v_1, v_2, \ldots, v_N\}$, we now aim to minimize

$$\frac{1}{2}\|w\|^2 + C \sum_i \left( \xi_i + \xi_i^* \right)$$

subject to

$$v_i - w \cdot x_i - b \le \epsilon + \xi_i, \qquad -v_i + w \cdot x_i + b \le \epsilon + \xi_i^*, \qquad \xi_i \ge 0,\ \xi_i^* \ge 0.$$
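In practice this formulation is exposed by scikit-learn's SVR through the C and epsilon parameters. A sketch on a noisy made-up 1D function (parameter values are illustrative):

```python
# Sketch of epsilon-regression with scikit-learn's SVR. Points predicted
# within epsilon of the target contribute zero loss and do not become
# support vectors. Synthetic 1D data.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = np.sort(rng.uniform(0, 5, 80))[:, None]
v = np.sin(X).ravel() + rng.normal(0, 0.1, 80)

reg = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, v)
pred = reg.predict(X)
print(len(reg.support_), "support vectors out of", len(X))
```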
Slide 20
SVMs are great general-purpose models … but with limited possibility for model interpretation
Slide 21
Main advantages of SVM:
Efficient separation of non-linear regions (based only on dot products between data points).
The model follows from a quadratic optimization problem, i.e. there is no risk of ending up in a local minimum – in contrast to, for example, neural networks.
Slide 23
Data science competitions: a crowdsourcing platform for companies to pose problems and offer prizes for the best predictive model on uploaded data.
Slide 24
https://www.kaggle.com/c/house-prices-advanced-regression-techniques
Slide 25
SVM as a model for house prices in areas with
1000 < Zip Code < 2500
Basic estimate of the square-meter price as a function of coordinates.
Slide 26
Radial Kernel
Slide 27
Polynomial Kernel
Slide 28
Changing gamma in the radial kernel changes performance.
Slide 30
Lab exercise:
• Build a model for house/apartment prices at Østerbro.
• Filter the data:
  I. Zip code = 2100
  II. Remove entries with a sales price lower than the taxation value.
  III. Only use entries with a square-meter price in the range 10 kkr to 100 kkr.
  IV. Remove entries without UTM coordinates.
• Split the data into training and validation sets.
• Estimate the error.
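The exercise pipeline above can be sketched end to end. The DataFrame and its column names (zip_code, sales_price, taxation_value, sqm_price, utm_x, utm_y) are hypothetical placeholders, not the actual course data file; adapt them to the real data:

```python
# Hedged sketch of the lab-exercise pipeline: filter steps I-IV, a
# train/validation split, an SVR fit, and an error estimate. All column
# names and parameter values are assumptions for illustration.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

def run_lab(df: pd.DataFrame) -> float:
    # Filter steps I-IV from the exercise.
    df = df[df["zip_code"] == 2100]
    df = df[df["sales_price"] >= df["taxation_value"]]
    df = df[df["sqm_price"].between(10_000, 100_000)]
    df = df.dropna(subset=["utm_x", "utm_y"])

    X = df[["utm_x", "utm_y"]].to_numpy()
    v = df["sqm_price"].to_numpy()

    # Split into training and validation sets, fit, estimate the error.
    X_tr, X_va, v_tr, v_va = train_test_split(X, v, random_state=0)
    model = SVR(kernel="rbf", C=100.0, epsilon=0.1).fit(X_tr, v_tr)
    rmse = float(np.sqrt(np.mean((model.predict(X_va) - v_va) ** 2)))
    return rmse
```

Root-mean-square error on the held-out validation set is one reasonable error estimate; cross-validation would be a natural refinement.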