introduction to machine learning€¦ · introductiontomachinelearning gillesgasso insa rouen - asi...

32
Introduction to Machine Learning Gilles Gasso INSA Rouen - ASI Departement LITIS Lab September 5, 2019 Gilles Gasso Introduction to Machine Learning 1 / 32

Upload: others

Post on 27-Jun-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Introduction to Machine Learning

Gilles Gasso

INSA Rouen - ASI DepartementLITIS Lab

September 5, 2019

Gilles Gasso Introduction to Machine Learning 1 / 32

Page 2: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Machine Learning: introduction

Machine Learning ≡ data-based programmingability of computers to learn how to perform tasks (classification,detection, translation. . . ) without being explicitly programmed

study of algorithms that improve their performance at some task basedon experience

Gilles Gasso Introduction to Machine Learning 2 / 32

Page 3: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

The rise

DataBig Data : continuous increase in data generated

Twitter : 50M tweets /day (=7 terabytes)Facebook : 10 terabytes /dayYoutube : 50h of uploaded videos /minute2.9 millions of e-mails /second

Computing PowerMoore’s lawMassively distributedcomputing

The value-adding processInterest: from product to customers.Data Mining ≡ discovering patterns inlarge data sets

Gilles Gasso Introduction to Machine Learning 3 / 32

Page 4: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Historical perspective

Today’s IA is Deep Learning (a technique of Machine Learning)

https://blog.alore.io/machine-learning-and-artificial-intelligence/

Gilles Gasso Introduction to Machine Learning 4 / 32

Page 5: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Applications: sentiment analysis

Classify stored reviews according to users’ sentiment

Score(x) > 0

Score(x) < 0

Gilles Gasso Introduction to Machine Learning 5 / 32

Page 6: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Product recommendation

Product features

User features

Matrix factorization Purchase history of customers

https://katbailey.github.io/images/matrix_factorization.png

Purchased item Recommended products

Gilles Gasso Introduction to Machine Learning 6 / 32

Page 7: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Image classification

Labeled training images Deep classification architecture

https://www.researchgate.net/profile/Y_Nikitin/publication/270163511/figure/download/fig5/AS:295194831409153@1447391340221/MPCNN-architecture-using-alternating-convolutional-and-max-pooling-layers-13.png

Input images and predicted category

https://adriancolyer.files.wordpress.com/2016/04/imagenet-fig4l.png?w=656

Gilles Gasso Introduction to Machine Learning 7 / 32

Page 8: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Medical diagnosis

Malignant melanoma detection130 000 images including over 2 000 cases of cancerError rate 28 % (human 34 %)

Digital Mammography DREAM Challenge640 000 mammographies (1209 participants)false-positive rate decreased by 5 %

Heart rhythm analysis500 000 ECGaccuracy 92.6 % (human 80.0 %) sensitivity of 97 %

Gilles Gasso Introduction to Machine Learning 8 / 32

Page 9: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Object detection

https://www.mdpi.com/applsci/applsci-09-01128/article_deploy/html/images/applsci-09-01128-g004.pn

https://arxiv.org/pdf/1506.02640.pdf

Gilles Gasso Introduction to Machine Learning 9 / 32

Page 10: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Audio captioning

https://ars.els-cdn.com/content/image/1-s2.0-S1574954115000151-gr1.jpg

Gilles Gasso Introduction to Machine Learning 10 / 32

Page 11: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Chatbot

https://miro.medium.com/max/2058/1*LF5T9fsr4w2EqyFJkb-gng.png

Gilles Gasso Introduction to Machine Learning 11 / 32

Page 12: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Implementing a Machine Learning project

Real World

IndustriesInsuranceInternet

Social NetworksMedical

. . .

DataGathering

SensorsTransactions

Web-clicks,logsMobility

Open dataDocs

. . .

DataStrength-ening

CleaningOrganizingAggregationManagement

of errors

. . .

Dataengineering

ExplorationRepresentation

DisplayMachine Learning

Data Mining

. . .

Business

InterpretationPattern ofinterests

Strategic decisionValuation

. . .

Gilles Gasso Introduction to Machine Learning 12 / 32

Page 13: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Chain of the data engineering process

DataPre-

processingTrain themodel

Evaluatethe

modelModel

1 Understand and specify projectgoals

2 Pre-processing/visualize/analyzedata

3 Which ML problem is it?4 Design a solving approach5 Evaluate its performance6 Go to to 2) if needed

1

1

1

1 1

1

1

1

1

1

1

Y

? 1

1

Course Goal: Study the steps from 2 to 5

Gilles Gasso Introduction to Machine Learning 13 / 32

Page 14: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

The data

Information (past experience) are examples with attributesAssume the data set consists of N samples

AttributesAn attribute is a property or characteristic of a phenomenon beingobserved. Also called it feature or variable

SampleIt is an entity characterising an object; it is made up of attributes.Synonyms : instance, point, vector (usually in Rd)

Gilles Gasso Introduction to Machine Learning 14 / 32

Page 15: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Data : visualizationhhhhhhhhhPoints x

Features citric acid residual sugar chlorides sulfur dioxide

1 0 1.9 0.076 112 0 2.6 0.098 253 0.04 2.3 0.092 15

Point x ∈ R4 0.56 1.9 0.075 175 0 1.9 0.076 116 0 1.8 0.075 137 0.06 1.6 0.069 158 0.02 2 0.073 99 0.36 2.8 0.071 1710 0.08 1.8 0.097 15

1.5 2 2.5 30.065

0.07

0.075

0.08

0.085

0.09

0.095

0.1

Variable 2 : Residual Sugar

Variable

3 : C

hlo

rides

Points

Moyenne des points

1.61.8

22.2

2.42.6

2.8

0.06

0.07

0.08

0.09

0.1

10

15

20

25

Variable 2 : Residual Sugar

Variable 3 : Chlorides

Variable

4 : S

ulfur

Gilles Gasso Introduction to Machine Learning 15 / 32

Page 16: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Data types

Sensors → Quantitative andqualitative variables,ordinales, nominals

Text → StringSpeech → Time SeriesImages → 2D DataVideos → 2D Data + time

Networks → GraphsStream → Logs, coupons. . .Labels → Expected output prediction

Gilles Gasso Introduction to Machine Learning 16 / 32

Page 17: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Approaches of Machine Learning

Gilles Gasso Introduction to Machine Learning 17 / 32

Page 18: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Supervised learning

Principle

Given a set of N training examples {(xi , yi ) ∈ X × Y, i = · · · ,N}, wewant to estimate a prediction function y = f (x).

The supervision comes from the label knowledge

ExamplesImage classification, object detection, stock price prediction . . .

Gilles Gasso Introduction to Machine Learning 18 / 32

Page 19: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Unsupervised learning

Principle

Only the {xi ∈ X , i = · · · ,N} are available. We aim to describe howdata is organized and extract homogeneous subsets from it.

ExamplesApplications: Customers segmentation, image segmentation, datavisualization, categorization of similar documents . . .

-60 -40 -20 0 20 40 60 80-80

-60

-40

-20

0

20

40

60

80

00

00

00 00

000

0

0

0 0

000

0

00

00

00

0

0

0

0 0

0

00

0 0

000

00000

0

00 00 00

00

0

00

0

0

000 00

000

00

0000

0

00 0

0

000

00

0000

0000

0

0

0000

0

00

00

1

111

1

1

1

11

11

111

11

11

1

11

1

11

1

11

11

11

1

1

1

1111

111

11111

111

1

1

11

1

111

11

11

1

1

1

1

111

1

11

1 11

11

11 1

1

1

1

1

1

1

1

11

11

1

1

1

11

1

1

1 1

1

2

22

2

2

2

22

2

2 22

2

2

2

2

222 2

222

2

2

2

2

22 2

2

222

2

22

2

2 2

2

2

22

2

2

22

2 2

222

2

222

2

2

2

2

2

2

222

22

2

2

2

2

2

2 2

2

2

2

2

2

2

22222

2

2

2 2

2

22222

2

2

22

3 3

3

3

3

3

33

3 33

33

3

3

33

333

3

333

3

33

3

3

33 3

3

3

33

3

3

3

3

3

33

3

3

3

3

3

33

3

3

3

3

333

3

3

33 33

3

3

3333

3

3

33

3

3

3

3

3

333

33

3

3

33

3

3

3

33 33

3

3

33

33

4

4

4

4

4

44

4

4

4

4

4 4444

4

4

4

4

4

4

4

444

4

44

44

4

4

4

4 444

44

4

44

4

44

4

4

4

4

4444

4

4

4

4

4

4

44

444 4

4

4

44

4

4

4

4

4

4

4

4

4

4

4

4

4

4

4

4

4

4

44

44

4

4

4

44 4

44

5

555555

5

5 55

5

5

5

55

5

5

5

5

5

5

5

5

5

55

55

5

55

5

5

5

5

5

5

5

5

5

5

5

5

5

555

5

5

5

5

5

55

5

55

5

55

5

5

55

555

5

5

5

55555

55

5

5

5

5

5

5

5

5

5

5

5

5

5

5

5

5

5

5

55

5

56

6

6

66

6 6666

666 6

6 666

6

66 66

6

6

6

6

66

6

66

6 6

6

66

66

6 66

6

6

6

66 6

6 66

6

6

6

66

6

6

6

66

66666666

6

6

66

66 6

66

6 66

6

6

6

6

6

6

66

66

6

6

6

66

6

66

6

77

77

777

7

7

7

777

7

7

7

7

7

77

7

7

7

77 7

7

7

7

777

7

7

7

77

7

77

7

77

7

7

7

7

7

7

7

7

7

7

7

7

7

7

7

7

777

7

7

77

7

77

7

7

7

7

77

77

7

77

7

77

7

7

7

7 77

7

77

7

7 7

77 7

77

88

8 888 888

8

88 8

88

88

8

8 8

8

8

88

88

8

8

8

8

8

888

8

888 8

8

88

8888 8

8

8

8

8

8

88 88

8

8

88888

888 88 8

8

8 88

8 888

888

8

88

8

8

88

8

8

8

8

8

8

8 8

88

888

999

9

99

9

9

9

9

9

99

99

9

9

9

9

9

9

99

9

99

9

99

9

9

9

9

9

9999

999

9

99999

9

99

9

9

9

9

99

9

99

999

9 99

9

99

99 9 9

9

9 99

9

9 9

9

9

9

9

9

999

9

9

9

9

9

9

9

99

99

9

9

Gilles Gasso Introduction to Machine Learning 19 / 32

Page 20: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Supervised learning : concept

Let X and Y be two sets. Assume p(X ,Y ) the joint probabilitydistribution of (X ,Y ) ∈ X × Y.

Goal : find a prediction function f : X → Y which correctly estimatesthe output y corresponding to x .

f belongs to a space H called hypothesis class. Example of H : setof polynomial functions

Gilles Gasso Introduction to Machine Learning 20 / 32

Page 21: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Supervised learning: principle

Loss function L(Y , f (X ))

evaluates how ”close” is the prediction f (x) to the true label y

it penalizes errors: L(y , f (x)) ={

0 if y = f (x)≥ 0 if y 6= f (x)

True risk (expected prediction error)

R(f ) = E(X ,Y )[L(Y , f (X ))] =

∫X×Y

L(y , f (x))p(x , y)dxdy

ObjectiveIdentify the prediction function which minimizes the true risk i.e.

f ? = arg minf ∈H

R(f )

Gilles Gasso Introduction to Machine Learning 21 / 32

Page 22: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Loss functions

Quadratic loss : L(Y , f (X )) = (Y − f (X ))2

`1 loss (absolute deviation): L(Y , f (X )) = |Y − f (X )|

0− 1 loss: L(y , f (x)) ={

0 if y = f (x)1 otherwise

Hinge loss: L(y , f (x)) = max(0, 1− yf (x))

Gilles Gasso Introduction to Machine Learning 22 / 32

Page 23: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Some supervised learning problems

RegressionWe talk about regression whenY is a subset of Rd .

Usual related loss function:quadratic loss (y − f (x))2

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5−1.5

−1

−0.5

0

0.5

1Support Vector Machine Regression

x

y

Gilles Gasso Introduction to Machine Learning 23 / 32

Page 24: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Some supervised learning problems

ClassificationOutput space Y is an un-ordered discrete setBinary classification: card(Y) = 2

Example: Y = {−1, 1}Loss functions: 0-1 loss, hinge loss

Multiclass classification: card(Y) > 2Example: Y = {1, 2, · · · ,K}

Gilles Gasso Introduction to Machine Learning 24 / 32

Page 25: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

From true risk to empirical risk

Minimizing the true risk is not duable (in allmost all practical applications)

The joint distribution p(X ,Y ) is unknown!

Only a finite training set {(xi , yi ) ∈ X × Y}Ni=1 is available

Empirical risk

Remp(f ) =1N

N∑i=1

L(yi , f (xi ))

Empirical risk minimization

f̂ = arg minf ∈H

Remp(f )

Gilles Gasso Introduction to Machine Learning 25 / 32

Page 26: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Overfitting

OverfittingEmpirical risk is not appropriate for model selection: if H is large enough,Remp(f )→ 0 but the generalized error (true risk) is high.

Err

eur

de

pre

dic

tio

n

Ensemble de Test

Ensemble d’apprentissage

Faible ElevéComplexité du modèle

Gilles Gasso Introduction to Machine Learning 26 / 32

Page 27: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Illustration of overfitting

Valeurs decroissantes de k0102030

Ris

que

empi

rique

0

0.1

0.2

0.3

AppValTest

https://www.cs.princeton.edu/courses/archive/spring16/cos495/slides/ML_basics_lecture6_overfitting.pdf

Gilles Gasso Introduction to Machine Learning 27 / 32

Page 28: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Model selection

Find in H the best function f that learned based on the training setwill well generalize (low true risk)

Example : We are looking for a polynomial function of degree αminimizing the risk : Remp(fα) =

∑Ni=1(yi − fα(xi ))

2.

Goal :1 propose a model estimation method in order to choose (approximately)

the best model belonging to H .2 once the model is selected, estimate its generalization error.

Gilles Gasso Introduction to Machine Learning 28 / 32

Page 29: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Model selection : basic approache

Case 1 : N is really big (large scale DN )

Validation Apprentissage {X app ,Y app }

Données disponibles

Test {X test ,Y test }{X val ,Y val}

1 Randomly split DN = Dtrain ∪ Dval ∪ Dtest

2 For each α, train fα based on Dtrain

3 Evaluating its performance on Dval Rval =1

Nval

∑i∈Dval

L(yi , f (xi ))

4 Select the model with the best performance on Dval

5 Test selected model on Dtest

NoteDtest is used once!

Gilles Gasso Introduction to Machine Learning 29 / 32

Page 30: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Model selection : Cross-validation

Case 2 : Small or medium scale DN

Estimate the generalization error by re-sampling.

Principle

1 Split DN into K sets of equal size.

2 For each k = 1, · · · ,K , train a model by using the K − 1 remainingsets and evaluate the model on the k-th part.

3 Average the K error estimates obtained to have the cross-validationerror.

TestApprentissageValidation

Validation TestApprentissage Apprentissage

Validation TestApprentissage

Gilles Gasso Introduction to Machine Learning 30 / 32

Page 31: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Conclusions

To successfully carry out an automatic data processing projectClearly identify and spell out the needs.

Create or obtain data representative of the problem

Identify the context of learning

Analyze and reduce data size

Choose an algorithm and/or a space of hypotheses

Choose a model by applying the algorithm to pre-processed data

Validate the performance of the method

Gilles Gasso Introduction to Machine Learning 31 / 32

Page 32: Introduction to Machine Learning€¦ · IntroductiontoMachineLearning GillesGasso INSA Rouen - ASI Departement LITIS Lab September5,2019 Gilles Gasso Introduction to Machine Learning

Au final ...

Find your way. . .

Gilles Gasso Introduction to Machine Learning 32 / 32