Machine Learning: Challenges and Applications
Olivier Bousquet, Pertinence
Outline
1 Motivation
2 How to do it?
3 Challenges
4 Applications
Spam Filtering
Example 1
Example 2
Spam Filtering Algorithm
List words that may indicate spam
Take into account possible variations
Count these words/variations in incoming messages
Choose a threshold above which the message is classified as spam
Constantly refine this algorithm as new spam gets ignored, or correct messages are rejected (update the list, change the thresholds...)
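A minimal sketch of such a hand-crafted filter; the word list, obfuscation rules, and threshold below are invented for illustration:

```python
# Hand-crafted spam filter: count suspicious words, compare the count to a
# threshold. Word list and threshold are illustrative only.
import re

SPAM_WORDS = {"viagra", "lottery", "winner", "free", "urgent"}
THRESHOLD = 2  # classify as spam if at least this many hits

def normalize(token: str) -> str:
    """Undo simple obfuscations such as 'v1agra' or 'fr-ee' (a variation rule)."""
    return token.lower().replace("1", "i").replace("0", "o").replace("-", "")

def is_spam(message: str) -> bool:
    tokens = re.findall(r"[A-Za-z0-9-]+", message)
    hits = sum(1 for t in tokens if normalize(t) in SPAM_WORDS)
    return hits >= THRESHOLD

print(is_spam("URGENT: you are a lottery winner!"))  # True
print(is_spam("Meeting at 10 tomorrow"))             # False
```

The "constant refinement" step amounts to editing `SPAM_WORDS`, `normalize`, and `THRESHOLD` by hand every time the filter fails, which is exactly the maintenance burden that motivates learning these rules from data.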
Manufacturing Process Control
High Pressure Die Casting
Inject liquid aluminium into a die, adjust temperature at various positions
Try to avoid bubbles
Approach
Build a simulation model
Use Navier-Stokes + phase transitions (solid/liquid) + air/liquid interface, complex geometry, heat transfer...
Requires huge computational resources (finite element methods)
No guarantee that the model is correct (what about impurities?)
Is there a better way?
Can we solve such problems in a simpler and more direct way?
Can we build systems that can solve a wide variety of such problems (without starting from scratch for each new similar problem)?
Idea: Use an empirical approach
Induction
Start from Data or Observations
Build a model which
Agrees with the data
Predicts unobserved data
Scientific Method: laws are induced from observations. A law is correct as long as no experiment contradicts it.
Ingredients 1
Data representation: choose a way to represent the objects to be classified
Example
Ingredients 2
Data representation: choose a way to represent the objects to be classified
Class of functions: choose a way to express the classification function
Example
Overfitting
Data is not always nicely separated
Overfitting
Underfitting
Overfitting
Overfitting
Overfitting
Reasonable model
Ingredients
Data representation: choose a way to represent the objects to be classified
Class of functions: choose a way to express the classification function
Preference: incorporate assumptions about the model regularity
Algorithm: set up the problem as an optimization problem
Formalization
Example: statistical formalization (not the only one)
Data: (X1, Y1), ..., (Xn, Yn)
Function class: F, with f : X → Y
Loss function: ℓ(f(x), y) = 1[f(x) ≠ y]
Goal: find a function f ∈ F such that Eℓ(f(X), Y) is as small as possible, and f is regular enough
Optimization problem:
    min_{f ∈ F} ∑_{i=1}^n ℓ(f(Xi), Yi) + λ‖f‖
Key question: convergence of Eℓ(f̂(X), Y) to min_{f ∈ F} Eℓ(f(X), Y)
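A toy instance of this optimization problem; note that the 0-1 loss above is replaced here by a squared-loss surrogate with an L2 penalty (choices made only for illustration, giving a closed-form solution):

```python
import numpy as np

# Regularized empirical risk minimization for a linear model f(x) = <w, x>,
# with squared loss as a tractable surrogate and an L2 penalty lam * ||w||^2.
# Closed form: w = (X^T X + lam * I)^{-1} X^T y.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])   # invented ground truth
y = X @ w_true + 0.1 * rng.normal(size=n)        # noisy observations

lam = 0.1
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

train_risk = float(np.mean((X @ w_hat - y) ** 2))
print(w_hat.round(2))
print(train_risk)
```

The penalty term λ‖f‖ appears here as `lam * np.eye(d)`: it trades a little empirical risk for stability of the solution, the "preference" ingredient above.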
High Dimensionality
Complexity of the data increases: requires bigger description vectors
Gathering data becomes easier: more and more descriptors available
This increase in the dimensionality comes at a price
Overfitting becomes the main issue (curse of dimensionality)
But at the same time, interesting phenomena occur (blessing)
Curses
NP-completeness: optimization problems can be exponential in the dimension (e.g. search for the separating hyperplane with minimum number of mistakes)
Statistical issue: exponentially slow convergence
Theorem
Let F be the set of all Lipschitz functions on [0, 1]^d. For any estimator f̂,
    sup_{f ∈ F} E(f̂(x) − f(x))² ≥ C n^{−2/(2+d)} as n → ∞
Blessings
Concentration-of-measure phenomenon
On the n-sphere, for the usual rotationally-invariant measure, most of the mass is concentrated near any (n−1)-sphere that is equatorial in it (analogous to a great circle on the Earth's surface, when n = 2). In other words, the 'poles' and their neighborhoods down to very small latitudes account for a tiny proportion of the 'area'.
Chernoff bounds: for i.i.d. bounded random variables
    P[ (1/n) ∑_{i=1}^n Xi − EX ≥ ε ] ≤ e^{−ncε²}
General concentration inequality: under appropriate conditions on f (e.g. Lipschitz)
    P[ f(X1, ..., Xn) − Ef ≥ ε ] ≤ e^{−ncε²}
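The exponential decay in the Chernoff bound is easy to observe empirically; a quick simulation (distribution and ε chosen arbitrarily for illustration):

```python
import numpy as np

# Empirical check of concentration: for i.i.d. bounded X_i (uniform on [0,1],
# so EX = 0.5), estimate P[(1/n) sum X_i - EX >= eps] for growing n.
rng = np.random.default_rng(0)
eps, trials = 0.05, 2000

freqs = []
for n in (10, 100, 1000):
    sample_means = rng.uniform(0, 1, size=(trials, n)).mean(axis=1)
    freqs.append(float(np.mean(sample_means - 0.5 >= eps)))

print(freqs)  # deviation frequency shrinks rapidly as n grows
```

The observed frequencies drop from a sizeable fraction at n = 10 to essentially zero at n = 1000, consistent with the e^{−ncε²} rate.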
Blessings
Random Projections
Lemma (Johnson & Lindenstrauss)
Given a set S of points in R^d, if we perform a random orthogonal projection of those points on a subspace of dimension m, then m = O(γ^{−2} log |S|) is sufficient so that with high probability all pairwise distances are preserved up to a factor 1 ± γ
→ cheap and easy way to reduce the dimension, while preserving the geometry
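A quick numerical illustration of the lemma; the sizes are picked arbitrarily, and a Gaussian random matrix scaled by 1/√m is used as a stand-in for a random orthogonal projection (in high dimension it behaves like one):

```python
import numpy as np

# Random projection from d = 1000 down to m = 300: check that all pairwise
# distances among 20 random points are roughly preserved.
rng = np.random.default_rng(0)
d, m, n_points = 1000, 300, 20
S = rng.normal(size=(n_points, d))

P = rng.normal(size=(d, m)) / np.sqrt(m)  # near-isometry on average
S_proj = S @ P

def pairwise(A):
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

orig, proj = pairwise(S), pairwise(S_proj)
mask = ~np.eye(n_points, dtype=bool)
ratios = proj[mask] / orig[mask]
print(float(ratios.min()), float(ratios.max()))  # both close to 1
```

All 190 pairwise distance ratios land within a few percent of 1, even though the dimension dropped by a factor of more than 3.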
Connections
Theoretical
Logic
Information Theory / Compression
Algorithmic Information Theory
Mathematical Statistics
High dimensional phenomena
Probability in Banach spaces
Random graphs, graph theory
Concentration
Algorithmics
Optimization
Approximate algorithms
Dealing with Dimensionality
    min_{f ∈ F} ∑_{i=1}^n ℓ(f(Xi), Yi) + λ‖f‖
Important questions
How to choose the norm?
What can be said about convergence?
How can this be implemented?
Choosing the norm
Typical approach: linear methods
f(x) = ∑_{i=1}^d αi xi
L2 norm: by duality everything can be expressed in terms of inner products ∑_{i=1}^d xi zi; this allows generalization to reproducing kernel Hilbert spaces
L1 norm: ensures sparsity; only a few αi will be non-zero; this allows dealing with very high-dimensional representations
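The sparsity effect of the L1 norm is visible even in the simplest setting, an orthonormal design, where the penalized problem decouples coordinate-wise and is solved by soft-thresholding (the example coefficients are invented):

```python
import numpy as np

# With an orthonormal design, min_a ||y - a||^2 / 2 + lam * ||a||_1 is solved
# coordinate-wise by soft-thresholding: small coefficients become exactly
# zero (sparsity), whereas an L2 penalty only shrinks them uniformly.
def soft_threshold(y, lam):
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

y = np.array([3.0, -0.2, 0.05, -1.5, 0.3])
alpha_l1 = soft_threshold(y, lam=0.5)
alpha_l2 = y / 1.5  # L2-penalized solution: pure shrinkage, no exact zeros

print(alpha_l1)  # three coefficients are exactly zero
print(alpha_l2)  # all five remain non-zero
```

This is why the L1 norm is the natural choice when only a few of very many candidate coordinates are expected to matter.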
L2: Kernels
Idea: replace inner products by positive definite kernels
Three reasons why this is interesting:
a good way to measure smoothness in high dimensions
allows dealing with complex objects
easy way to non-linearize
Also, interesting geometric properties (balls are effectively low-dimensional in L2(P))
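A sketch of the kernel idea with the Gaussian (RBF) kernel, a standard positive definite kernel (parameters here are arbitrary): since it equals an inner product in an implicit feature space, any inner-product-based algorithm can use it directly.

```python
import numpy as np

# Gaussian (RBF) kernel: k(x, z) = exp(-||x - z||^2 / (2 * sigma^2)).
# Positive definiteness means k(x, z) = <phi(x), phi(z)> for some implicit
# feature map phi, so the Gram matrix must have no negative eigenvalues.
def rbf_kernel(X, Z, sigma=1.0):
    sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
K = rbf_kernel(X, X)

eigvals = np.linalg.eigvalsh(K)          # check positive definiteness
print(bool(eigvals.min() >= -1e-8))      # True: K is a valid Gram matrix
```

Replacing every inner product in a linear algorithm by `rbf_kernel` evaluations is exactly the "easy way to non-linearize" mentioned above.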
L1: Boosting
Use sparsity to introduce many dimensions
Given a set H of interesting features, replace x by (h(x))_{h ∈ H}
Build a linear combination with small L1 norm
Effective algorithms (e.g. AdaBoost), geometry not significantly altered by replacing H by its convex hull
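A compact AdaBoost sketch where H is a set of 1-D threshold "stumps"; the dataset and the stump grid are invented for illustration:

```python
import numpy as np

# AdaBoost with decision stumps h(x) = s * sign(x - t): each round, pick the
# stump with smallest weighted error, add it with weight alpha, then
# re-weight the examples the current combination still gets wrong.
x = np.linspace(-1, 1, 40)
y = np.sign(np.sin(5 * x))   # labels in {-1, +1}; needs several stumps

stumps = [(t, s) for t in np.linspace(-1, 1, 50) for s in (1, -1)]
w = np.ones_like(x) / x.size  # example weights
F = np.zeros_like(x)          # running linear combination

for _ in range(60):
    errs = [(w * (s * np.sign(x - t) != y)).sum() for t, s in stumps]
    t, s = stumps[int(np.argmin(errs))]
    err = max(min(errs), 1e-12)
    alpha = 0.5 * np.log((1 - err) / err)
    h = s * np.sign(x - t)
    F += alpha * h
    w *= np.exp(-alpha * y * h)   # up-weight the mistakes
    w /= w.sum()

accuracy = float(np.mean(np.sign(F) == y))
print(accuracy)
```

Only a handful of the 100 candidate stumps receive non-negligible weight in `F`, illustrating the sparse-combination view of boosting.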
Convergence Properties
Concentration ensures
    sup_{f ∈ F} [ (1/n) ∑_{i=1}^n ℓ(f(Xi), Yi) − Eℓ(f(X), Y) ] ≈ E[ sup_{f ∈ F} (1/n) ∑_{i=1}^n ℓ(f(Xi), Yi) − Eℓ(f(X), Y) ]
Expectation term can be written as E‖∑i Xi‖
Key concept: geometry induced by F and the distribution; the "size" of F can be measured by metric entropy, Rademacher averages, majorizing measures, ...
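One of these size measures, the empirical Rademacher average, can be estimated by simulation; the two function classes below are invented for illustration:

```python
import numpy as np

# Empirical Rademacher average of a class F on a fixed sample x_1..x_n:
#   R_n(F) = E_sigma [ sup_{f in F} (1/n) sum_i sigma_i f(x_i) ],
# with sigma_i i.i.d. uniform in {-1, +1}. It measures how well the class
# can correlate with pure noise; richer classes score higher.
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(-1, 1, size=n)

# Tiny class: {f, -f} for the single function f(x) = sign(x).
F_small = np.vstack([np.sign(x), -np.sign(x)])
# Richer class: all threshold functions s * sign(x - t) on a grid.
grid = np.linspace(-1, 1, 50)
thresh = np.sign(x[None, :] - grid[:, None])
F_big = np.vstack([thresh, -thresh])

def rademacher(F_vals, trials=2000):
    sigma = rng.choice([-1.0, 1.0], size=(trials, F_vals.shape[1]))
    return float(np.mean(np.max(sigma @ F_vals.T, axis=1)) / F_vals.shape[1])

r_small, r_big = rademacher(F_small), rademacher(F_big)
print(r_small, r_big)  # the richer class correlates better with noise
```

The gap between the two estimates is exactly the price paid for taking the sup over a larger class, which is what the convergence guarantees have to control.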
Computational Tricks
Typically use convex optimization (but the number of variables can be huge)
Kernels: fast way of computing inner products
Boosting: few relevant features, fast selection of the most important feature
Random Projections: cheap and easy way to reduce dimensionality
To be discussed next
A wide variety of applications
Future directions
Recommendation Engines
News Categorization
News
Autonomous Vehicles
Other Applications
Bioinformatics
Recognition problems (speech, handwritten text, images...)
Data Mining
Search Engines
...
Machine Learning Can Make You Rich
Mathematical Finance is the scientific topic of choice for getting a good job
Machine Learning is not too bad either these days!
Recruiting companies: big shots of the internet betting heavily on this technology
Google / Yahoo! / Microsoft
Also some start-ups: Pertinence
Money Prizes
DARPA Grand Challenge 2005: autonomous vehicle, $2M (http://www.grandchallenge.org)
Netflix Prize 2006: movie ratings, $1M (http://www.netflixprize.com)
Hutter Prize: compression of web documents, $50K (http://prize.hutter1.net/)
Future Directions
Practically
Massive databases: require specific algorithms
More and more structure: e.g. machine translation
Multimedia: e.g. video streams
Theoretically
Better connection between approximation / estimation / computation
Better understanding of model selection (e.g. cross-validation)
Formalization and understanding of feature selection
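The cross-validation procedure mentioned for model selection can be sketched as follows. This is a generic illustration, not from the slides: split the data into k folds, train on k-1 of them, measure the error on the held-out fold, and average. The helper names (`k_fold_cv`, `model_fit`, `model_err`) and the toy least-squares model are assumptions for the example.

```python
import numpy as np

def k_fold_cv(model_fit, model_err, X, y, k=5, seed=0):
    """Estimate generalization error by k-fold cross-validation:
    train on k-1 folds, evaluate on the held-out fold, average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        params = model_fit(X[train], y[train])
        errors.append(model_err(params, X[test], y[test]))
    return float(np.mean(errors))

# Toy model: least-squares linear fit, evaluated by mean squared error.
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
err = lambda w, X, y: float(np.mean((X @ w - y) ** 2))

X = np.c_[np.ones(40), np.linspace(0, 1, 40)]  # intercept + one feature
y = 2 * X[:, 1] + 1 + 0.01 * np.random.default_rng(2).normal(size=40)
print(k_fold_cv(fit, err, X, y))  # small error on this nearly linear data
```

The averaged held-out error is what one would compare across candidate models to select among them; understanding when this estimate is reliable is exactly the theoretical question the slide raises.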