applied machine learning...

APPLIED MACHINE LEARNING

1

Applied Machine Learning

Introduction


Practicalities

Contact Information of the Instructors


Practicalities

Slides and exercises will be posted on the website of the class the day before class: http://lasa.epfl.ch/teaching/lectures/ML_Msc/index.php http://lasa.epfl.ch/ Teaching Lectures Applied Machine Learning

Solutions to the exercises will be posted a week after the exercise session.

http://lasa.epfl.ch/teaching/lectures/ML_Msc/index.php


5

Class Format

• Lectures: 9h15-11h00

• Exercises (In class): 11h15-12h00

• Lectures alternates with practice sessions held in

INN 218, see class schedule.

• Attendance to the practice sessions is compulsory.

• Attendance to exercise sessions is highly recommended….


6

TP / Practicals:

Group formation https://epfl.doodle.com/poll/ggwdf99a3nwk3ad8

(the link will be sent to you by email)

The first practical starts on week 3!

https://epfl.doodle.com/poll/ggwdf99a3nwk3ad8


7

Class Syllabus

http://lasa.epfl.ch/teaching/lectures/ML_Msc/index.php


8

Grading Scheme

Practicals (25% of the grade) Performed in team of 2 or 3 people 2 reports – due October 16 and November 20 1 oral presentation – December 11 Written Exam (75% of the grade) 3 hours long Closed book Allowed 1 A4 pages with handwritten notes


9

Pre-requisites

Linear Algebra Vector / Matrix notation Eigenvalue decomposition Linear dependency

Probability / Statistics Probability Distribution Function Covariance, Expectation Joint, conditional probability Correlation / Statistical Independence

Optimization Global versus local optima Gradient descent Method of Lagrange Multipliers

Brief recap of main algorithms in class


10

- To understand the basics of some key algorithms of Machine

Learning

- To apply some of these algorithms with real data and, by so doing, to understand the limitations of the algorithm for real-time systems

- To raise in you enough interest for the field, so that you will later try to learn more about it (advanced class at the doctoral school, search on-line, …)

- To have more engineers apply these techniques for robust control, signal processing, prediction, learning, etc.

Class Objectives


11

Main learning outcomes: By the end of the course, the student must be able to: • Choose an appropriate ML method • Assess / Evaluate an appropriate ML method • Apply an appropriate ML method Transversal skills • Write a scientific or technical report. • Make an oral presentation.

Learning Outcomes


12

Today’s class format

• Taxonomy and basic concepts in ML

• Examples of ML applications

• Introduction to best practice in ML


13

What is Machine Learning?

Machine Learning encompasses a large set of algorithms that aim at inferring information from what is hidden.

This process is often referred to as Data Mining.


14


Machine Learning encompasses a large set of algorithms that aim at inferring information from what is hidden.

A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, "On separation of semitransparent dynamic images from static background", Proc. Intl. Conf. on Independent Component Analysis and Blind Signal Separation, pp. 934-940, 2006.

Independent Component Analysis (ICA) can decompose mixture of signals


15


Recognizing human speech. Here this is the wave produced when uttering the word “allright”.

The strength of ML algorithms is that they can apply to arbitrary data. They can recognize patterns from various sources of data.

Modeling time series: Hidden Markov Models be used to recognize complex sounds, including human speech.


16


Piano note (C5 – do) Same note played by an oboe (hautbois)

Classification: Two patterns that are different should still be grouped in the same class


17


3 sample of 2-dimensional trajectories of a robot’s hand along time.

Same trajectories displayed in space(without time).

ML algorithms are often used to find the representation in which patterns that originally looked different, become similar.


18


3 sample of 2-dimensional trajectories of a robot’s hand along time.

Same trajectories displayed in space(without time).

ML algorithms are often used to find the representation in which patterns that originally looked different, become similar.

Principal Component Analysis can discover this projection First algorithm we will see in class


19

What is Machine Learning? Helps compute automatically information that would take days to do by hand.

Noris, B., Nadel, J, Barker, M., Hadjikhani, N. and Billard, A. (2012) Investigating gaze of children with ASD in naturalistic settings. PLOS ONE.

The mapping can be done through support vector regression An algorithm we will see in class


20

• 1940 – The Perceptron (Pitts & MacCulloch) !!

• 1960 – The Perceptron (Minsky & Papert) !!

• 1960 – “Bellman Curse of Dimensionality”

• 1980 – Bounds on statistical estimators (C. Stone)

• 1990 – Beginning of high dimensional data (Hundreds variables)

• 2000 – High dimensional data (Thousands variables)

Machine Learning, History


21

The history traces back to the very first step in Artificial

Intelligence (AI). Machine Learning is a new way of thinking in AI that

builds strongly on statistics: - probabilistic approach to modeling decision function - modeling of uncertainty in input, output and in the

parameters of the model! - data-driven approach as opposed to “knowledge-

driven”



22

1980 – The First Machine Learning Workshop was held at Carnegie-Mellon University in Pittsburgh.

1980 – Three consecutive issues of the International Journal of Policy Analysis and Information Systems were specially devoted to machine learning.

1981 - Hinton, Jordan, Sejnowski, Rumelhart, McLeland at UCSD Back Propagation alg. PDP Book

1986 – The establishment of the Machine Learning journal. 1987 – The beginning of annual international conferences on machine learning

(ICML). Snowbird ML conference 1988 – The beginning of regular workshops on computational learning theory

(COLT). 1990’s – Explosive growth in the field of data mining, which involves the

application of machine learning techniques.



23

Why and when do we need learning in Robotics?


24

Peg and Hole Problem

A typical problem of Robotics


25


Peg and Hole Problem


26

A: Engineer the environment



27


B: Engineer the body



28



C: Engineer the controller

Systematic search

Adaptive control

Learning Machine!



29

Machine Learning: definitions

Machine Learning is the field of scientific study that concentrates on induction algorithms and on other algorithms that can be said to ``learn.'' Machine Learning Journal, Kluwer Academic

Machine Learning is an area of artificial intelligence involving developing techniques to allow computers to “learn”. More specifically, machine learning is a method for creating computer programs by the analysis of data sets, rather than the intuition of engineers. Machine learning overlaps heavily with statistics, since both fields study the analysis of data. Webster Dictionary

Machine learning is a branch of statistics and computer science, which studies algorithms and architectures that learn from data sets. WordIQ


30

To engineer the environment is not always desirable

Rather, it is desirable to have a system that is

Adaptable to different environments

That can generalize across tasks

Kronander, Burdet and Billard, Learning PegI n Hole Insertionf rom Human Demonstrations, 2013


31

To engineer the environment is not always desirable

Rather, it is desirable to have a system that is

adaptable to different environments

can generalize across tasks

Machines that learn


34

Problem I



C: Engineer the controller

ROWS 4-6

Make an autonomous robot car that can drive people from

Lausanne to Geneva

ROWS 1-3

Design an autonomous robot that distributes graded assignments

to a class of students


35

Generalization versus memorization

An important feature of a learning system

that differentiates it from a pure “memory” is its ability to generalize.

Key features for a good learning system


36

Extracting features and generalizing

Calinon, D’halluin & Billard, Handling of multiple constraints and motion alternatives in a robot programming by demonstration framework, Humanoids’09.


37

Learning versus Memorization

Learning implies generalizing. Generalizing consists of extracting key features from the data, matching

those across data (to find resemblances) and storing a generalized representation of the data features that accounts best (according to a given metric) for all the small differences across data.

Generalizing is the opposite of memorizing and often one might want to

find a tradeoff between over-generalizing, hence losing information on the data, and over fitting, i.e. keeping more information than required.

Generalization is particularly important in order to reduce the influence

of noise, introduced in the variability of the data.


38

Imagine that you have been a witness to a robbery.

How would you describe the thief to the police?

Problem II


39

Feature extraction consists of extracting only a subset, the

key components, of a situation, image, scene, conversation, etc.

Dimensionality reduction The key components depends on the context. Feature extraction presupposes the existence of a

mechanism and of background knowledge to restore the complete situation, image, scene, conversation.

Features extraction


40

Features extraction Bio-Inspiration

Extracting the relevant information Preattentive processing of orientation

Sensitive to asymmetries

Nothdurft, H.-C. (1991a). Ophthalmology and Visual Science, 32(4), 714.


41


Our eyes pick up rapidly what stands out: asymmetries or uncoordinated pattern of motion


42


A Model of Saliency-Based Visual Attention for Rapid Scene Analysis. Itti et al 2008


44

Features extraction Bio-Inspired IT Systems

Multimodal saliency-based bottom-up attention a framework for the humanoid robot iCub , Ruesch et al, ICRA 2008

http://www.youtube.com/watch?v=AWZxvE5g5Pw&feature=player_detailpage


45

Problem III

You must solve one of the two problems below.

You are provided with the following data:

Age, gender, nationality, body weight, hair color, favorite hobbies, weekly timetable of all EPFL students and EPFL professors

Which data do you remove from the input to your system?

ROWS 4-6

Predict number of people in the metro between 8h00 and 8h15

ROWS 1-3

Predict number of meals sold by each EPFL cafeteria at lunch


46

Machine learning algorithms are often classified according to the type of computation they can perform on a given dataset. Common types of computation include: • Classification: learn to put instances into pre-defined classes

• Association: learn relationships between the attributes

• Clustering: discover classes of instances that belong together

• Time series prediction: learn to predict a numeric quantity instead of a class

Other terms: On-line Learning, Connectionist Models

Taxonomy in ML


47

• Supervised learning – where the algorithm learns a function or model that maps a set of inputs to a set of desired outputs.

• Unsupervised learning – where the algorithm learns a model that represents a set of inputs without any feedback (no desired output, no external reinforcement).

• Reinforcement learning – where the algorithm learns a mechanism that generates a set of outputs from one input in order to maximize a reward value (external and delayed feedback).

• Learning to learn – where the algorithm learns its own inductive bias based on previous experiences

Taxonomy in ML


48

• Supervised learning relates to a vast group of methods by which one estimates a model from a set of examples,

The system is given the desired output.

• When these examples are provided by a human expert,

this is referred to robot learning from demonstration; robot programming by demonstration.

Supervised learning


( ){ }( )

2Goal: estimate function of state evolution , ,

Make observations of the state of the system , , 1... .

Estimate through mean-square error: min

i i

i i

i

x f x x x

N x x i N

f f x x

= ∈

=

−∑

1x

2x x*: target

Supervised learning: example


Supervised learning

Khansari Zadeh, S. M., Kronander, K. and Billard, A. (2012) Learning to Play Minigolf. Advanced Robotics.


51 Noris, B., Nadel, J, Barker, M., Hadjikhani, N. and Billard, A. (2012) Investigating gaze of children with ASD in naturalistic settings. PLOS ONE.

Supervised learning

Where do the eyes look?

Map image of the eyes to point in the camera image


52

What is Machine Learning? What is sometimes impossible to see for humans is easy for ML to pick.

Noris, B., Keller, J-B. and Billard, A. (2011) A Wearable Gaze Tracking System for Children in Unconstrained Environments. Computer Vision and Image Understanding.

Exploit information not only on the pupil, cornea, but also on wrinkles, eyelids and eyelashed pattern to infer gaze direction.

Support Vector Regression can be used to learn this mapping


53


Noris, B., Keller, J-B. and Billard, A. (2011) A Wearable Gaze Tracking System for Children in Unconstrained Environments. Computer Vision and Image Understanding.

Support Vector Regression can be used to learn this mapping

20 20 , 1...50i xx i∈ =

Input: 50 images of the eyes, In grey color 20x20 pixels

Output: 50 images of the scene, In grey color 240x320 pixels

( )y f x= 240 320 , 1...50i xy i∈ =Learn a function f:


54

Unsupervised Learning

Unsupervised learning refers to a variety of methods by which a pair of signals y and x are associated but there is no explicit labeling as to which y should be associated to which x. This is often done through association, i.e. through associative learning.


55

Associative Learning

Let us play a little game.

When I say “satelite”, what do you think of?

When I say “EPFL”, what do you think of?

When I say “exam”, what do you think of?


56

Associative Learning

Associative learning is multi-modal: Visual, auditory, proprioceptive, etc. It can represent - A concatenation of features Satelite Comics, Beer - Events bearing a close temporality EPFL TSOL, Classes - A cause-to-effect relationship Exam Grade, Vacation…


57

Associative Learning for Learning Word-Objects Relations

H. Kozima, H. Yano, A robot that learns to communicate with human caregivers, in: Proceedings of the International Workshop on Epigenetic Robotics, 2001


58

Reinforcement Learning (RL)

RL tries to infer the optimal path to the goal,

through a process of Trial-and-error, so as to maximize the reward.

Reinforcement learning is a tedious learning

method. It is slow and is functional only in well-defined

problems with small search space.


59

Reinforcement Learning Robotics Applications

Atkeson & Schaal, Learning how to swing an inverted pendulum, ICRA 1997.

The robot tracks the position of the two colored ball. Using a model of the inverse pendulum dynamics, it learns which joint angle displacement to produce to ensures that the ball remains in equilibrium.


60

Reinforcement Learning Robotics Applications

Kober, J.; Peters, J. (2011). Policy Search for Motor Primitives in Robotics, Machine Learning, 84, 1-2, pp.171-203.

Learning the “ball in a cup” task; A couple of successful examples are provided by a human; the robot then performs guided trials so as to optimize the reward (putting the ball in the cup).

http://www.youtube.com/watch?v=Fhb26WdqVuE


62

Problem IV

Supervised learning, unsupervised learning or reinforcement learning?

1: Learning to ride a bicycle 2: Predicting the length of the queue at the cafeteria 11h30-12h30 3: Relating demographics (age/weight/nationality) with chocolate consumption 4: Estimating monthly returns of stocks in the financial market


63

Summary

Machine Learning encompasses a large area of works which cannot all be covered here. We will focus on a subset of algorithms that form the foundation of most current advances in machine learning. We however omit several topics, which are covered in other courses on machine learning at EPFL.


64

Some Machine Learning Resources

On-line resources: • http://www.machinelearning.org/index.html • http://www.pascal-network.org/ Network of excellence on Pattern Recognition, Statistical Modelling and Computational Learning (summer schools and workshops) Journals: • Machine Learning Journal, Kluwer Publisher • IEEE Transactions on Signal processing • IEEE Transactions on Pattern Analysis • IEEE Transactions on Pattern Recognition • The Journal of Machine Learning Research

Conferences: • ICML: int. conf. on machine learning • Neural Information Processing Conference – on-line repository of all research papers, www.nips.org


65

LASA


66

Software for the class

�Applied Machine Learning��Introduction Slide Number 3Slide Number 4Slide Number 5Slide Number 6Slide Number 7Slide Number 8Slide Number 9Slide Number 10Slide Number 11Slide Number 12Slide Number 13Slide Number 14Slide Number 15Slide Number 16Slide Number 17Slide Number 18Slide Number 19Slide Number 20Slide Number 21Slide Number 22Slide Number 23Slide Number 24Slide Number 25Slide Number 26Slide Number 27Slide Number 28Slide Number 29Slide Number 30Slide Number 31Slide Number 34Slide Number 35Slide Number 36Slide Number 37Slide Number 38Slide Number 39Slide Number 40Slide Number 41Slide Number 42Slide Number 44Slide Number 45Slide Number 46Slide Number 47Slide Number 48Slide Number 49Slide Number 50Slide Number 51Slide Number 52Slide Number 53Slide Number 54Slide Number 55Slide Number 56Slide Number 57Slide Number 58Slide Number 59Slide Number 60Slide Number 62Slide Number 63Slide Number 64Slide Number 65Slide Number 66

applied machine learning...

Documents