applied machine learning...

61
APPLIED MACHINE LEARNING 1 Applied Machine Learning Introduction

Upload: others

Post on 22-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

  • APPLIED MACHINE LEARNING

    1

    Applied Machine Learning

    Introduction

  • APPLIED MACHINE LEARNING

    Practicalities

    Contact Information of the Instructors

  • APPLIED MACHINE LEARNING

    Practicalities

    Slides and exercises will be posted on the website of the class the day before class: http://lasa.epfl.ch/teaching/lectures/ML_Msc/index.php http://lasa.epfl.ch/ Teaching Lectures Applied Machine Learning

    Solutions to the exercises will be posted a week after the exercise session.

    http://lasa.epfl.ch/teaching/lectures/ML_Msc/index.php

  • APPLIED MACHINE LEARNING

    5

    Class Format

    • Lectures: 9h15-11h00

    • Exercises (In class): 11h15-12h00

    • Lectures alternates with practice sessions held in

    INN 218, see class schedule.

    • Attendance to the practice sessions is compulsory.

    • Attendance to exercise sessions is highly recommended….

  • APPLIED MACHINE LEARNING

    6

    TP / Practicals:

    Group formation https://epfl.doodle.com/poll/ggwdf99a3nwk3ad8

    (the link will be sent to you by email)

    The first practical starts on week 3!

    https://epfl.doodle.com/poll/ggwdf99a3nwk3ad8

  • APPLIED MACHINE LEARNING

    7

    Class Syllabus

    http://lasa.epfl.ch/teaching/lectures/ML_Msc/index.php

  • APPLIED MACHINE LEARNING

    8

    Grading Scheme

    Practicals (25% of the grade) Performed in team of 2 or 3 people 2 reports – due October 16 and November 20 1 oral presentation – December 11 Written Exam (75% of the grade) 3 hours long Closed book Allowed 1 A4 pages with handwritten notes

  • APPLIED MACHINE LEARNING

    9

    Pre-requisites

    Linear Algebra Vector / Matrix notation Eigenvalue decomposition Linear dependency

    Probability / Statistics Probability Distribution Function Covariance, Expectation Joint, conditional probability Correlation / Statistical Independence

    Optimization Global versus local optima Gradient descent Method of Lagrange Multipliers

    Brief recap of main algorithms in class

  • APPLIED MACHINE LEARNING

    10

    - To understand the basics of some key algorithms of Machine

    Learning

    - To apply some of these algorithms with real data and, by so doing, to understand the limitations of the algorithm for real-time systems

    - To raise in you enough interest for the field, so that you will later try to learn more about it (advanced class at the doctoral school, search on-line, …)

    - To have more engineers apply these techniques for robust control, signal processing, prediction, learning, etc.

    Class Objectives

  • APPLIED MACHINE LEARNING

    11

    Main learning outcomes: By the end of the course, the student must be able to: • Choose an appropriate ML method • Assess / Evaluate an appropriate ML method • Apply an appropriate ML method Transversal skills • Write a scientific or technical report. • Make an oral presentation.

    Learning Outcomes

  • APPLIED MACHINE LEARNING

    12

    Today’s class format

    • Taxonomy and basic concepts in ML

    • Examples of ML applications

    • Introduction to best practice in ML

  • APPLIED MACHINE LEARNING

    13

    What is Machine Learning?

    Machine Learning encompasses a large set of algorithms that aim at inferring information from what is hidden.

    This process is often referred to as Data Mining.

  • APPLIED MACHINE LEARNING

    14

    What is Machine Learning?

    Machine Learning encompasses a large set of algorithms that aim at inferring information from what is hidden.

    A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, "On separation of semitransparent dynamic images from static background", Proc. Intl. Conf. on Independent Component Analysis and Blind Signal Separation, pp. 934-940, 2006.

    Independent Component Analysis (ICA) can decompose mixture of signals

  • APPLIED MACHINE LEARNING

    15

    What is Machine Learning?

    Recognizing human speech. Here this is the wave produced when uttering the word “allright”.

    The strength of ML algorithms is that they can apply to arbitrary data. They can recognize patterns from various sources of data.

    Modeling time series: Hidden Markov Models be used to recognize complex sounds, including human speech.

  • APPLIED MACHINE LEARNING

    16

    What is Machine Learning?

    Piano note (C5 – do) Same note played by an oboe (hautbois)

    Classification: Two patterns that are different should still be grouped in the same class

  • APPLIED MACHINE LEARNING

    17

    What is Machine Learning?

    3 sample of 2-dimensional trajectories of a robot’s hand along time.

    Same trajectories displayed in space(without time).

    ML algorithms are often used to find the representation in which patterns that originally looked different, become similar.

  • APPLIED MACHINE LEARNING

    18

    What is Machine Learning?

    3 sample of 2-dimensional trajectories of a robot’s hand along time.

    Same trajectories displayed in space(without time).

    ML algorithms are often used to find the representation in which patterns that originally looked different, become similar.

    Principal Component Analysis can discover this projection First algorithm we will see in class

  • APPLIED MACHINE LEARNING

    19

    What is Machine Learning? Helps compute automatically information that would take days to do by hand.

    Noris, B., Nadel, J, Barker, M., Hadjikhani, N. and Billard, A. (2012) Investigating gaze of children with ASD in naturalistic settings. PLOS ONE.

    The mapping can be done through support vector regression An algorithm we will see in class

  • APPLIED MACHINE LEARNING

    20

    • 1940 – The Perceptron (Pitts & MacCulloch) !!

    • 1960 – The Perceptron (Minsky & Papert) !!

    • 1960 – “Bellman Curse of Dimensionality”

    • 1980 – Bounds on statistical estimators (C. Stone)

    • 1990 – Beginning of high dimensional data (Hundreds variables)

    • 2000 – High dimensional data (Thousands variables)

    Machine Learning, History

  • APPLIED MACHINE LEARNING

    21

    The history traces back to the very first step in Artificial

    Intelligence (AI). Machine Learning is a new way of thinking in AI that

    builds strongly on statistics: - probabilistic approach to modeling decision function - modeling of uncertainty in input, output and in the

    parameters of the model! - data-driven approach as opposed to “knowledge-

    driven”

    Machine Learning, History

  • APPLIED MACHINE LEARNING

    22

    1980 – The First Machine Learning Workshop was held at Carnegie-Mellon University in Pittsburgh.

    1980 – Three consecutive issues of the International Journal of Policy Analysis and Information Systems were specially devoted to machine learning.

    1981 - Hinton, Jordan, Sejnowski, Rumelhart, McLeland at UCSD Back Propagation alg. PDP Book

    1986 – The establishment of the Machine Learning journal. 1987 – The beginning of annual international conferences on machine learning

    (ICML). Snowbird ML conference 1988 – The beginning of regular workshops on computational learning theory

    (COLT). 1990’s – Explosive growth in the field of data mining, which involves the

    application of machine learning techniques.

    Machine Learning, History

  • APPLIED MACHINE LEARNING

    23

    Why and when do we need learning in Robotics?

  • APPLIED MACHINE LEARNING

    24

    Peg and Hole Problem

    A typical problem of Robotics

  • APPLIED MACHINE LEARNING

    25

    A typical problem of Robotics

    Peg and Hole Problem

  • APPLIED MACHINE LEARNING

    26

    A: Engineer the environment

    A typical problem of Robotics

  • APPLIED MACHINE LEARNING

    27

    A: Engineer the environment

    B: Engineer the body

    A typical problem of Robotics

  • APPLIED MACHINE LEARNING

    28

    A: Engineer the environment

    B: Engineer the body

    C: Engineer the controller

    Systematic search

    Adaptive control

    Learning Machine!

    A typical problem of Robotics

  • APPLIED MACHINE LEARNING

    29

    Machine Learning: definitions

    Machine Learning is the field of scientific study that concentrates on induction algorithms and on other algorithms that can be said to ``learn.'' Machine Learning Journal, Kluwer Academic

    Machine Learning is an area of artificial intelligence involving developing techniques to allow computers to “learn”. More specifically, machine learning is a method for creating computer programs by the analysis of data sets, rather than the intuition of engineers. Machine learning overlaps heavily with statistics, since both fields study the analysis of data. Webster Dictionary

    Machine learning is a branch of statistics and computer science, which studies algorithms and architectures that learn from data sets. WordIQ

  • APPLIED MACHINE LEARNING

    30

    To engineer the environment is not always desirable

    Rather, it is desirable to have a system that is

    Adaptable to different environments

    That can generalize across tasks

    Kronander, Burdet and Billard, Learning PegI n Hole Insertionf rom Human Demonstrations, 2013

  • APPLIED MACHINE LEARNING

    31

    To engineer the environment is not always desirable

    Rather, it is desirable to have a system that is

    adaptable to different environments

    can generalize across tasks

    Machines that learn

  • APPLIED MACHINE LEARNING

    34

    Problem I

    A: Engineer the environment

    B: Engineer the body

    C: Engineer the controller

    ROWS 4-6

    Make an autonomous robot car that can drive people from

    Lausanne to Geneva

    ROWS 1-3

    Design an autonomous robot that distributes graded assignments

    to a class of students

  • APPLIED MACHINE LEARNING

    35

    Generalization versus memorization

    An important feature of a learning system

    that differentiates it from a pure “memory” is its ability to generalize.

    Key features for a good learning system

  • APPLIED MACHINE LEARNING

    36

    Extracting features and generalizing

    Calinon, D’halluin & Billard, Handling of multiple constraints and motion alternatives in a robot programming by demonstration framework, Humanoids’09.

  • APPLIED MACHINE LEARNING

    37

    Learning versus Memorization

    Learning implies generalizing. Generalizing consists of extracting key features from the data, matching

    those across data (to find resemblances) and storing a generalized representation of the data features that accounts best (according to a given metric) for all the small differences across data.

    Generalizing is the opposite of memorizing and often one might want to

    find a tradeoff between over-generalizing, hence losing information on the data, and over fitting, i.e. keeping more information than required.

    Generalization is particularly important in order to reduce the influence

    of noise, introduced in the variability of the data.

  • APPLIED MACHINE LEARNING

    38

    Imagine that you have been a witness to a robbery.

    How would you describe the thief to the police?

    Problem II

  • APPLIED MACHINE LEARNING

    39

    Feature extraction consists of extracting only a subset, the

    key components, of a situation, image, scene, conversation, etc.

    Dimensionality reduction The key components depends on the context. Feature extraction presupposes the existence of a

    mechanism and of background knowledge to restore the complete situation, image, scene, conversation.

    Features extraction

  • APPLIED MACHINE LEARNING

    40

    Features extraction Bio-Inspiration

    Extracting the relevant information Preattentive processing of orientation

    Sensitive to asymmetries

    Nothdurft, H.-C. (1991a). Ophthalmology and Visual Science, 32(4), 714.

  • APPLIED MACHINE LEARNING

    41

    Features extraction Bio-Inspiration

    Our eyes pick up rapidly what stands out: asymmetries or uncoordinated pattern of motion

  • APPLIED MACHINE LEARNING

    42

    Features extraction Bio-Inspiration

    A Model of Saliency-Based Visual Attention for Rapid Scene Analysis. Itti et al 2008

  • APPLIED MACHINE LEARNING

    44

    Features extraction Bio-Inspired IT Systems

    Multimodal saliency-based bottom-up attention a framework for the humanoid robot iCub , Ruesch et al, ICRA 2008

    http://www.youtube.com/watch?v=AWZxvE5g5Pw&feature=player_detailpage

  • APPLIED MACHINE LEARNING

    45

    Problem III

    You must solve one of the two problems below.

    You are provided with the following data:

    Age, gender, nationality, body weight, hair color, favorite hobbies, weekly timetable of all EPFL students and EPFL professors

    Which data do you remove from the input to your system?

    ROWS 4-6

    Predict number of people in the metro between 8h00 and 8h15

    ROWS 1-3

    Predict number of meals sold by each EPFL cafeteria at lunch

  • APPLIED MACHINE LEARNING

    46

    Machine learning algorithms are often classified according to the type of computation they can perform on a given dataset. Common types of computation include: • Classification: learn to put instances into pre-defined classes

    • Association: learn relationships between the attributes

    • Clustering: discover classes of instances that belong together

    • Time series prediction: learn to predict a numeric quantity instead of a class

    Other terms: On-line Learning, Connectionist Models

    Taxonomy in ML

  • APPLIED MACHINE LEARNING

    47

    • Supervised learning – where the algorithm learns a function or model that maps a set of inputs to a set of desired outputs.

    • Unsupervised learning – where the algorithm learns a model that represents a set of inputs without any feedback (no desired output, no external reinforcement).

    • Reinforcement learning – where the algorithm learns a mechanism that generates a set of outputs from one input in order to maximize a reward value (external and delayed feedback).

    • Learning to learn – where the algorithm learns its own inductive bias based on previous experiences

    Taxonomy in ML

  • APPLIED MACHINE LEARNING

    48

    • Supervised learning relates to a vast group of methods by which one estimates a model from a set of examples,

    The system is given the desired output.

    • When these examples are provided by a human expert,

    this is referred to robot learning from demonstration; robot programming by demonstration.

    Supervised learning

  • APPLIED MACHINE LEARNING

    ( ){ }( )

    2Goal: estimate function of state evolution , ,

    Make observations of the state of the system , , 1... .

    Estimate through mean-square error: min

    i i

    i i

    i

    x f x x x

    N x x i N

    f f x x

    = ∈

    =

    −∑

    1x

    2x x*: target

    Supervised learning: example

  • APPLIED MACHINE LEARNING

    Supervised learning

    Khansari Zadeh, S. M., Kronander, K. and Billard, A. (2012) Learning to Play Minigolf. Advanced Robotics.

  • APPLIED MACHINE LEARNING

    51 Noris, B., Nadel, J, Barker, M., Hadjikhani, N. and Billard, A. (2012) Investigating gaze of children with ASD in naturalistic settings. PLOS ONE.

    Supervised learning

    Where do the eyes look?

    Map image of the eyes to point in the camera image

  • APPLIED MACHINE LEARNING

    52

    What is Machine Learning? What is sometimes impossible to see for humans is easy for ML to pick.

    Noris, B., Keller, J-B. and Billard, A. (2011) A Wearable Gaze Tracking System for Children in Unconstrained Environments. Computer Vision and Image Understanding.

    Exploit information not only on the pupil, cornea, but also on wrinkles, eyelids and eyelashed pattern to infer gaze direction.

    Support Vector Regression can be used to learn this mapping

  • APPLIED MACHINE LEARNING

    53

    What is Machine Learning?

    Noris, B., Keller, J-B. and Billard, A. (2011) A Wearable Gaze Tracking System for Children in Unconstrained Environments. Computer Vision and Image Understanding.

    Support Vector Regression can be used to learn this mapping

    20 20 , 1...50i xx i∈ =

    Input: 50 images of the eyes, In grey color 20x20 pixels

    Output: 50 images of the scene, In grey color 240x320 pixels

    ( )y f x= 240 320 , 1...50i xy i∈ =Learn a function f:

  • APPLIED MACHINE LEARNING

    54

    Unsupervised Learning

    Unsupervised learning refers to a variety of methods by which a pair of signals y and x are associated but there is no explicit labeling as to which y should be associated to which x. This is often done through association, i.e. through associative learning.

  • APPLIED MACHINE LEARNING

    55

    Associative Learning

    Let us play a little game.

    When I say “satelite”, what do you think of?

    When I say “EPFL”, what do you think of?

    When I say “exam”, what do you think of?

  • APPLIED MACHINE LEARNING

    56

    Associative Learning

    Associative learning is multi-modal: Visual, auditory, proprioceptive, etc. It can represent - A concatenation of features Satelite Comics, Beer - Events bearing a close temporality EPFL TSOL, Classes - A cause-to-effect relationship Exam Grade, Vacation…

  • APPLIED MACHINE LEARNING

    57

    Associative Learning for Learning Word-Objects Relations

    H. Kozima, H. Yano, A robot that learns to communicate with human caregivers, in: Proceedings of the International Workshop on Epigenetic Robotics, 2001

  • APPLIED MACHINE LEARNING

    58

    Reinforcement Learning (RL)

    RL tries to infer the optimal path to the goal,

    through a process of Trial-and-error, so as to maximize the reward.

    Reinforcement learning is a tedious learning

    method. It is slow and is functional only in well-defined

    problems with small search space.

  • APPLIED MACHINE LEARNING

    59

    Reinforcement Learning Robotics Applications

    Atkeson & Schaal, Learning how to swing an inverted pendulum, ICRA 1997.

    The robot tracks the position of the two colored ball. Using a model of the inverse pendulum dynamics, it learns which joint angle displacement to produce to ensures that the ball remains in equilibrium.

  • APPLIED MACHINE LEARNING

    60

    Reinforcement Learning Robotics Applications

    Kober, J.; Peters, J. (2011). Policy Search for Motor Primitives in Robotics, Machine Learning, 84, 1-2, pp.171-203.

    Learning the “ball in a cup” task; A couple of successful examples are provided by a human; the robot then performs guided trials so as to optimize the reward (putting the ball in the cup).

    http://www.youtube.com/watch?v=Fhb26WdqVuE

  • APPLIED MACHINE LEARNING

    62

    Problem IV

    Supervised learning, unsupervised learning or reinforcement learning?

    1: Learning to ride a bicycle 2: Predicting the length of the queue at the cafeteria 11h30-12h30 3: Relating demographics (age/weight/nationality) with chocolate consumption 4: Estimating monthly returns of stocks in the financial market

  • APPLIED MACHINE LEARNING

    63

    Summary

    Machine Learning encompasses a large area of works which cannot all be covered here. We will focus on a subset of algorithms that form the foundation of most current advances in machine learning. We however omit several topics, which are covered in other courses on machine learning at EPFL.

  • APPLIED MACHINE LEARNING

    64

    Some Machine Learning Resources

    On-line resources: • http://www.machinelearning.org/index.html • http://www.pascal-network.org/ Network of excellence on Pattern Recognition, Statistical Modelling and Computational Learning (summer schools and workshops) Journals: • Machine Learning Journal, Kluwer Publisher • IEEE Transactions on Signal processing • IEEE Transactions on Pattern Analysis • IEEE Transactions on Pattern Recognition • The Journal of Machine Learning Research

    Conferences: • ICML: int. conf. on machine learning • Neural Information Processing Conference – on-line repository of all research papers, www.nips.org

  • APPLIED MACHINE LEARNING

    65

    LASA

  • APPLIED MACHINE LEARNING

    66

    Software for the class

    �Applied Machine Learning��Introduction Slide Number 3Slide Number 4Slide Number 5Slide Number 6Slide Number 7Slide Number 8Slide Number 9Slide Number 10Slide Number 11Slide Number 12Slide Number 13Slide Number 14Slide Number 15Slide Number 16Slide Number 17Slide Number 18Slide Number 19Slide Number 20Slide Number 21Slide Number 22Slide Number 23Slide Number 24Slide Number 25Slide Number 26Slide Number 27Slide Number 28Slide Number 29Slide Number 30Slide Number 31Slide Number 34Slide Number 35Slide Number 36Slide Number 37Slide Number 38Slide Number 39Slide Number 40Slide Number 41Slide Number 42Slide Number 44Slide Number 45Slide Number 46Slide Number 47Slide Number 48Slide Number 49Slide Number 50Slide Number 51Slide Number 52Slide Number 53Slide Number 54Slide Number 55Slide Number 56Slide Number 57Slide Number 58Slide Number 59Slide Number 60Slide Number 62Slide Number 63Slide Number 64Slide Number 65Slide Number 66