a quick look at the “reinforcement learning”...

45
MVA-RL Course A Quick Look at the “Reinforcement Learning” course A. LAZARIC (SequeL Team @INRIA-Lille ) ENS Cachan - Master 2 MVA SequeL – INRIA Lille

Upload: others

Post on 12-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

MVA-RL Course

A Quick Look at the“Reinforcement Learning” course

A. LAZARIC (SequeL Team @INRIA-Lille)ENS Cachan - Master 2 MVA

SequeL – INRIA Lille

Page 2: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 2/16

Page 3: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 3/16

Page 4: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous robotics

I Elder careI Exploration of

unknown/dangerousenvironments

I Robotics for entertainment

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 4/16

Page 5: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Elder care

I Exploration ofunknown/dangerousenvironments

I Robotics for entertainment

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 4/16

Page 6: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Elder careI Exploration of

unknown/dangerousenvironments

I Robotics for entertainment

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 4/16

Page 7: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Elder careI Exploration of

unknown/dangerousenvironments

I Robotics for entertainment

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 4/16

Page 8: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applications

I Trading execution algorithmsI Portfolio managementI Option pricing

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 5/16

Page 9: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applications

I Trading execution algorithms

I Portfolio managementI Option pricing

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 5/16

Page 10: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applications

I Trading execution algorithmsI Portfolio management

I Option pricing

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 5/16

Page 11: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applications

I Trading execution algorithmsI Portfolio managementI Option pricing

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 5/16

Page 12: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy management

I Energy grid integrationI Maintenance schedulingI Energy market regulationI Energy production

management

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 6/16

Page 13: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy management

I Energy grid integration

I Maintenance schedulingI Energy market regulationI Energy production

management

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 6/16

Page 14: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy management

I Energy grid integrationI Maintenance scheduling

I Energy market regulationI Energy production

management

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 6/16

Page 15: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy management

I Energy grid integrationI Maintenance schedulingI Energy market regulation

I Energy productionmanagement

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 6/16

Page 16: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy management

I Energy grid integrationI Maintenance schedulingI Energy market regulationI Energy production

management

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 6/16

Page 17: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systems

I Web advertisingI Product recommendationI Date matching

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 7/16

Page 18: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systems

I Web advertising

I Product recommendationI Date matching

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 7/16

Page 19: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systems

I Web advertisingI Product recommendation

I Date matching

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 7/16

Page 20: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systems

I Web advertisingI Product recommendationI Date matching

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 7/16

Page 21: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systemsI Social applications

I Bike sharing optimizationI Election campaignI ER service optimizationI Intelligent Tutoring Systems

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 8/16

Page 22: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systemsI Social applications I Bike sharing optimization

I Election campaignI ER service optimizationI Intelligent Tutoring Systems

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 8/16

Page 23: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systemsI Social applications I Bike sharing optimization

I Election campaign

I ER service optimizationI Intelligent Tutoring Systems

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 8/16

Page 24: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systemsI Social applications I Bike sharing optimization

I Election campaignI ER service optimization

I Intelligent Tutoring Systems

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 8/16

Page 25: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Why: Important Problems

I Autonomous roboticsI Financial applicationsI Energy managementI Recommender systemsI Social applications I Bike sharing optimization

I Election campaignI ER service optimizationI Intelligent Tutoring Systems

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 8/16

Page 26: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 9/16

Page 27: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: Decision-Making under Uncertainty

Agent

Environment

state /actuationaction /

perception

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 10/16

Page 28: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: Reinforcement Learning

Reinforcement learning is learning what to do – how tomap situations to actions – so as to maximize a

numerical reward signal in an unknown uncertainenvironment. The learner is not told which actions to

take, as in most forms of machine learning, but she mustdiscover which actions yield the most reward by tryingthem (trial–and–error). In the most interesting and

challenging cases, actions may affect not only theimmediate reward but also the next situation and,

through that, all subsequent rewards (delayed reward).

“An introduction to reinforcement learning”,Sutton and Barto (1998).

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 11/16

Page 29: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 30: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 31: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 32: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 33: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 34: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 35: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 36: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

How: the Course

Agent

Environment

state /actuationaction /

perception

Formal and rigorous approach tothe RL’s way to decision-making under uncertainty

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 12/16

Page 37: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: the Highlights of the CourseHow to model an RL problem

I What: Markov decision processI Tools: probability, processes, Markov chain

How to solve exactly an RL problem

How to solve incrementally an RL problem

How to efficiently explore in an RL problem

How to solve approximately an RL problem

With examples from resource optimization, trade execution,(computer) games, recommendation systems.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 13/16

Page 38: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: the Highlights of the CourseHow to model an RL problem

How to solve exactly an RL problemI What: Dynamic programmingI Tools: fixed point, operators

How to solve incrementally an RL problem

How to efficiently explore in an RL problem

How to solve approximately an RL problem

With examples from resource optimization, trade execution,(computer) games, recommendation systems.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 13/16

Page 39: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: the Highlights of the CourseHow to model an RL problem

How to solve exactly an RL problem

How to solve incrementally an RL problemI What: temporal difference, Q-learningI Tools: stochastic approximation

How to efficiently explore in an RL problem

How to solve approximately an RL problem

With examples from resource optimization, trade execution,(computer) games, recommendation systems.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 13/16

Page 40: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: the Highlights of the CourseHow to model an RL problem

How to solve exactly an RL problem

How to solve incrementally an RL problem

How to efficiently explore in an RL problemI What: multi-armed bandit problemI Tools: concentration inequalities

How to solve approximately an RL problem

With examples from resource optimization, trade execution,(computer) games, recommendation systems.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 13/16

Page 41: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: the Highlights of the CourseHow to model an RL problem

How to solve exactly an RL problem

How to solve incrementally an RL problem

How to efficiently explore in an RL problem

How to solve approximately an RL problemI What: approximate dynamic programmingI Tools: statistical learning theory

With examples from resource optimization, trade execution,(computer) games, recommendation systems.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 13/16

Page 42: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

What: the Highlights of the Course

How to model an RL problem

How to solve exactly an RL problem

How to solve incrementally an RL problem

How to efficiently explore in an RL problem

How to solve approximately an RL problem

With examples from resource optimization, trade execution,(computer) games, recommendation systems.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 13/16

Page 43: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

When/What/Where

I 7 lectures

I 4 practical sessions (and homework) [1 point each]

I 1 final project (report and oral presentation) [16 points]

Opportunities for spring internship and Ph.D. positions.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 14/16

Page 44: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

When/What/Whereresearchers.lille.inria.fr/˜lazaric/Webpage/Teaching.html

Date Topic Classroom29/09 Intro/MDP Conference06/10 Dynamic Programming Condorcet13/10 RL Algorithms Condorcet20/10 TP on DP and RL Condorcet27/10 Multi-arm Bandit (1) Condorcet03/11 TP on Bandit Amphi Curie10/11 Multi-arm Bandit (2) [projects] Amphi Curie17/11 TP on Bandit Condorcet24/11 Approximate DP Condorcet01/12 TP on ADP Condorcet08/12 Sample Complexity of ADP Condorcet15/12 Guest lecture (TBD)

mid-Jan Evaluation (TBD)

Lectures are from 11am to 1pm, TP from 11am to 1:15pm.

A. LAZARIC – Introduction to Reinforcement Learning Fall 2015 - 15/16

Page 45: A Quick Look at the “Reinforcement Learning” courseresearchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course15_files/... · A Quick Look at the “Reinforcement Learning” course

Reinforcement Learning

Alessandro [email protected]

sequel.lille.inria.fr