Introduction to RL and Liquid Labs Case Study
Jorge Davila-Chacon
CSMLS Meetup - October 15, 2015 - Hamburg, Germany
Data Scientist, Liquid Labs; Research Associate, University of Hamburg
● 20 mins: Introduction to RL
● 10 mins: Overview of Architecture
● 15 mins: Implementation with IPython Notebook
● 10 mins: Analysis of Results
● 5 mins: Wrap Up
Who?
● Curious people!
● Math background?
● Academia, industry?
● Semantics > Syntactics
What?
● Learn by experience
● Learn with rewards
● Learn continuously
RL
RL Cycle
Figure 1. From “Reinforcement Learning: An Introduction”, Sutton and Barto (1998).
https://webdocs.cs.ualberta.ca/~sutton/book/ebook/the-book.html
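The cycle in Figure 1 can be read as a loop: the agent observes a state, picks an action, and the environment answers with a reward and the next state. A minimal Python sketch of that loop follows. The corridor environment `ToyEnv` and the helper `run_episode` are hypothetical names made up for illustration; they are not from the talk.

```python
import random


class ToyEnv:
    """Hypothetical 1-D corridor with states 0..3; reaching state 3 ends the episode."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action: 0 = move left, 1 = move right (clipped to the corridor)
        move = 1 if action == 1 else -1
        self.state = min(3, max(0, self.state + move))
        done = self.state == 3
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run the agent-environment loop once and return (total reward, steps taken)."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(state)                   # agent acts on the observed state
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


def random_policy(state):
    """Agent that ignores the state and acts uniformly at random."""
    return random.choice([0, 1])
```

Calling `run_episode(ToyEnv(), random_policy)` runs one episode with a random agent; a learning agent would replace `random_policy` and update itself from each reward.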
How?
● Markov Decision Process
● Bellman’s equation
● Policy iteration
Markov Decision Process
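An MDP is usually specified by a set of states, a set of actions, transition probabilities, rewards and a discount factor. One way to hold that in Python is sketched below. The two-state example and every name in it (`P`, `GAMMA`, `sample_step`) are illustrative assumptions, not material from the talk.

```python
import random

# Transition model: P[state][action] -> list of (probability, next_state, reward).
# The states, actions and numbers are invented for illustration.
P = {
    "cool": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(0.5, "cool", 2.0), (0.5, "hot", 2.0)],
    },
    "hot": {
        "slow": [(0.5, "cool", 1.0), (0.5, "hot", 1.0)],
        "fast": [(1.0, "hot", -10.0)],
    },
}
GAMMA = 0.9  # discount factor (assumed value)


def sample_step(state, action, rng=random):
    """Sample (next_state, reward) for taking `action` in `state`."""
    outcomes = P[state][action]
    weights = [prob for prob, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward
```

The Markov property shows up in the signature of `sample_step`: the outcome depends only on the current state and action, not on the history.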
Bellman Equation
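The equations shown on this slide are not preserved in the text. For reference, the standard forms in the notation of Sutton and Barto (1998) are given below, where $P^{a}_{ss'}$ is the probability of moving from $s$ to $s'$ under action $a$, $R^{a}_{ss'}$ is the expected reward for that transition, and $\gamma$ is the discount factor. This is assumed to match what the slide showed, but that cannot be confirmed from the text.

```latex
% Bellman equation for the value of a policy \pi
V^{\pi}(s) = \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'}
             \left[ R^{a}_{ss'} + \gamma V^{\pi}(s') \right]

% Bellman optimality equation
V^{*}(s) = \max_{a} \sum_{s'} P^{a}_{ss'}
           \left[ R^{a}_{ss'} + \gamma V^{*}(s') \right]
```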
Policy Iteration
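Policy iteration alternates two steps: evaluate the current policy, then make the policy greedy with respect to the resulting values, until the policy stops changing. The sketch below runs it on a three-state toy MDP. The MDP, `GAMMA`, and all function names are assumptions chosen for illustration.

```python
# Toy MDP (assumed): P[s][a] -> list of (probability, next_state, reward).
# Action 1 moves right; entering state 2 from state 1 pays reward 1.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 1.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},  # absorbing state
}
GAMMA = 0.9


def q_value(V, s, a):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def evaluate(policy, V, tol=1e-8):
    """Iterative policy evaluation: sweep until values change by less than tol."""
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def improve(policy, V):
    """Greedy policy improvement; returns True if the policy did not change."""
    stable = True
    for s in P:
        best = max(P[s], key=lambda a: q_value(V, s, a))
        if q_value(V, s, best) > q_value(V, s, policy[s]) + 1e-12:
            policy[s] = best
            stable = False
    return stable


def policy_iteration():
    policy = {s: 0 for s in P}   # start with "always action 0"
    V = {s: 0.0 for s in P}
    while True:
        evaluate(policy, V)
        if improve(policy, V):
            return policy, V
```

On this toy problem the loop settles on action 1 in states 0 and 1. Note that this form needs the full transition model `P`, which is what SARSA and Q-Learning avoid by learning from sampled transitions.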
● SARSA
● Q-Learning
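The two methods listed above differ only in the target they bootstrap from: SARSA uses the value of the action actually taken next, Q-Learning uses the value of the greedy action. A minimal sketch of both updates, assuming a tabular `Q` keyed by `(state, action)` and illustrative values for `ALPHA` and `GAMMA`:

```python
from collections import defaultdict

ALPHA = 0.1  # step size (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def sarsa_update(Q, s, a, r, s2, a2):
    """On-policy update: bootstrap from the action a2 actually chosen in s2."""
    target = r + GAMMA * Q[(s2, a2)]
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s2, actions):
    """Off-policy update: bootstrap from the best-valued action in s2."""
    target = r + GAMMA * max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])


# Unseen (state, action) pairs default to a value of 0.0.
Q = defaultdict(float)
```

When the behaviour policy is fully greedy the two targets coincide; they diverge whenever an exploratory action is taken.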
Why?
● Exploration
● Exploitation
With love…
● From: Monte Carlo
● To: TD Learning
● Eligibility Traces
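The exploration/exploitation trade-off named on this slide is commonly handled with an epsilon-greedy rule: with probability epsilon take a random action, otherwise take the best-valued one. The function below is a sketch of that common rule under the assumption that action values are passed as a list; the talk's own action-selection code is not shown in the slides.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of action values.

    With probability `epsilon` explore (uniform random action);
    otherwise exploit (first action with the highest value).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))   # explore
    return q_values.index(max(q_values))      # exploit
```

Setting `epsilon=0.0` gives a purely greedy agent and `epsilon=1.0` a purely random one; a schedule that changes epsilon over time sits between the two.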
Eligibility Traces
● SARSA
● Q-Learning
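Eligibility traces let one TD error update every recently visited state-action pair, weighted by how long ago it was visited. Below is a sketch of a single SARSA(lambda) step with accumulating traces. The function name, the dictionary layout and the default parameter values are assumptions for illustration. The Q-Learning variant (Watkins's Q(lambda)) additionally resets the traces after an exploratory action, which is not shown here.

```python
from collections import defaultdict


def sarsa_lambda_step(Q, E, s, a, r, s2, a2, done,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one SARSA(lambda) update in place and return the TD error.

    Q: action values keyed by (state, action)
    E: eligibility traces keyed by (state, action)
    """
    target = r if done else r + gamma * Q[(s2, a2)]
    delta = target - Q[(s, a)]
    E[(s, a)] += 1.0                        # accumulate trace for the visited pair
    for key in list(E):
        Q[key] += alpha * delta * E[key]    # credit every pair by its trace
        E[key] *= gamma * lam               # then decay the trace
    return delta


Q = defaultdict(float)
E = defaultdict(float)
```

With `lam=0` only the most recent pair is updated, which recovers one-step SARSA; with `lam` close to 1 the update approaches a Monte Carlo style return. Traces in `E` should be cleared at the start of each episode.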
RL
Case Study: Simulation before deployment
IPython Notebook
● Colorado example
● Architecture
● Implementation
Graphs!
● Long run results
● Short run results
● Future Work
Long Run - Without Lambda
Long Run - With Lambda
Short Run - With Lambda
Future Work - Non-Monotonic Epsilon
Wrap Up
● Learn from scratch
● Adaptive learning
● On-line learning
● Research possibilities!
Thank you for coming!
Jorge Davila-Chacon
[email protected]
or LinkedIn