Introduction to Reinforcement Learning and Fintech Case Study

Introduction to RL and Liquid Labs Case Study. Jorge Davila-Chacon. CSMLS Meetup, October 15, 2015, Hamburg, Germany. Data Scientist, Liquid Labs; Research Associate, University of Hamburg.

Upload: jorge-davila-chacon

Post on 06-Apr-2017


Introduction to RL and Liquid Labs Case Study

Jorge Davila-Chacon

CSMLS Meetup - October 15, 2015 - Hamburg, Germany

Data Scientist, Liquid Labs

Research Associate, University of Hamburg

● Introduction to RL (20 mins)

● Overview of Architecture (10 mins)

● Implementation with iPython Notebook (15 mins)

● Analysis of Results (10 mins)

● Wrap Up (5 mins)

2

Who?

● Curious people!

● Math background?

● Academia, industry?

● Semantics > Syntactics

3

RL

Introduction

4

What?

● Learn by experience

● Learn with rewards

● Learn continuously

RL

5

RL Cycle

Figure 1. From “Reinforcement Learning: An Introduction”, Sutton and Barto (1998).

https://webdocs.cs.ualberta.ca/~sutton/book/ebook/the-book.html

6
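The RL cycle in Figure 1 (the agent acts, the environment returns a new state and a reward) can be sketched as a plain loop. This is a minimal illustration only: `CoinFlipEnv`, `run_episode` and the constant policy are hypothetical names invented here, not part of the talk or of any library.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess the outcome of a biased coin."""

    def __init__(self, p_heads=0.8, horizon=10, seed=0):
        self.p_heads = p_heads
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # a single dummy state

    def step(self, action):
        # action 1 = guess heads, action 0 = guess tails
        outcome = 1 if self.rng.random() < self.p_heads else 0
        reward = 1.0 if action == outcome else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return 0, reward, done


def run_episode(env, policy):
    """One pass through the cycle: state -> action -> reward, next state."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total


env = CoinFlipEnv()
episode_return = run_episode(env, policy=lambda s: 1)  # always guess heads
```

A learning agent would replace the constant policy with one that is updated from the observed rewards.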

How?

● Markov Decision Process

● Bellman’s equation

● Policy iteration

7

Markov Decision Process
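The slide content is not preserved in the transcript. As a reminder, one common way to write down a Markov Decision Process is given below. The notation (states S, actions A, transition kernel P, reward function R, discount factor γ) is an assumption chosen here and may differ from the one used on the original slide.

```latex
\[
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
P(s' \mid s, a) = \Pr(S_{t+1} = s' \mid S_t = s, A_t = a)
\]
% Markov property: the next state depends only on the current state and action.
\[
\Pr(S_{t+1} \mid S_t, A_t) = \Pr(S_{t+1} \mid S_0, A_0, \dots, S_t, A_t)
\]
```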

8

Bellman Equation

9

Bellman Equation
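The equations shown on these slides did not survive extraction. A standard form of the Bellman equation is sketched below, first for the value of a fixed policy π and then for the optimal action values. The symbols P, R and γ are assumed notation, not taken from the slides.

```latex
\begin{align}
V^{\pi}(s) &= \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)\,
  \bigl[ R(s, a, s') + \gamma V^{\pi}(s') \bigr] \\
Q^{*}(s, a) &= \sum_{s'} P(s' \mid s, a)\,
  \bigl[ R(s, a, s') + \gamma \max_{a'} Q^{*}(s', a') \bigr]
\end{align}
```

Both say the same thing: the value of a state (or state-action pair) equals the expected immediate reward plus the discounted value of what follows.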

10

Policy Iteration
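Policy iteration alternates two steps: evaluate the current policy, then make it greedy with respect to the resulting values. A minimal sketch on a hypothetical two-state, two-action MDP follows. The transition table `P`, the discount `GAMMA` and all function names are illustrative assumptions, not the example used in the talk.

```python
# Hypothetical MDP: P[s][a] is a list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9


def q_value(V, s, a):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def evaluate(policy, tol=1e-8):
    """Iterative policy evaluation: sweep until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def policy_iteration():
    policy = {s: 0 for s in P}  # start with an arbitrary policy
    while True:
        V = evaluate(policy)
        # Policy improvement: act greedily with respect to V.
        improved = {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}
        if improved == policy:
            return policy, V
        policy = improved


best_policy, best_values = policy_iteration()
```

This version needs the full model `P`. SARSA and Q-Learning follow the same evaluate-and-improve idea but learn from sampled transitions instead.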

11

Policy Iteration

● SARSA

● Q-Learning
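The two algorithms differ only in the target they bootstrap from: SARSA uses the action actually taken next (on-policy), Q-Learning uses the best next action (off-policy). A minimal tabular sketch, where the function names, the `(state, action)` key layout and the default `alpha` and `gamma` values are assumptions made for illustration:

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """On-policy: the target uses the next action a2 chosen by the agent."""
    target = r + gamma * Q[(s2, a2)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    """Off-policy: the target uses the greedy action in the next state."""
    target = r + gamma * max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Same transition, two different targets.
Q_sarsa = defaultdict(float, {(1, 0): 1.0, (1, 1): 2.0})
Q_qlearn = defaultdict(float, {(1, 0): 1.0, (1, 1): 2.0})
sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s2=1, a2=0)
q_learning_update(Q_qlearn, s=0, a=0, r=1.0, s2=1, actions=[0, 1])
```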

12


Why?

● Exploration

● Exploitation

With love…

● From: Monte Carlo

● To: TD Learning

● Eligibility Traces
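One common way to balance the exploration and exploitation listed on this slide is an epsilon-greedy rule: with probability epsilon pick a random action, otherwise pick the action with the highest current estimate. The sketch below is an assumption about how this could be coded. The talk does not show its own action-selection code in this transcript, and `epsilon_greedy` is a name chosen here.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index given a list of action-value estimates."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


greedy_action = epsilon_greedy([0.1, 0.5, 0.2], epsilon=0.0)
random_action = epsilon_greedy([0.1, 0.5, 0.2], epsilon=1.0)
```

With `epsilon=0.0` the rule is purely greedy, with `epsilon=1.0` it is purely random.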

14

From MC to TD
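The step from Monte Carlo to TD learning can be illustrated with two value updates. Monte Carlo waits for the end of the episode and moves each state toward the observed return. TD(0) updates after every step and bootstraps from the current estimate of the next state. The function names, the `(state, reward)` episode layout and the default step size are assumptions for this sketch.

```python
def mc_update(V, episode, alpha=0.1, gamma=1.0):
    """Every-visit Monte Carlo.

    episode is a list of (state, reward) pairs, where reward is the
    reward received on leaving that state.
    """
    G = 0.0
    for state, reward in reversed(episode):
        G = reward + gamma * G  # return from this state onward
        V[state] += alpha * (G - V[state])


def td0_update(V, s, r, s2, alpha=0.1, gamma=1.0, terminal=False):
    """TD(0): move V[s] toward r + gamma * V[s2] after a single step."""
    target = r if terminal else r + gamma * V[s2]
    V[s] += alpha * (target - V[s])


# Same experience: A -> B (reward 0), B -> end (reward 1).
V_mc = {"A": 0.0, "B": 0.0}
mc_update(V_mc, [("A", 0.0), ("B", 1.0)])

V_td = {"A": 0.0, "B": 0.0}
td0_update(V_td, "A", 0.0, "B")
td0_update(V_td, "B", 1.0, None, terminal=True)
```

After this one episode, Monte Carlo has already credited state A, while TD(0) has only updated B. Eligibility traces sit between these two extremes.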

15

Eligibility Traces

● SARSA

● Q-Learning
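With eligibility traces, each TD error is applied to all recently visited state-action pairs, weighted by a trace that decays by gamma times lambda per step. Below is a minimal sketch of one SARSA(λ) step with accumulating traces. The function name, the dictionary layout and the default parameter values are assumptions for illustration, not the implementation from the talk.

```python
from collections import defaultdict


def sarsa_lambda_step(Q, E, s, a, r, s2, a2, alpha=0.1, gamma=0.9, lam=0.8):
    """One SARSA(lambda) update with accumulating eligibility traces."""
    delta = r + gamma * Q[(s2, a2)] - Q[(s, a)]  # TD error
    E[(s, a)] += 1.0  # mark the current pair as eligible
    for key in list(E):
        Q[key] += alpha * delta * E[key]  # credit in proportion to the trace
        E[key] *= gamma * lam  # decay the trace


Q = defaultdict(float)
E = defaultdict(float)
sarsa_lambda_step(Q, E, s=0, a=0, r=0.0, s2=1, a2=0)  # no reward yet
sarsa_lambda_step(Q, E, s=1, a=0, r=1.0, s2=2, a2=0)  # reward arrives
```

The second step also raises `Q[(0, 0)]`, although the reward arrived one step later. Setting `lam=0.0` recovers one-step SARSA.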

16

Eligibility Traces

17

18

RL

Case Study: Simulation before deployment

iPython Notebook

● Colorado example

● Architecture

● Implementation

19

Graphs!

● Long run results

● Short run results

● Future Work

20

Long Run - Without Lambda

21

Long Run - With Lambda

22

Long Run - With Lambda

23

Long Run - With Lambda

24

Short Run - With Lambda

25

Short Run - With Lambda

26

Short Run - With Lambda

27

Future Work - Non-Monotonic Epsilon

28

Wrap Up

● Learn from scratch

● Adaptive learning

● On-line learning

● Research possibilities!

29

Thank you for coming!

Jorge Davila-Chacon

[email protected]

or LinkedIn

30