self-driving databases & my pregnant wifepavlo/slides/selfdriving-tour2019.pdf · online...

94
Self-Driving Databases & My Pregnant Wife: The Hard Parts @andy_pavlo

Upload: others

Post on 22-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving Databases &My Pregnant Wife:The Hard Parts

@andy_pavlo

Page 2: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master
Page 3: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

I never took sex-ed.Everything I know is from a 1993 episode of Perfect Strangers.

I never took a ML course.Everything I know is from YouTube, research papers, and blogs.

Page 4: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

I never took sex-ed.Everything I know is from a 1993 episode of Perfect Strangers.

I never took a ML course.Everything I know is from YouTube, research papers, and blogs.

Page 5: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Autonomous Database Systems

Remove the burden of managing DBMSs from humans.

Target Tasks:Physical Database Design

Knob Configuration Tuning

Query Plan Tuning

Capacity Planning / Hardware Provisioning

4

Page 6: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Autonomous Database Systems

1970s: Self-Adaptive Systems

1990s: Self-Tuning Systems

2010s: Learned Components

5

Page 7: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Autonomous Database Systems

1970s: Self-Adaptive Systems

1990s: Self-Tuning Systems

2010s: Learned Components

5

Page 8: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Autonomous Database Systems

1970s: Self-Adaptive Systems

1990s: Self-Tuning Systems

2010s: Learned Components

5

Page 9: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Autonomous Database Systems

1970s: Self-Adaptive Systems

1990s: Self-Tuning Systems

2010s: Learned Components

5

Page 10: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

6

LocationKey

Neural Net

LearnedIndexes

B+Tree

Key

Location

Page 11: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

6

Transaction

LocationKey

Neural Net

LearnedIndexes

LearnedScheduling

B+Tree

Key

Location

PartitioningScheme

Page 12: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

6

Transaction

LocationKey

Neural Net

LearnedIndexes

LearnedScheduling

Un

supe

rvis

edC

lust

erin

g

B+Tree

Key

Location

Page 13: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

6

Transaction

LocationKey

Neural Net

LearnedIndexes

LearnedScheduling

Un

supe

rvis

edC

lust

erin

g

B+Tree

Key

Location

Page 14: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

6

Transaction SELECT *FROM A, B, CWHERE A.id = B.idAND A.id = C.id

A⨝ A.id=B.id

B C

A.id=C.id ⨝Histograms

Sketches

Neural NetLocationKey

Neural Net

LearnedIndexes

LearnedScheduling

LearnedPlanning

Un

supe

rvis

edC

lust

erin

g

B+Tree

Key

Location

Page 15: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

6

Transaction SELECT *FROM A, B, CWHERE A.id = B.idAND A.id = C.id

A⨝ A.id=B.id

B C

A.id=C.id ⨝Histograms

Sketches

Neural NetLocationKey

Neural Net

LearnedIndexes

LearnedScheduling

LearnedPlanning

Un

supe

rvis

edC

lust

erin

g

B+Tree

Key

Location

Page 16: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS

A system that configures, manages, and optimizes itself for the target database and its workload without any human input beyond the initialization parameters.

Objective Function (throughput, latency, cost, energy)

Constraints (SLOs, resources, cost)

The system is not allowed to make any change that requires a human to make a value judgement.

7

Page 17: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Environment

Objective

Constraints

Action

RewardState

Agent

Page 18: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Tenets

➊ Explore unknown configurations.

➋ Generalize over environments / applications / hardware.

➌ Reason about delayed consequences.

9

Page 19: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

The Hard Parts

State Modeling

Training Data Collection

Reward Observation

10

Page 20: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

1. STATE MODELINGSuccinctly & accurately represent the environment state.

Page 21: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master
Page 22: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

StochasticCannot easily look inside to collect more data.

Non-StationaryPreferences and habits change from one day to the next.

EpisodicPregnancy is over after nine months.

Page 23: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

StochasticCannot exactly model the DBMS's internals.

Non-StationaryReward of an action can change over time.

Non-EpisodicDBMS must run forever. No terminating states.

Page 24: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Markov Decision Process Models

Not possible to know all possible DBMS states or transition probabilities ahead of time.

Encode entire history of the DBMS in its state vector:Database Contents

Database Physical Design

Workload

Knob Configuration

Hardware Resources

15

Page 25: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Markov Decision Process Models

Not possible to know all possible DBMS states or transition probabilities ahead of time.

Encode entire history of the DBMS in its state vector:Database Contents

Database Physical Design

Workload

Knob Configuration

Hardware Resources

15

Page 26: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

16

CREATE TABLE xxx (col1 INT PRIMARY KEY,col2 LONG NOT NULL,col3 INT DEFAULT NULL,col4 VARCHAR(8) NOT NULL

);

INSERT INTO xxxSELECT x.id,

99,(RANDOM()*1000)::int,SUBSTRING(MD5(id::text),1,1))

FROM generate_series(1, 1000) AS x (id);

Page 27: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

16

1000,4,CREATE TABLE xxx (

col1 INT PRIMARY KEY,col2 LONG NOT NULL,col3 INT DEFAULT NULL,col4 VARCHAR(8) NOT NULL

);

INSERT INTO xxxSELECT x.id,

99,(RANDOM()*1000)::int,SUBSTRING(MD5(id::text),1,1))

FROM generate_series(1, 1000) AS x (id);

Max Size

Avg Size

Min Val

Max Val

Cardinality

⮱2,8,8,99,99,1,

⮱1,4,4,0,997,630,

⮱3,8,1,'0','f',16,

⮱1,4,4,1,1000,1000,

# of Tuples # of Attributes

Type

Table State

Page 28: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

16

1000,4,CREATE TABLE xxx (

col1 INT PRIMARY KEY,col2 LONG NOT NULL,col3 INT DEFAULT NULL,col4 VARCHAR(8) NOT NULL

);Max Size

Avg Size

Min Val

Max Val

Cardinality

⮱2,8,8,99,99,1,

⮱1,4,4,0,997,630,

⮱3,8,1,'0','f',16,

⮱1,4,4,1,1000,1000,

# of Tuples # of Attributes

Type

Table State

Add/Drop Indexes?

Change Hardware?

Alter Physical Layout?

Tune Knobs?

Page 29: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

16

1000,4,CREATE TABLE xxx (

col1 INT PRIMARY KEY,col2 LONG NOT NULL,col3 INT DEFAULT NULL,col4 VARCHAR(8) NOT NULL

);Max Size

Avg Size

Min Val

Max Val

Cardinality

⮱2,8,8,99,99,1,

⮱1,4,4,0,997,630,

⮱3,8,1,'0','f',16,

⮱1,4,4,1,1000,1000,

# of Tuples # of Attributes

Type

Table StateALTER TABLE xxx ADD COLUMN col5 INT;

Add/Drop Indexes?

Change Hardware?

Alter Physical Layout?

Tune Knobs?

Page 30: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

16

1000,4,CREATE TABLE xxx (

col1 INT PRIMARY KEY,col2 LONG NOT NULL,col3 INT DEFAULT NULL,col4 VARCHAR(8) NOT NULL

);

⮱2,8,8,99,99,1,

⮱1,4,4,0,997,630,

⮱3,8,1,'0','f',16,

⮱1,4,4,1,1000,1000,

# of Tuples # of Attributes

Table StateALTER TABLE xxx ADD COLUMN col5 INT;

Add/Drop Indexes?

Change Hardware?

Alter Physical Layout?

Tune Knobs?

Page 31: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

17

SELECT x1.col1, COUNT(*)FROM xxx AS x1JOIN xxx AS x2ON x1.col1 > x2.col3

GROUP BY x1.col1;

Add Index

Agent

Page 32: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

17

SELECT x1.col1, COUNT(*)FROM xxx AS x1JOIN xxx AS x2ON x1.col1 > x2.col3

GROUP BY x1.col1;

Expected: 100

Estimated State

Add Index

Agent

Actual States

Page 33: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

17

SELECT x1.col1, COUNT(*)FROM xxx AS x1JOIN xxx AS x2ON x1.col1 > x2.col3

GROUP BY x1.col1;

Expected: 100

Actual: -10

Estimated State

Add Index

Agent

Actual States

Page 34: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

17

SELECT x1.col1, COUNT(*)FROM xxx AS x1JOIN xxx AS x2ON x1.col1 > x2.col3

GROUP BY x1.col1;

Expected: 100

Actual: -10

Estimated State

Add Index

Agent

Actual States

Page 35: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

17

SELECT x1.col1, COUNT(*)FROM xxx AS x1JOIN xxx AS x2ON x1.col1 > x2.col3

GROUP BY x1.col1;

Expected: 100

Actual: -10

Estimated State

Add Index

Agent

Actual States

ANALYZE;

Page 36: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Database Contents State

17

SELECT x1.col1, COUNT(*)FROM xxx AS x1JOIN xxx AS x2ON x1.col1 > x2.col3

GROUP BY x1.col1;

Expected: 100

Actual: -10

Estimated State

Add Index

Agent

Actual States

ANALYZE;

Page 37: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size

Page 38: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size32GB

Page 39: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size32GB 128GB

Page 40: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size32GB 128GB

Page 41: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size

CREATE BUFFERPOOL new_bpBLOCKSIZE 64PAGESIZE 8192;

64, 8192,…

32GB 128GB

Page 42: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size

CREATE BUFFERPOOL new_bpBLOCKSIZE 64PAGESIZE 8192;

CREATE BUFFERPOOL new_bp2 BLOCKSIZE 128PAGESIZE 4096;

64, 8192,…128, 4096,…

32GB 128GB

Page 43: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Knob Configuration State

18

Knob Configurations

16, 2, 4,…# Exec Threads

Query Cache Size

Parallel Query Threads

# of Background Threads

20GB, 8GB, 500MB,…Buffer Pool Size Join Buffer Size

CREATE BUFFERPOOL new_bpBLOCKSIZE 64PAGESIZE 8192;

CREATE BUFFERPOOL new_bp2 BLOCKSIZE 128PAGESIZE 4096;

64, 8192,…128, 4096,…

32GB 128GB

Page 44: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Potential Solutions

➊ Reduce model dimensionality to avoid variability.Examples: Hashing Kernels, Auto-Encoders, PCA

These exacerbate instability and destroy information.

➋ Store hardware resource knobs as percentages relative to the available amount.

➌ Use hierarchical models that isolate DBMS components to reduce the propagation of changes.

19

Page 45: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

2. DATA COLLECTIONObserve the system under varying scenarios to gain experience.

Page 46: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Training Data AcquisitionCan only make a limited number of observations. Need

Page 47: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Training Data AcquisitionCan only make a limited number of observations. Need

Page 48: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Training Data AcquisitionCan only make a limited number of observations. Need

Page 49: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Offline AcquisitionTargeted experiments on individual components.

Online AcquisitionCollect metrics and data from the system running in production.

Page 50: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Offline Training Data

➊ End-to-End BenchmarksRun sample workloads using the full DBMS.

23

Logging

Storage

Execution

Planning MetricsSQL

Agent

BenchmarkDriver

Page 51: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Offline Training Data

➊ End-to-End BenchmarksRun sample workloads using the full DBMS.

➋ Component TrainersRun microbenchmarks on a subset of components.

23

Storage

MetricsSQL

Agent

BenchmarkDriver

Microbench

API

Page 52: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Offline Training Data

➊ End-to-End BenchmarksRun sample workloads using the full DBMS.

➋ Component TrainersRun microbenchmarks on a subset of components.

23

MetricsSQL

Agent

BenchmarkDriver

Microbench

APIStorage v2

Page 53: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Offline Collection Logging

24

0

5

10

15

20

100 10,000 1,000,000

End-to-End (YCSB)

La

ten

cy (

ms )

Log Buffer Size (bytes)

0

5

10

15

20

100 10,000 1,000,000

Microbenchmark

Log Buffer Size (bytes)

40,000 Observations 1,000 Observations

Page 54: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

Collect data from the DBMS running in production.

We must be more conservative about action exploration.Cannot allow for any perceivable effect to applications / users.

No DBMS can fully support this today (e.g., mandatory restarts).

25

Page 55: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

Master Replica

Replica

Actions

Actions

Agent

Page 56: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

Master Replica

Replica

Actions Actions

Actions

Agent

Page 57: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

Master Replica

Replica

Failover Time?

Actions

Actions

Agent

Page 58: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

Master Replica

Replica

Failover Time?

Actions

Actions

Agent

Slower

Faster

Page 59: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

MasterReadsWrites

Writes

Writes

Replica

Replica

App Server

Agent

Page 60: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

MasterReadsWrites

Physical Log?

SQL Statements?

Replica

ReplicaPhysical Log?

SQL Statements?

App Server

Agent

Page 61: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

MasterReadsWrites

Physical Log?

SQL Statements?

Reads

Reads

Replica

ReplicaPhysical Log?

SQL Statements?

App Server

Agent

Page 62: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

MasterReadsWrites

Physical Log?

SQL Statements?

Replica

ReplicaPhysical Log?

SQL Statements?

App Server

Agent

Reads?

Reads?

Page 63: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Online Training Data

26

Master Replica

Replica

App Server

Actions

Agent

Page 64: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Alternative: Imitation Learning

Observe an expert tuning the DBMS and then try to train the system to mimic those decisions.

Still need to capture the state of the DBMS before each change to extrapolate why the DBA made the change.

Unlikely to work well because training data will be sparse/noisy.

27

Page 65: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

3. REWARD OBSERVATIONHow to identify the reward of a deployed action.

Page 66: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Short-Term RewardsWe immediately know when something is wrong.

Long-Term RewardsDifficult to determine whether we are doing the correct things now for our child in the future.

Page 67: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Short-Term RewardsWe immediately know when something is wrong.

Long-Term RewardsDifficult to determine whether we are doing the correct things now for our child in the future.

Page 68: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Short-Term RewardsWe immediately know when something is wrong.

Long-Term RewardsDifficult to determine whether we are doing the correct things now for our child in the future.

Page 69: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Short-Term RewardsWe immediately know when something is wrong.

Long-Term RewardsDifficult to determine whether we are doing the correct things now for our child in the future.

Page 70: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Short-Term RewardsWe immediately know when something is wrong.

Long-Term RewardsDifficult to determine whether we are doing the correct things now for our child in the future.

Page 71: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Short-Term RewardsMust immediately detect whether the DBMS violates constraints.

Long-Term RewardsIdentify long-term workload trends and how long it will take to prepare accordingly.

Page 72: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Deployment Costs

The DBMS must predict how much time and resources it will take to deploy each action.

Assume that the system only deploys one action at a time.

The cost of deploying actions can affect the objective function.

31

Compress Table

Agent

Variable Action Resources

Page 73: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Hardware Degradations

Transient hardware problems are difficult to detect and could mask the true reward of an action.

Likely more problematic in distributed / shared-disk systems.

Non-Elegant Solutions:Run periodic microbenchmarks and compare with historical data.

Train classifiers on hardware stats (e.g., SMART).

Tweak exploration policy to reconsider recent actions.

32

Page 74: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Reward Magnitudes

The importance of changes in the objective function depends on the workload context.

May need multiple policies to prefer one workload type over another in mixed environments.

33

OLTP Workload

Latency:100ms → 50ms

OLAP Workload

Latency:500ms → 250ms

Page 75: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Reward Magnitudes

The importance of changes in the objective function depends on the workload context.

May need multiple policies to prefer one workload type over another in mixed environments.

33

Mixed Workload

Latency:100ms → 50ms

Latency:500ms → 250ms -50%

-50%

Page 76: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Reward Magnitudes

The importance of changes in the objective function depends on the workload context.

May need multiple policies to prefer one workload type over another in mixed environments.

33

Mixed Workload

Latency:100ms → 50ms -50%

Page 77: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Other Hard Parts

Page 78: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Other Hard Parts

➊ Action Selection Policies

➋ Transparency / Explainability

➌ Human Interaction

35

Page 79: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Weren't you building a DBMS at CMU?

Page 80: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 81: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 82: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 83: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 84: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 85: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 86: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Project

→ Latch-free MVCC

→ Hybrid Storage (Row + Column)

→ LLVM Execution Engine

→ Latch-free Bw-Tree

→ Postgres Compatible

→ Open-Source (Apache)

Pelotonpelotondb.io

Page 87: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

New System (2019)

→ Training & Modeling as First-Class Concepts

→ Zero Restart Action Deployments

→ Latch-free MVCC

→ Native Apache Arrow Storage

→ MemSQL-style LLVM Execution Engine

→ Postgres Compatible

→ Open-Source (MIT)

40

Page 88: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

New System (2019)

→ Training & Modeling as First-Class Concepts

→ Zero Restart Action Deployments

→ Latch-free MVCC

→ Native Apache Arrow Storage

→ MemSQL-style LLVM Execution Engine

→ Postgres Compatible

→ Open-Source (MIT)

40

Page 89: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

New System (2019)

→ Training & Modeling as First-Class Concepts

→ Zero Restart Action Deployments

→ Latch-free MVCC

→ Native Apache Arrow Storage

→ MemSQL-style LLVM Execution Engine

→ Postgres Compatible

→ Open-Source (MIT)

40

Page 90: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

New System (2019)

→ Training & Modeling as First-Class Concepts

→ Zero Restart Action Deployments

→ Latch-free MVCC

→ Native Apache Arrow Storage

→ MemSQL-style LLVM Execution Engine

→ Postgres Compatible

→ Open-Source (MIT)

40

Page 91: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Self-Driving DBMS Gym

Building a simulator is just as hard as building a real DBMS.

No existing DBMS has the sufficient control APIs, metrics, and infrastructure to support rapid experimentation.

We are attempting to build the first DBMS "gym" to make it easier for ourselves and others to develop new ML methods for autonomous control.

41

Page 92: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

Final Thoughts

End-to-end reinforcement learning is unlikely to work.Will require an impossibly large amount of training data.

No existing way to model infinite action spaces.

There are engineering efforts that existing DBMSs should pursue now to support automation in the future.

42

Page 93: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master
Page 94: Self-Driving Databases & My Pregnant Wifepavlo/slides/selfdriving-tour2019.pdf · Online Training Data ... SQL Statements? App Server Agent Reads? Reads? Online Training Data 26 Master

44

END@andy_pavlo