Peloton: The Self-Driving Database Management System · pavlo/slides/selfdriving-may2016.pdf ·...

Post on 07-Oct-2020


Peloton The Self-Driving Database Management System

1920s

Cornelius Von Pavlo

1950s

Joseph Pavlo

1980s

Timothy Pavlo

According to the Bureau of Labor Statistics, Database Administrators earned an average salary of $81,710 in 2015.

[Source]

What We Can Automate

» Physical Database Design

» Data Placement

» Query Optimization & Tuning

» Knob Configuration

What We Cannot Automate

» Security & Access Controls

» Non-cloud Resource Provisioning

» Data Integration & Cleaning

» Out-of-band Interruptions

» Source Code Version Control

Why We Are Different

» Previous automated tools only dealt with problems that had already occurred.

» Humans still make final decisions.

» We can identify cycles/patterns and react before problems occur.
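Reacting before a problem occurs can be pictured as forecasting a cyclic workload and checking the forecast against a limit. The following is a minimal sketch under stated assumptions: the function names, the seasonal averaging method, the sample query counts, and the capacity value are all hypothetical and are not taken from Peloton.

```python
from statistics import mean


def seasonal_forecast(history, period, horizon):
    """Predict the next `horizon` points by averaging the values seen at the
    same phase of every earlier cycle (a simple seasonal forecast)."""
    if period <= 0 or len(history) < period:
        raise ValueError("need at least one full cycle of history")
    forecast = []
    for step in range(horizon):
        phase = (len(history) + step) % period
        same_phase = history[phase::period]  # every observation at this phase
        forecast.append(mean(same_phase))
    return forecast


def steps_until_overload(forecast, capacity):
    """Return how many steps ahead the first predicted overload is, or None."""
    for step, load in enumerate(forecast):
        if load > capacity:
            return step
    return None


# Hypothetical workload: queries per hour over two cycles of four hours each.
history = [100, 400, 900, 200,
           120, 420, 1100, 220]
forecast = seasonal_forecast(history, period=4, horizon=4)
lead = steps_until_overload(forecast, capacity=950)
print(forecast, lead)
```

With a lead time in hand, a system could act (for example, by adjusting a configuration) before the peak arrives rather than after it.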

In-Memory

HTAP

Apache Licensed

Autonomous

Peloton

Transaction Manager

Execution Engine

Storage Manager

Logging & Snapshots

Better Concurrency

Lock-free Indexing

Adaptive Storage

Memcache Support

Query Compilation

NVM Optimizations

“The Brain”

Fall 2015

Spring 2016

Summer 2016

Adaptive Storage

» Change the layout of data over time based on how it is accessed.

» Data is transformed from an OLTP-optimized format to an OLAP-optimized format.

» SIGMOD 2016
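The idea in the bullets above can be illustrated with a toy model: a group of tuples starts in a row layout, and once it stops receiving writes it is reorganized into a column layout. This is only a sketch under assumptions. The class name `TileGroup`, the write-count heuristic, and the threshold are hypothetical and do not describe Peloton's actual mechanism.

```python
class TileGroup:
    """A batch of tuples stored either row-wise or column-wise."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.layout = "row"  # new data starts in the write-friendly layout
        self.data = [tuple(r) for r in rows]
        self.recent_writes = 0

    def update(self, row_idx, column, value):
        """Overwrite one attribute of one tuple and count the write."""
        self.recent_writes += 1
        col_idx = self.columns.index(column)
        if self.layout == "row":
            row = list(self.data[row_idx])
            row[col_idx] = value
            self.data[row_idx] = tuple(row)
        else:
            self.data[col_idx][row_idx] = value

    def scan(self, column):
        """Return every value of one column, as an analytical query would."""
        col_idx = self.columns.index(column)
        if self.layout == "row":
            return [row[col_idx] for row in self.data]
        return list(self.data[col_idx])

    def adapt(self, hot_threshold=1):
        """Pick a layout from the writes seen since the previous call."""
        target = "row" if self.recent_writes >= hot_threshold else "column"
        if target == "column" and self.layout == "row":
            self.data = [[row[i] for row in self.data]
                         for i in range(len(self.columns))]
        elif target == "row" and self.layout == "column":
            n_rows = len(self.data[0]) if self.data else 0
            self.data = [tuple(col[r] for col in self.data)
                         for r in range(n_rows)]
        self.layout = target
        self.recent_writes = 0


group = TileGroup(["A", "B", "C", "D"],
                  [(1, 10, "xxx", 5), (2, 20, "yyy", 6)])
group.update(0, "A", 123)      # a write keeps the group hot
group.adapt()
layout_while_hot = group.layout
group.adapt()                  # no writes since the last call: group is cold
layout_when_cold = group.layout
b_values = group.scan("B")
```

Queries see the same values in either layout; only the physical organization changes, which is what lets the reorganization happen in the background.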

Adaptive Storage

Original Data

A B C D

Adapted Data

A B C D

SELECT AVG(B) FROM myTable WHERE C < 'yyy'

UPDATE myTable SET A = 123, B = 456, D = 789 WHERE C = 'xxx'

Hot

Cold

A B C D

Adaptive Storage

[Figure: execution time (s), axis 0 to 1500, across sequence segments 0 to 10 for Row Store, Column Store, and Adaptive layouts; segments are labeled OLAP, OLTP, OLAP, OLTP.]

NVM Optimizations

» Avoid the overhead of DBMS recovery

» Larger-than-Memory databases

» Write-Behind Logging

» Data Tiering

» WIP Spring 2016

Write-Behind Logging

In-Memory Heap

ver

x1

x2

A B C

x3

NVM Heap

A x1

x2

B C D

x3

NVM Log

UPDATE myTable SET A = 123 WHERE C = 'xxx'

PCOMMIT

001:Txn1-0x0001

The Brain

» Component that controls all aspects of the DBMS to support autonomous operation.

» Lots of papers. This is going to take a while…

The Brain

» Part #1: Forecaster

» Part #2: Controller

» Part #3: Optimization Generator

» Part #4: Executor
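The slide names the four parts but not their interfaces. The sketch below is one hypothetical reading of how such parts could fit together: a forecaster predicts load, a generator proposes candidate actions, a controller picks one, and an executor applies it. Every function, action name, threshold, and benefit estimate here is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    estimated_benefit: float  # made-up score used only for ranking


def forecaster(query_counts):
    """Hypothetical: predict the next load as the mean of recent counts."""
    return sum(query_counts) / len(query_counts)


def optimization_generator(predicted_load, busy_threshold=500):
    """Hypothetical: propose candidate actions for the predicted load."""
    candidates = [Action("no_op", 0.0)]
    if predicted_load > busy_threshold:
        candidates.append(Action("add_index", 0.3))
        candidates.append(Action("switch_to_column_layout", 0.2))
    return candidates


def controller(candidates):
    """Hypothetical: choose the candidate with the highest estimated benefit."""
    return max(candidates, key=lambda action: action.estimated_benefit)


def executor(action, applied_log):
    """Hypothetical: 'apply' the action by recording it."""
    applied_log.append(action.name)


applied = []
load = forecaster([400, 600, 800])
chosen = controller(optimization_generator(load))
executor(chosen, applied)
print(load, applied)
```

Keeping the stages separate means each can be replaced independently, for example swapping the averaging forecaster for a learned model without touching the executor.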

Human vs. Machine

[Figure: throughput (txn/sec), axis 0 to 1000, for Default, MySQL Tuner, DBA, and OtterTune.]

TPC-C Workload – MySQL (v5.7)

PDL Database News

Deduplication for Database Replication Lianhong Xu, et al. (SoCC 2015)

Hybrid OLTP In-Memory Indexes Huanchen Zhang, et al. (SIGMOD 2016)

Larger-than-Memory Databases Lin Ma, et al. (DaMON 2016)

Database Systems (15-721)

» Grad-level course on database internals.

» Heavy emphasis on complex systems software engineering.

» All projects completed in Peloton.

» http://15721.courses.cs.cmu.edu

PDL DB 2015-2016 Statistics

» # of PhD Students: 4

» # of Post-docs: 1

» # of MS Students: 6

» # of Undergrads: 0

» # of Students Stabbed: 0

Dana Van Aken: 2016 National Science Foundation Fellowship

Joy Arulraj: 2016 Samsung Fellowship

Anthony Tomasic

Todd Mowry

Joy Arulraj

Prashanth Menon

Michael Zhang

Lin Ma

Matthew Perron

Dana Van Aken

Yingjun Wu

Ran Xian

Runshen Zhu

Jiexi Lin

Jianhong Li

Ziqi Wang

Peloton http://pelotondb.org
