@ANDY_PAVLO WHAT NON-VOLATILE MEMORY MEANS FOR THE FUTURE OF DATABASE SYSTEMS



1973

1974

1978

1986

1994

2010

2016

The Future

Non-Volatile Memory

• Persistent storage with byte-addressable operations.
• Fast read/write latencies.
• No difference between random vs. sequential access.

What does NVM mean for DBMSs?

• Thinking of NVM as just a faster SSD is not interesting.
• We want to use NVM as permanent storage for the database, but this has major implications.
  – Operating System Support
  – Cloud Provider Provisioning
  – Database Management System Architectures

Existing Systems

NVM-Only Storage

Hybrid DBMS


Chapter I – Existing Systems

• Investigate how existing systems perform with NVM for write-heavy transaction processing (OLTP) workloads.
• Evaluate two types of DBMS architectures.
  – Disk-oriented (MySQL)
  – In-memory (H-Store)

A PROLEGOMENON ON OLTP DATABASE SYSTEMS FOR NON-VOLATILE MEMORY
ADMS@VLDB 2015

DISK-ORIENTED: Buffer Pool, Table Heap, Log, Snapshots

IN-MEMORY: Table Heap, Log, Snapshots

Intel Labs NVM Emulator

• Instrumented motherboard that slows down access to the memory controller with tunable latencies.
• Special assembly to emulate upcoming Xeon instructions for flushing cache lines.

[Diagram: STORE, L1 Cache, L2 Cache, PCOMMIT]
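
Why flushing cache lines matters can be illustrated with a toy model. The sketch below is a pure-Python simulation and not the emulator's interface: `ToyNVM`, `store`, `load`, `flush_line`, and `crash` are hypothetical names, and the 64-byte line size is an assumption. It only shows that a store is not durable until its cache line has been written back.

```python
# Toy model: stores land in a volatile cache and reach NVM only on flush.
# All names here are hypothetical; this is not the Intel emulator's API.

CACHE_LINE = 64  # assumed cache-line size in bytes


class ToyNVM:
    def __init__(self, size):
        self.nvm = bytearray(size)   # durable contents
        self.cache = {}              # line number -> dirty copy (volatile)

    def store(self, addr, value):
        """Write one byte; it stays in the volatile cache."""
        line = addr // CACHE_LINE
        if line not in self.cache:
            start = line * CACHE_LINE
            self.cache[line] = bytearray(self.nvm[start:start + CACHE_LINE])
        self.cache[line][addr % CACHE_LINE] = value

    def load(self, addr):
        """Read one byte, preferring the cached copy."""
        line = addr // CACHE_LINE
        if line in self.cache:
            return self.cache[line][addr % CACHE_LINE]
        return self.nvm[addr]

    def flush_line(self, addr):
        """Write the cache line holding addr back to NVM."""
        line = addr // CACHE_LINE
        dirty = self.cache.pop(line, None)
        if dirty is not None:
            start = line * CACHE_LINE
            self.nvm[start:start + CACHE_LINE] = dirty

    def crash(self):
        """Power loss: everything still in the cache is gone."""
        self.cache.clear()


mem = ToyNVM(256)
mem.store(10, 0xAB)      # flushed below, so it survives the crash
mem.flush_line(10)
mem.store(100, 0xCD)     # never flushed, so it is lost
mem.crash()
```

After the simulated crash, address 10 still holds its value while address 100 reads back as zero, which is the failure mode a DBMS has to guard against when it treats NVM as permanent storage.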

Experimental Evaluation

• Compare architectures on Intel Labs NVM emulator.
• Yahoo! Cloud Serving Benchmark:
  – 10 million records (~10GB)
  – Database 8x larger than memory
  – Variable skew
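
The "variable skew" setting can be pictured with a small key generator. The slides do not say how skew was produced; the sketch below assumes a Zipfian distribution, and `zipf_weights`, `sample_keys`, `hot_fraction`, and the `theta` parameter are hypothetical names chosen for illustration.

```python
import random


def zipf_weights(num_keys, theta):
    """Unnormalized Zipfian weights: key rank r gets weight 1 / r**theta."""
    return [1.0 / (rank ** theta) for rank in range(1, num_keys + 1)]


def sample_keys(num_keys, theta, num_samples, seed=0):
    """Draw key ids in [0, num_keys) with Zipfian skew; theta=0 is uniform."""
    rng = random.Random(seed)
    weights = zipf_weights(num_keys, theta)
    return rng.choices(range(num_keys), weights=weights, k=num_samples)


def hot_fraction(keys, num_keys, top_share=0.1):
    """Fraction of accesses that hit the hottest top_share of the key space."""
    hot_limit = int(num_keys * top_share)
    return sum(1 for k in keys if k < hot_limit) / len(keys)


high_skew = sample_keys(1000, theta=1.2, num_samples=20000)
low_skew = sample_keys(1000, theta=0.0, num_samples=20000)
```

With high skew most accesses fall on a small set of hot keys, while `theta=0.0` spreads them evenly, so `hot_fraction` comes out much larger for `high_skew` than for `low_skew`.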

YCSB: Read-Only Workload, 2x Latency Relative to DRAM

[Chart: throughput (TXN/SEC, axis 0 to 200,000) vs. skew amount (high→low) for MySQL and H-Store]

YCSB: 50% Reads / 50% Writes Workload, 2x Latency Relative to DRAM

[Chart: throughput (TXN/SEC, axis 0 to 40,000) vs. skew amount (high→low) for MySQL and H-Store; a further panel is labeled "8x Latency"]

LESSONS

1. NVM latency does not have a large impact.
2. Logging is a major performance bottleneck.
3. Legacy DBMSs are not prepared to run on NVM.

What would Larry Ellison do?

Chapter II – NVM-Only Storage

• Evaluate storage and recovery methods for a system that only has NVM.
• Testbed DBMS with pluggable storage engines.
• We had to build our own NVM-aware memory allocator.

LET'S TALK ABOUT STORAGE & RECOVERY METHODS FOR NON-VOLATILE MEMORY DATABASE SYSTEMS
SIGMOD 2015


DBMS Architectures

• In-Place: Table Heap, Log + Snapshots
• Copy-on-Write: Table Heap, No Logging
• Log-Structured: No Table Heap, Log-only Storage

In-Place Engine

UPDATE table SET val=ABC WHERE id=123

[Diagram: Table Heap, Log, and Snapshots, all on NVM; steps 1 to 3 write a Delta Record and two New Tuple copies]
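
The three writes in the diagram can be sketched as a toy engine. The mapping assumed here (step 1 appends a delta record to the log, step 2 updates the table heap, step 3 copies the tuple into a snapshot) is an interpretation of the slide, and `InPlaceEngine` with its methods is a hypothetical name, not code from the testbed DBMS.

```python
import copy


class InPlaceEngine:
    """Toy in-place engine: write-ahead log, table heap, periodic snapshots."""

    def __init__(self):
        self.heap = {}        # tuple id -> dict of column values
        self.log = []         # delta records: (tuple id, before, after)
        self.snapshot = {}    # last checkpointed copy of the heap

    def update(self, tuple_id, changes):
        old = self.heap.get(tuple_id, {})
        before = {col: old.get(col) for col in changes}
        # Step 1 (assumed): append a delta record to the log.
        self.log.append((tuple_id, before, dict(changes)))
        # Step 2 (assumed): apply the change to the tuple in the table heap.
        self.heap[tuple_id] = {**old, **changes}

    def checkpoint(self):
        # Step 3 (assumed): copy the heap into a snapshot, truncate the log.
        self.snapshot = copy.deepcopy(self.heap)
        self.log.clear()

    def recover(self):
        """Rebuild the heap from the last snapshot plus the log."""
        self.heap = copy.deepcopy(self.snapshot)
        for tuple_id, _before, after in self.log:
            self.heap[tuple_id] = {**self.heap.get(tuple_id, {}), **after}


engine = InPlaceEngine()
engine.update(123, {"val": "OLD"})
engine.checkpoint()
engine.update(123, {"val": "ABC"})   # UPDATE table SET val=ABC WHERE id=123
copies = [engine.log[-1][2]["val"], engine.heap[123]["val"]]
engine.heap = {}                      # simulate losing the heap
engine.recover()
```

The same value ends up written to the log and the heap, and again to the snapshot at the next checkpoint, which is the duplication an NVM-only system would want to avoid.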

NVM-Optimized Architectures


• Use non-volatile pointers to only record what changed rather than how it changed.

• Be careful about how & when things get flushed from CPU caches to NVM.

NVM-Aware In-Place Engine

UPDATE table SET val=ABC WHERE id=123

[Diagram: Table Heap and Log; steps 1 and 2; the Table Heap receives the New Tuple, and the Log Record holds a TxnId and Tuple Pointers]
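
A minimal sketch of the pointer-based log follows, assuming that a list index can stand in for a non-volatile pointer and that the log is only needed to undo uncommitted work. `NVMAwareInPlaceEngine` and its methods are hypothetical names; the real engine's record format is not given on the slide.

```python
class NVMAwareInPlaceEngine:
    """Toy engine whose log holds pointers to tuples, not copies of them."""

    def __init__(self):
        self.slots = []       # tuple versions; stands in for tuples on NVM
        self.index = {}       # tuple id -> slot number of the current version
        self.log = []         # records: (txn id, tuple id, old slot, new slot)

    def update(self, txn_id, tuple_id, values):
        # Write the new tuple version once, directly into durable storage.
        self.slots.append(dict(values))
        new_slot = len(self.slots) - 1
        old_slot = self.index.get(tuple_id)
        # The log record carries only the txn id and slot "pointers".
        self.log.append((txn_id, tuple_id, old_slot, new_slot))
        self.index[tuple_id] = new_slot

    def commit(self, txn_id):
        # Once committed, the txn's records are no longer needed for undo.
        self.log = [rec for rec in self.log if rec[0] != txn_id]

    def undo_uncommitted(self):
        """Recovery: roll back whatever is still in the log, newest first."""
        for _txn, tuple_id, old_slot, _new in reversed(self.log):
            if old_slot is None:
                self.index.pop(tuple_id, None)
            else:
                self.index[tuple_id] = old_slot
        self.log.clear()

    def read(self, tuple_id):
        return self.slots[self.index[tuple_id]]


engine = NVMAwareInPlaceEngine()
engine.update(1, 123, {"val": "OLD"})
engine.commit(1)
engine.update(2, 123, {"val": "ABC"})   # txn 2 never commits
record = engine.log[-1]
engine.undo_uncommitted()
```

The log record for the update contains no tuple data at all, only a transaction id and two slot numbers, so the tuple itself is written once instead of being copied into the log.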

Evaluation

• Testbed system using the Intel NVM hardware emulator.
• Yahoo! Cloud Serving Benchmark
  – 2 million records + 1 million transactions
  – High-skew setting

YCSB: 10% Reads / 90% Writes Workload, 2x Latency Relative to DRAM

[Chart: throughput (TXN/SEC, axis 0 to 1,200,000) for In-Place, Copy-on-Write, and Log-Structured engines, Traditional vs. NVM-Optimized; annotated gains: ↑63%, ↑122%, ↑50%]

YCSB: 10% Reads / 90% Writes Workload, 2x Latency Relative to DRAM

[Chart: NVM stores (millions, axis 0 to 400) for In-Place, Copy-on-Write, and Log-Structured engines, Traditional vs. NVM-Optimized; annotated reductions: ↓40%, ↓25%, ↓20%]

YCSB: Elapsed time to replay log with varying log sizes, 2x Latency Relative to DRAM

[Chart: recovery time (ms, log scale 0.01 to 1000) at log sizes 10^3, 10^4, and 10^5 for In-Place, Copy-on-Write, and Log-Structured engines, Traditional vs. NVM-Optimized; one annotation reads "No Recovery Needed"]

LESSONS

1. Using NVM correctly improves throughput & reduces wear-down.
2. Avoid block-oriented components.
3. NVM-only systems are 15-20 years away.

What would Nikita Kahn do?

Chapter III – Hybrid DBMS

• Design and build a new in-memory DBMS that will be ready for NVM when it becomes available.
• Hybrid Storage + Hybrid Workloads
  – DRAM + NVM oriented architecture
  – Fast Transactions + Real-time Analytics

Adaptive Storage

SELECT AVG(B) FROM myTable WHERE C < 'yyy'

UPDATE myTable SET A = 123, B = 456, C = 789 WHERE D = 'xxx'

[Diagram: Original Data vs. Adapted Data; a table with columns A, B, C, D divided into Hot and Cold regions]

BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND COLUMN-STORES FOR HYBRID WORKLOADS
SIGMOD 2016
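
The idea of adapting a table's layout can be sketched with groups of tuples that are stored either row-wise or column-wise. This is an illustration under assumptions, not Peloton's implementation: `TileGroup`, `to_columnar`, `column`, and `average` are hypothetical names, and the choice to keep hot data row-oriented while converting cold data to columns is assumed from the diagram labels.

```python
class TileGroup:
    """A batch of tuples stored either row-wise or column-wise."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.layout = "row"
        self.rows = [tuple(r) for r in rows]
        self.cols = None

    def to_columnar(self):
        """Re-lay out the same data as one list per column."""
        if self.layout == "row":
            self.cols = {name: [r[i] for r in self.rows]
                         for i, name in enumerate(self.columns)}
            self.rows = None
            self.layout = "column"

    def column(self, name):
        """Return all values of one column under either layout."""
        if self.layout == "column":
            return self.cols[name]
        i = self.columns.index(name)
        return [r[i] for r in self.rows]


def average(groups, col, pred_col, upper):
    """SELECT AVG(col) ... WHERE pred_col < upper, across all tile groups."""
    picked = [v for g in groups
              for v, p in zip(g.column(col), g.column(pred_col)) if p < upper]
    return sum(picked) / len(picked)


cold = TileGroup("ABCD", [(1, 10, "a", "x"), (2, 20, "b", "y")])
hot = TileGroup("ABCD", [(3, 30, "c", "z")])
cold.to_columnar()   # assumed policy: cold data is adapted for scans
result = average([cold, hot], "B", "C", "c")
```

The analytical query reads only columns B and C of the columnar group, while the hot group keeps whole tuples together for updates that touch several attributes at once.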

LESSONS

Peloton

The Self-Driving Database Management System

• NVM Ready
• Query Compilation
• Vectorized Execution
• Autonomous
• Apache Licensed

Anthony Tomasic

Todd Mowry

Prashanth Menon

Michael Zhang

Lin Ma

Matthew Perron

Dana Van Aken

Yingjun Wu

Ran Xian

Runshen Zhu

Jiexi Lin

Jianhong Li

Ziqi Wang

http://pelotondb.org

Joy Arulraj

@ANDY_PAVLO