what non -volatile memorypavlo/slides/nvm-may2016.pdf · @andy_pavlo what non -volatile memory...
TRANSCRIPT
@ANDY_PAVLO
W H AT NON-VOLATILE MEMORY
M E A N S F O R T H E F U T U R E O F
DATABASE SYSTEMS
Non-Volatile Memory
1 1
• Persistent storage with byte-addressable operations. • Fast read/write latencies. • No difference between random vs. sequential access.
What does NVM mean for DBMSs?
1 2
• Thinking of NVM as just a faster SSD is not interesting. • We want to use NVM as permanent storage for the
database, but this has major implications. –Operating System Support –Cloud Provider Provisioning –Database Management System Architectures
Existing Systems
NVM-Only Storage
Hybrid DBMS
1 3
Existing Systems
NVM-Only Storage
Hybrid DBMS
Chapter I – Existing Systems
1 4
• Investigate how existing systems perform with NVM for write-heavy transaction processing (OLTP) workloads.
• Evaluate two types of DBMS architectures. –Disk-oriented (MySQL) – In-Memory (H-Store)
A PROLEGOMENON ON OLTP DATABASE SYSTEMS FOR NON-VOLATILE MEMORY ADMS@VLDB 2015
1 5
DISK-ORIENTED Buffer Pool
Table Heap
Log Snapshots
IN-MEMORY Table Heap
Log Snapshots
Intel Labs NVM Emulator
1 6
• Instrumented motherboard that slows down access to the memory controller with tunable latencies.
• Special assembly to emulate upcoming Xeon instructions for flushing cache lines.
STORE STORE L1 Cache
L2 Cache
PCOMMIT
Experimental Evaluation
1 7
• Compare architectures on Intel Labs NVM emulator. • Yahoo! Cloud Serving Benchmark:
– 10 million records (~10GB) –8x database / memory –Variable skew
YCSB //
1 8
0
50,000
100,000
150,000
200,000MySQL H-Store
Read-Only Workload 2x Latency Relative to DRAM
SKEW AMOUNT (HIGH→LOW) TXN/SEC
YCSB //
1 9
0
10,000
20,000
30,000
40,000
TXN/SEC
MySQL H-Store
50% Reads / 50% Writes Workload 2x Latency Relative to DRAM
SKEW AMOUNT (HIGH→LOW)
8x Latency
LESS
ONS
2 0
Logging is a major performance bottleneck.
2
1 NVM Latency does not have a large impact.
Legacy DBMSs are not prepared to run on NVM.
3
Chapter II – NVM-only Storage • Evaluate storage and recovery methods for a system
that only has NVM. • Testbed DBMS with a pluggable storage engines. • We had to build our own NVM-aware memory allocator.
LET'S TALK ABOUT STORAGE & RECOVERY METHODS FOR NON-VOLATILE MEMORY DATABASE SYSTEMS SIGMOD 2015
2 3
Copy-on-Write Table Heap No Logging
Log-Structured No Table Heap
Log-only Storage
DBMS Architectures
2 4
In-Place Table Heap
Log + Snapshots
Copy-on-Write Table Heap No Logging
Log-Structured No Table Heap
Log-only Storage
In-Place Engine
2 5
Table Heap
Log Snapshots 1
2
3
UPDATE table SET val=ABC WHERE id=123
Delta Record
New Tuple
New Tuple
NVM
NVM-Optimized Architectures
2 6
• Use non-volatile pointers to only record what changed rather than how it changed.
• Be careful about how & when things get flushed from CPU caches to NVM.
NVM-Aware In-Place Engine
2 7
Table Heap
Log 1
2
Tuple Pointers
New Tuple Log Record
TxnId Pointer
UPDATE table SET val=ABC WHERE id=123
Evaluation
2 8
• Testbed system using the Intel NVM hardware emulator. • Yahoo! Cloud Serving Benchmark
–2 million records + 1 million transactions –High-skew setting
YCSB //
2 9
0
400,000
800,000
1,200,000
In-Place Copy-on-Write Log-Structured
NVM-Optimized Traditional
10% Reads / 90% Writes Workload 2x Latency Relative to DRAM
↑63%
↑122% ↑50%
TXN/SEC
YCSB //
3 0
0
100
200
300
400
In-Place Copy-on-Write Log-Structured
NVM-Optimized Traditional
10% Reads / 90% Writes Workload 2x Latency Relative to DRAM
NVM STORES (M)
↓40%
↓25%
↓20%
YCSB //
3 1
NVM-Optimized Traditional
Elapsed time to replay log with varying log sizes 2x Latency Relative to DRAM
RECOVERY TIME (MS)
0.01
0.1
1
10
100
1000
10^3 10^4 10^5 10^3 10^4 10^5 10^3 10^4 10^5 In-Place Copy-on-Write Log-Structured
No Recovery Needed
LESS
ONS
3 2
Avoid block-oriented components.
2
1 Using NVM correctly improves throughput & reduces weadown.
NVM-only systems are 15-20 years away
3
Chapter III – Hybrid DBMS • Design and build a new in-memory DBMS that will be
ready for NVM when it becomes available. • Hybrid Storage + Hybrid Workloads
–DRAM + NVM oriented architecture –Fast Transactions + Real-time Analytics
3 5
Adaptive Storage
3 6
Original Data Adapted Data
SELECT AVG(B) FROM myTable WHERE C < “yyy”
UPDATE myTable SET A = 123, B = 456, C = 789 WHERE D = “xxx”
A B C D
BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND COLUMN-STORES FOR HYBRID WORKLOADS SIGMOD 2016
A B C D
Cold
Hot
A B C D
Anthony Tomasic
Todd Mowry
Prashanth Menon
Michael Zhang
Lin Ma
Matthew Perron
Dana Van Aken
Yingjun Wu
Ran Xian
Runshen Zhu
Jiexi Lin
Jianhong Li
Ziqi Wang
http://pelotondb.org
Joy Arulraj