architecting flash for application acceleration and memory

36
c Architecting Flash for Application Acceleration and Memory Efficiency Fellow, SanDisk August 5, 2014 Nisha Talagala

Upload: others

Post on 16-Apr-2022

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Architecting Flash for Application Acceleration and Memory

1 c

Architecting Flash for Application Acceleration and Memory Efficiency

Fellow, SanDisk August 5, 2014

Nisha Talagala

Page 2: Architecting Flash for Application Acceleration and Memory

2

Forward-Looking Statements

During our meeting today we may make forward-looking statements.

Any statement that refers to expectations, projections or other characterizations of future events or circumstances is a forward-looking statement, including those relating to industry trends, future memory technology and product capabilities and performance. Information in this presentation may also include or be based upon information from third parties, which reflects their expectations and projections as of the date of issuance.

Actual results may differ materially from those expressed in these forward-looking statements due to factors detailed under the caption “Risk Factors” and elsewhere in the documents we file from time to time with the SEC, including our annual and quarterly reports.

We undertake no obligation to update these forward-looking statements, which speak only as of the date hereof or as of the date of issuance by a third party, as the case may be.

Page 3: Architecting Flash for Application Acceleration and Memory

3

Non Volatile Memory (NVM) Flash (NAND)

– 100s GB to 10 TB per device

– Trends - Higher capacity, lower cost/GB, lower write cycles, SLC->MLC->3BPC

– 100K to millions of IOPS, GB/s of bandwidth

PCM/MRAM/STT/Other NVMs

– Still in research

Page 4: Architecting Flash for Application Acceleration and Memory

4

Why Flash? Perfect match for

Databases

Low latency – good for low queue I/O

Handles mixed sequential and random I/O at various block sizes much better than disk

▸ Capacity ▸ IOPS ▸ Cost/IOP

4TB 3TB 15 200,000 $$$$ ¢¢¢¢

Page 5: Architecting Flash for Application Acceleration and Memory

5

Evolution of Flash Usage FLASH + DISK FLASH AS DISK FLASH BEYOND DISK FLASH AS MEMORY

Increased flash awareness

Better Transactions/$/Watt

Page 6: Architecting Flash for Application Acceleration and Memory

6

Flash Beyond Disk: Not Just Fast Disk Area Hard Disk Drives Flash Devices

Read/Write Performance Largely symmetrical Heavily asymmetrical. Additional operation (erase)

Sequential vs Random Performance 100x difference. Elevator scheduling for disk arm

<10x difference. No disk arm – NAND die

Mapping and Background ops Rare Regular occurrence – like a log structured file system

Wear out Largely unlimited Limited writes

IOPS 100s to 1,000s 100Ks to Millions

Latency 10s ms 10s-100s us

Page 7: Architecting Flash for Application Acceleration and Memory

7

Flash-Awareness: Opportunities in the I/O Stack

Flash devices –I/O and Primitives (Atomics, PTRIM, etc) - Memory, Persistent Memory

File System (XFS, Ext4, Btrfs, NVMFS)

Apps

APIs, Libraries

Transparent Optimization

Flash-Aware Optimization

Page 8: Architecting Flash for Application Acceleration and Memory

8

Examples covered in this talk - MySQL - MongoDB - LevelDB Gigaspaces with ZetaScale™ Software – See OPEN Session 301-A: Flash Changes the Game in Application Performance and TCO (Enterprise Applications Track)

Page 9: Architecting Flash for Application Acceleration and Memory

9

Example: MySQL

Page 10: Architecting Flash for Application Acceleration and Memory

10

Flash as Disk: Tune for a Really Fast Disk Has been going on now for

several years with great results

Targeted data placement, NOOP scheduler, seek-less optimizations, concurrency improvements, etc.

Make the block I/O stack faster

Find the fastest file-system

Page 11: Architecting Flash for Application Acceleration and Memory

11

Multi-Instance: Using all the IOPS

Instances 1 2 4

Thro

ughp

ut, N

OT/1

0sec

0

4000

6000

8000

10000

12000

Fusion-io, 48 threads, 2400W – 64GB BP

4810

8788

11952

2000

Page 12: Architecting Flash for Application Acceleration and Memory

12

Atomic Writes

Traditional MySQL Writes MySQL with Atomic Writes

Page C Page B

Page A

Buffer

DRAM Buffer

SSD (or HDD) Database

Database Server

Page C Page B Page A

Page C Page B Page A

Page C Page B Page A

Application initiates updates to pages A, B, and C.

1

MySQL copies updated pages to memory buffer.

2

MySQL writes to double-write buffer on the media.

3

Once step 3 is acknowledged, MySQL writes the updates to the actual tablespace.

4

ioMemory Database

Page C Page B Page A

DRAM Buffer

Page C Page B Page A

Application initiates updates to pages A, B, and C.

1

MySQL copies updated pages to memory buffer.

2

MySQL writes to actual tablespace, bypassing the double-write buffer step due to inherent atomicity guaranteed by the (intelligent) device.

3

Database Server

Page C Page B

Page A

Page 13: Architecting Flash for Application Acceleration and Memory

13

Performance with Atomic Writes

Double-write disabled – Non-ACID

Double-write – ACID

Without Atomic Writes With Atomic I/O

Twice the performance AND with ACID properties

Atomic writes – ACID

• Atomic writes at 99% of the performance of raw writes • 2x flash device endurance improvement

Page 14: Architecting Flash for Application Acceleration and Memory

14

0

20

40

60

80

100

120

140

160

180

200

111

122

133

144

155

166

177

188

199

111

0112

1113

2114

3115

4116

5117

6118

7119

8120

9122

0123

1124

2125

3126

4127

5128

6129

7130

8131

9133

0134

1135

21

Mill

isec

onds

Seconds

Sysbench 99% Latency OLTP workload

XFS DoubleWriteDirectFS Atomic

Atomic Writes: Latency Improvement 2-4x Latency Improvement on Percona Server

Atomic Writes

Page 15: Architecting Flash for Application Acceleration and Memory

15

NVM-Compression Uses natural thin-provisioning of the

underlying flash

No bit packing at MySQL, no rebalancing

Holes in data files are TRIMed/unmapped

Multi-threaded flush and atomics to reduce latency

Pluggable compression algorithms

ioMemory-VSL

NVMFS

MySQL

Page 16: Architecting Flash for Application Acceleration and Memory

16

Performance Improvement (LinkBench)

100%

20%

90%

0%

20%

40%

60%

80%

100%

Uncompressed Row Compression NVM-Compression

Transaction Rate Compression Performance Penalty

Page 17: Architecting Flash for Application Acceleration and Memory

17

Performance Improvement: (TPCC-like)

0

5000

10000

15000

20000

25000

30000

Tim

e13

026

039

052

065

078

091

010

4011

7013

0014

3015

6016

9018

2019

5020

8022

1023

4024

7026

0027

3028

6029

9031

2032

5033

8035

10

New Order TX

TPC-C like workload MariaDB 10

1,000 warehouses - 75GB DRAM

MySQL uncompressed

MySQL compression

Fusion-io Compression

Time

Page 18: Architecting Flash for Application Acceleration and Memory

18

Capacity Efficiency and Endurance

Row-comp Page-comp49.0% 58.5%

44.0%46.0%48.0%50.0%52.0%54.0%56.0%58.0%60.0%

% im

prov

emen

t

Vs. Uncompressed *

*For LinkBench with lz77. Comparable results with lz4.

• Architecturally more storage efficient than row compression

• When combined with

Atomic Writes, can improve flash endurance ~4x

Page 19: Architecting Flash for Application Acceleration and Memory

19

Flash Aware API Integration for Database

Page 20: Architecting Flash for Application Acceleration and Memory

20

Example: MongoDB

Page 21: Architecting Flash for Application Acceleration and Memory

21

Reducing DRAM with Flash MongoDB Throughput (operations/second)

0

2000

4000

6000

8000

24 GB Node 12 GB Node 8 GB Node

-26% -33%

2x DRAM Reduction 3x DRAM Reduction

Page 22: Architecting Flash for Application Acceleration and Memory

22

Improving Performance Predictability while reducing DRAM

12GB 2x DRAM Reduction

8GB 3x DRAM Reduction

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

Improved Predictability with Less DRAM

24G

12G

8G

24GB Baseline

Page 23: Architecting Flash for Application Acceleration and Memory

23

Improving Performance Predictability while reducing DRAM

Throughput Deviation (operations/second)

0

100

200

300

400

24 GB Node 12 GB Node 8 GB Node

2x DRAM Reduction 3x DRAM Reduction

3x Improvement

Page 24: Architecting Flash for Application Acceleration and Memory

24

Improving Performance Predictability with Less DRAM Latency Deviation (microsecond)

0

3

6

9

12

Update Latency Deviation Read Latency Deviation

2x DRAM Reduction 3x DRAM Reduction

Tighter consistency even with 3x less DRAM

Page 25: Architecting Flash for Application Acceleration and Memory

25

Example: LevelDB

Page 26: Architecting Flash for Application Acceleration and Memory

26

LevelDB, NVMKV, and Raw Block Device

Fusion-io flash aware stack can bring full performance of flash to KV stores

Page 27: Architecting Flash for Application Acceleration and Memory

27

Improving endurance and lifetime with Flash Awareness

Fusion-io Confidential

RW: Random Async Write RW-S: Random Sync Write SW: Sequential Async Write SW-S: Sequential Sync Write

Improving endurance by 2x-40x with flash aware key value storage

Page 28: Architecting Flash for Application Acceleration and Memory

28

Performance Predictability

Page 29: Architecting Flash for Application Acceleration and Memory

29

Transaction throughput (YCSB)

Throughput improvements of 50%-300% while using less DRAM

Page 30: Architecting Flash for Application Acceleration and Memory

30

Availability Atomic Writes

– MariaDB mainline >= 5.5.31 – Percona Server >= 5.5.31 – Oracle MySQL >= 5.7.4

NVM Compression – MariaDB – 10.0.9 – Percona Server – 5.6 – Oracle MySQL – labs release (http://labs.mysql.com/)

NVMFS available early access MongoDB – works with existing MongoDB releases NVMKV – available at opennvm.github.io

Page 31: Architecting Flash for Application Acceleration and Memory

31

https://opennvm.github.io

http://www.opencompute.org/projects/storage

Page 32: Architecting Flash for Application Acceleration and Memory

32

Thank You

Page 33: Architecting Flash for Application Acceleration and Memory

33

Appendix

Page 34: Architecting Flash for Application Acceleration and Memory

34

File System Support

POSIX Interface Functionality

fallocate(offset, len) Preallocation and extending files/table space

fallocate(PUNCH_HOLE) Unmap/Punch hole operation. (issue Persistent TRIMs)

io_submit() AIO Transparent Atomic writes

34

NVM-Compression uses POSIX compliant Interfaces

NVM-Compression speed from NVMFS file system

Page 35: Architecting Flash for Application Acceleration and Memory

35

NVMFS Non Volatile Memory FileSystem A POSIX compliant filesystem, designed by Fusion-io

Strengths

– Efficient pre-allocation of large files – Pre-allocation of large files is efficient – No fragmentation/degradation with aging of filesystem – Near raw flash speeds for I/O, atomic writes and TRIM

Page 36: Architecting Flash for Application Acceleration and Memory

36

©2014 SanDisk Corporation. All rights reserved. SanDisk is a trademark of SanDisk Corporation, registered in the United States and other countries. ZetaScale is a trademark of SanDisk Enterprise IP LLC. Fusion-IO, ioDrive and ioDrive Duo are trademarks of Fusion-IO, Inc. Other brand names mentioned herein are for identification purposes only and may be the trademarks of their holder(s).