![Page 1: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/1.jpg)
Fusion-io Confidential—Copyright © 2013 Fusion-io, Inc. All rights reserved. Copyright © 2013 Fusion-io, Inc. All rights reserved.
NVM Software Interfaces New Directions
Nisha Talagala
![Page 2: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/2.jpg)
Evolution of Flash Adoption
February 4, 2013 SNIA NVM Summit 2
F L A S H + D I S K
![Page 3: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/3.jpg)
Evolution of Flash Adoption
February 4, 2013 SNIA NVM Summit 3
F L A S H + D I S K
F L A S H A S D I S K
![Page 4: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/4.jpg)
Evolution of Flash Adoption
February 4, 2013 SNIA NVM Summit 4
F L A S H A S M E M O R Y
F L A S H + D I S K
F L A S H A S D I S K
![Page 5: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/5.jpg)
Evolution of Flash Adoption
February 4, 2013 SNIA NVM Summit 5
![Page 6: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/6.jpg)
Conventional I/O Access
February 4, 2013 SNIA NVM Summit 6
APPLICATION
Application source code
Simple Block
Network File
Simple Block
Traditional Storage
Proprietary Storage OS
Storage Media
Native Flash Translation Layer
Storage Media
Software Defined Storage
Conventional I/O access
![Page 7: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/7.jpg)
Direct-Access I/O through Native Interfaces
February 4, 2013 SNIA NVM Summit 7
APPLICATION
Application source code
Transactional Block
Native File
Key-Value Object
Simple Block
Network File
Simple Block
Traditional Storage
Proprietary Storage OS
Storage Media
Native Flash Translation Layer
Storage Media
Software Defined Storage
Conventional I/O access
Direct access I/O
![Page 8: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/8.jpg)
Memory-access through Native Interfaces
February 4, 2013 SNIA NVM Summit 8
APPLICATION
Application source code
High-speed Log
Checkpointed Memory
Auto-Commit
Memory™ Transactional Block
Native File
Key-Value Object
Simple Block
Network File
Simple Block
Traditional Storage
Proprietary Storage OS
Storage Media
Native Flash Translation Layer
Storage Media
Software Defined Storage
Conventional I/O access
Memory access
Direct access I/O
![Page 9: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/9.jpg)
New Primitives for a New Type of Media
February 4, 2013 SNIA NVM Summit 9
Open, read, write, rewind, close.
Open, read, write, seek, close. Open, read, write, seek, close. Open, read, write, seek, close. Plus, new primitives to exploit characteristics of non-volatile memory
Basic write + atomic write, conditional write. Basic write + TTL expiry for auto-deletion. Basic mmap + crash-safety, versioning.
Tape
Disk
ioMemory NVM
SSD
![Page 10: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/10.jpg)
ATOMIC I/O Primitives: Sample Uses and Benefits
February 4, 2013 SNIA NVM Summit 10
Databases Transactional Atomicity: Replace various workarounds implemented in database code to provide write atomicity (double-buffered writes, etc.) Filesystems File Update Atomicity: Replace various workarounds implemented in filesystem code to provide file/directory update atomicity (journaling, etc.)
▸ 99% performance of raw writes Smarter media now natively understands atomic updates, with no additional metadata overhead.
▸ 2x longer flash media life Atomic Writes increase the life of flash media up to 2x due to reduction in write-ahead-logging and double-write buffering.
▸ 50% less code in key modules Atomic operations dramatically reduce application logic, such as journaling, built as work-arounds.
![Page 11: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/11.jpg)
MySQL with Atomic Writes
Double-write disabled – Non-ACID
Double-write – ACID
XFS directFS with Atomic I/O
Twice the performance of ACID transactions
Atomic writes – ACID
Fusion-io Confidential 11
![Page 12: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/12.jpg)
Native File System Access
February 4, 2013 SNIA NVM Summit 12
▸ Leverages native flash capabilities for file system acceleration
▸ File services layer ▸ Consumes and uses native
primitives ▸ Exports primitives for use by
applications ▸ Applications can run on either
devices or filesystems
ioMemory™ with direct access I/O
ioMemory™ with memory semantics
Application Application
User-defined I/O API Libraries
User-defined Memory API Libraries
Direct-access I/O API Libraries
Memory Semantics API Libraries
directFS – NVM filesystem
directFS – NVM filesystem
I/O Primitives Memory Primitives
VSL™ VSL™
Read/Write Read/Write CPU Load/Store
Native Access
![Page 13: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/13.jpg)
directFS
February 4, 2013 SNIA NVM Summit 13
▸ Appears as any other file system in Linux ▸ Applications can use directFS file system unmodified
with performance benefits ▸ Focuses only on file namespace ▸ Employs virtualized flash storage layer’s logic for:
• Large virtualized addressed space • Direct flash access • Crash recovery mechanisms
▸ Exports Primitives through file namespace ▸ Applications can use primitives through directFS or directly to
device
![Page 14: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/14.jpg)
directFS: Native File Name Space for NVM
February 4, 2013 SNIA NVM Summit 14
Application
VFS (virtual file system) abstraction layer
ioMemory VSL– Dynamic provisioning, Block allocation, logging etc.
ext3 btrfs xfs directFS
Kernel block layer
kernel-space
user-space
![Page 15: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/15.jpg)
directFS – Eliminating duplicate logic
February 4, 2013 SNIA NVM Summit 15
Application
VFS (virtual file system) abstraction layer
ioMemory VSL block allocation, mapping, recycling
ACID updates, logging/journaling, crash-recovery,
directFS file metadata mgmt
Kernel block layer
Block Availability:
exists()
ACID Updates:
atomic write()
Block Grooming: PTRIM()
kernel-space
user-space
FS file metadata mgmt,
block allocation, mapping, recycling, ACID updates, logging/journaling, crash-recovery,
![Page 16: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/16.jpg)
Performance PARITY BETWEEN BLOCK AND DIRECTFS: micro-benchmarks
February 4, 2013 16
2 Thread
I/O Size
Seq Read, Block
Seq Read, directFS
Rand Read, Block
Rand Read, directFS
Seq Write, Block
Seq Write, directFS
Rand Write, Block
Rand Write, directFS
Ban
dwid
th (M
iB/s
)
![Page 17: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/17.jpg)
Performance PARITY BETWEEN BLOCK AND DIRECTFS: atomic writes
February 4, 2013 17
Ban
dwid
th (M
iB/s
)
I/O Size B
andw
idth
(MiB
/s)
I/O Size Block directFS
1 Thread 8 Threads
![Page 18: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/18.jpg)
MySQL on directFS with Atomic Writes
Double-write disabled – Non-ACID
Double-write – ACID
XFS directFS with Atomic I/O
Twice the performance of ACID transactions
Atomic writes – ACID
Fusion-io Confidential 18
![Page 19: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/19.jpg)
Key-value store API Library: Sample Uses and Benefits
February 4, 2013 SNIA NVM Summit 19
NoSQL Applications Reduce overprovisioning due to lack of coordination between two-layers of garbage collection (application-layer and flash-layer). Some top NoSQL applications recommend over-provisioning by 3x due to this. Reduce application I/Os through batched put and get operations. Increase performance by eliminating packing and unpacking blocks, defragmentation, and duplicate metadata at app layer.
▸ 95% performance of raw device Smarter media now natively understands a key-value I/O interface with lock-free updates, crash recovery, and no additional metadata overhead.
▸ Up to 3x capacity increase
Dramatically reduces over-provisioning with coordinated garbage collection and automated key expiry.
▸ 3x throughput on same SSD Early benchmarks comparing against memcached with BerkeleyDB persistence show up to 3x improvement.
![Page 20: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/20.jpg)
Key-Value Store API library Benchmarks
February 4, 2013 SNIA NVM Summit 20
![Page 21: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/21.jpg)
Range of memory-Access Semantics
February 4, 2013 SNIA NVM Summit 21
Extended Memory Volatile
Transparently extends DRAM onto flash, extending application virtual memory
Checkpointed Memory
Volatile with non-volatile checkpoints
Region of application virtual memory with ability to preserve snapshots to flash namespace
Auto-Commit Memory™ Non-volatile
Region of application memory automatically persisted to non-volatile memory and recoverable post-system failure
![Page 22: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/22.jpg)
OS Swap vs. Extended Memory
February 4, 2013 SNIA NVM Summit 22
Originally designed as a last resort to prevent OOM (out-of-memory) failures Never tuned for high-performance demand-paging Never tuned for multi-threaded apps Poor performance, ex. < 30 MB/sec throughput No application code changes required Designed to migrate hot pages to DRAM and cold pages to ioMemory Tuned to run natively on flash (leverages native characteristics) Tuned for multi-threaded apps 10-15x throughput improvement over standard OS Swap
Non-Volatile Storage (Disks, SSDs, etc.)
System Memory
OS SWAP Mechanism
ioMemory
System Memory
Extended Memory Mechanism
![Page 23: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/23.jpg)
Comparing I/O and Memory Access Semantics
February 4, 2013 SNIA NVM Summit 23
I/O I/O semantics examples:
• Open file descriptor – open(), read(), write(), seek(), close() • (New) Write multiple data blocks atomically, nvm_vectored_write() • (New) Open key-value store – nvm_kv_open(), kv_put(), kv_get(), kv_batch_*()
Memory Access
(Volatile)
Volatile memory semantics example: • Allocate virtual memory, e.g. malloc() • memcpy/pointer dereference writes (or reads) to memory address • (Improved) Page-faulting transparently loads data from NVM into memory
Memory Access
(Non-
Volatile)
Non-volatile memory semantics example: • (New) Allocate and map Auto-Commit Memory™ (ACM) virtual memory pages • memcpy/pointer dereference writes (or reads) to memory address • (New) Call checkpoint() to create application-consistent ACM page snapshots • (New) After system failure, remap ACM snapshot pages to recover memory state • (New) De-stage completed ACM pages to NVM namespace • (New) Remap and access ACM pages from NVM namespace at any time
![Page 24: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/24.jpg)
Application Use of Memory-Access Semantics
February 4, 2013 SNIA NVM Summit 24
Legacy SSDs ioMemory as ioDrive
ioMemory as Transparent Cache
ioMemory with direct access I/O
ioMemory with memory semantics
Appl
icat
ion
Application
Appl
icat
ion
Application Application Application Application
Open Source Extensions
Open Source Extensions
OS Block I/O OS Block I/O OS Block I/O
dire
ct I
/O
Prim
itive
s
dire
ct
Key-
Valu
e St
ore
API
dire
ct C
ache
AP
I
Exte
nded
M
emor
y
Chec
k-po
inte
d M
emor
y
Auto
-Com
mit
Mem
ory
Host
Host
File System File System File System directFS – native file
system service
directFS
VSL
Block Layer Block Layer Block Layer
SAS/SATA
Network VSL Virtual Storage
Layer
directCache VSL VSL
Rem
ote RAID Controller
VSL Flash Layer Read/Write Read/Write Read/Write Read/Write Read/Write Load/Store
1. Application source uses memory programming semantics, such as malloc(), free(), pointer ops, etc.
2. Stack can exhibit different properties ranging from purely volatile (DRAM extension), to intermediate points of persistence (checkpoints), to fine grained persistence (ACM)
3. Underlying technology can be block oriented or support direct CPU load/store operations
4. Integrates with existing storage namespaces
![Page 25: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/25.jpg)
Open Interfaces and Open Source
February 4, 2013 SNIA NVM Summit 25
• Primitives: Open Interface
• directFS: Open Source
• API Libraries: Open Source, Open Interface
• Application modifications: Open Source
• INCITS SCSI (T10) active standards proposals: ▸ SBC-4 SPC-5 Atomic-Write
http://www.t10.org/cgi-bin/ac.pl?t=d&f=11-229r6.pdf
▸ SBC-4 SPC-5 Scattered writes, optionally atomic http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-086r3.pdf
▸ SBC-4 SPC-5 Gathered reads, optionally atomic http://www.t10.org/cgi-bin/ac.pl?t=d&f=12-087r3.pdf
• SNIA NVM-Programming TWG active member
![Page 26: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/26.jpg)
f u s i o n i o . c o m | R E D E F I N E W H A T ’ S P O S S I B L E
Q U E S T I O N S ?
![Page 27: NVM Software Interfaces New Directions - Home | Storage Networking](https://reader035.vdocuments.mx/reader035/viewer/2022071601/613d50a8736caf36b75bde49/html5/thumbnails/27.jpg)
f u s i o n i o . c o m | R E D E F I N E W H A T ’ S P O S S I B L E
T H A N K Y O U