2016 Storage Developer Conference. © IBM. All Rights Reserved.
NVMe virtualization ideas for Machines on Cloud
Sangeeth Keeriyadath, IBM
@ksangeek
day2dayunix.sangeek.com
Disclaimer
All of the contents of this presentation are based on my understanding of NVM Express® and my academic interests to explore the possibilities of virtualizing NVMe. The contents in this presentation do not necessarily represent IBM’s positions, strategies or opinions.
Topics
• NVMe primer
• Past storage virtualization learnings
• Virtualization considerations
• NVMe virtualization approaches
  • Blind Mode : SCSI to NVMe translation
  • Virtual Mode : pure NVMe virtual stack
  • Physical Mode : SR-IOV
• NVMe over Fabrics virtualization
• Other deliberations
NVMe primer [1] [2]
NVM Express® (NVMe) is an inherently parallel and high-performing interface and command set designed for non-volatile memory based storage.
• The interface provides optimized command submission and completion paths resulting in:
  • Lower latency
  • Increased bandwidth
• Simple command set
• Support for parallel operation (no lock contention) by supporting up to 65,535 I/O Queues with up to 64K outstanding commands per I/O Queue
• Complements the application parallelism of multi-core CPU systems: a command Submission/Completion Queue (SQ/CQ) pair can be bound to a core, process or thread
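To illustrate why binding a queue pair to each core avoids lock contention, here is a minimal sketch in Python. It is a toy model, not an NVMe driver: `QueuePair`, `create_queue_pairs` and the "SUCCESS" status string are hypothetical names introduced for this example, and the two limits are simply the figures quoted in the primer above.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_IO_QUEUES = 65_535        # upper bound on I/O queues quoted in the primer
MAX_QUEUE_DEPTH = 64 * 1024   # upper bound on outstanding commands per queue


@dataclass
class QueuePair:
    """Toy submission/completion queue pair owned by exactly one core."""
    qid: int
    depth: int
    sq: deque = field(default_factory=deque)
    cq: deque = field(default_factory=deque)

    def submit(self, command: str) -> bool:
        # Refuse the command when the submission queue is already full.
        if len(self.sq) >= self.depth:
            return False
        self.sq.append(command)
        return True

    def process(self) -> None:
        # Stand-in for the controller: drain the SQ and post one completion each.
        while self.sq:
            self.cq.append((self.sq.popleft(), "SUCCESS"))


def create_queue_pairs(num_cores: int, depth: int) -> dict[int, QueuePair]:
    """Give every core its own queue pair so no two cores share a queue."""
    if not 1 <= num_cores <= MAX_IO_QUEUES:
        raise ValueError("number of I/O queues out of range")
    if not 1 <= depth <= MAX_QUEUE_DEPTH:
        raise ValueError("queue depth out of range")
    # I/O queue IDs start at 1 here; queue ID 0 is left for the admin queue.
    return {core: QueuePair(qid=core + 1, depth=depth) for core in range(num_cores)}


pairs = create_queue_pairs(num_cores=4, depth=2)
pairs[0].submit("read lba 0")    # core 0 touches only its own queue
pairs[3].submit("write lba 8")   # core 3 touches only its own queue
for pair in pairs.values():
    pair.process()
print(pairs[0].cq, pairs[3].cq)
```

Because each core only ever appends to its own `sq`, no lock is needed between cores in this model; that is the property the primer attributes to per-core SQ/CQ binding.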
NVMe view [3]
[Figure: an NVMe SSD. The NVM Express Controller exposes Namespace A (NSID 1), Namespace B (NSID 2) and Namespace C (NSID 3). Controller management uses the Admin Submission / Completion Queues. I/O uses I/O Submission Queues A, X and Y and I/O Completion Queues A and Z.]
NVMe benefits
• Storage stack reduction
• Designed considering next generation NVM based devices
• Bind interrupts to CPUs
• Interconnect choices :
  • NVMe over PCIe
  • NVMe over Fabric (RDMA, Fibre Channel)
Why virtualize ?
• Key component of cloud computing
• Optimizes utilization of physical resources
• Improves storage utilization
• Simplified storage management - provides a simple and consistent interface to complex functions
• Easier high availability (HA) solutions
Learnings : IBM PowerVM success story
IBM® PowerVM® provides the industrial-strength virtualization solution for IBM Power Systems™ servers and blades that run IBM AIX®, IBM i and Linux workloads. Virtual I/O Server (VIOS) facilitates the sharing of physical I/O resources among guest Virtual Machines (VMs). [4]
• NPIV [5]
  N-Port ID Virtualization (NPIV) is a T11 standard technology for virtualization of Fibre Channel networks. It enables connection of multiple VMs to one physical port of a Fibre Channel adapter. VIOS facilitates storage adapter sharing.
• vSCSI [6]
  Using virtual SCSI, VMs can share disk storage, tape or optical devices that are assigned to the VIOS. VIOS does the storage virtualization, performs SCSI emulation and acts as the SCSI target.
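PowerVM storage virtualization view
[Figure: a Virtual IO Server, a vSCSI VM and a vFC VM running on the Power Hypervisor, connected through virtual adapters labelled vS_C, vS_S, vFC_C and vFC_S. Storage devices are labelled A to H.]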
Why virtualize NVMe ?
• Limited PCIe slots and need to share the benefits amongst VMs (high VM density)
• Use cases :
  • Server side caching of VM data
  • VM boot disk
NVMe, being highly scalable, is well suited for various virtualization exploitations.
NVMe virtualization approaches
• Implementing a SCSI to NVMe translation layer on the Hypervisor (Blind Mode).
• Pure virtual NVMe stack by distributing I/O queues amongst hosted VMs (Virtual Mode).
• SR-IOV based NVMe controllers, one per virtual function (Physical Mode).
Blind Mode : SCSI to NVMe translation
• Motivation
  • A large storage stack ecosystem is built around the SCSI architecture model, protocol and interfaces
  • Preserve software infrastructure investments
• Method
  • Implement a “SCSI to NVMe translator” layer built logically below the operating system SCSI storage stack and above the NVM Express driver
  • Detailed in the “NVM Express: SCSI Translation Reference” document [7]
    • SCSI primary/block commands to NVMe command mapping
    • Common SCSI field translations, e.g. the “PRODUCT IDENTIFICATION” field would be translated to the first 16 bytes of the Model Number (MN) field within the “Identify Controller Data”
    • NVMe “Status Code” to SCSI “Status Code, Sense Key, Additional Sense Code”
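The PRODUCT IDENTIFICATION field translation mentioned above can be sketched as follows. This is a minimal illustration, not the translator from the reference document: the helper names are hypothetical, and the byte offsets (Model Number at bytes 24 to 63 of the Identify Controller data, PRODUCT IDENTIFICATION at bytes 16 to 31 of the standard INQUIRY data) are this sketch's reading of the NVMe and SCSI layouts and should be checked against the specifications.

```python
# Offsets are assumptions of this sketch; verify against the NVMe and SPC specs.
MN_OFFSET, MN_LENGTH = 24, 40                   # Model Number (MN), Identify Controller data
PRODUCT_ID_OFFSET, PRODUCT_ID_LENGTH = 16, 16   # PRODUCT IDENTIFICATION, standard INQUIRY data
INQUIRY_LENGTH = 36                             # minimal standard INQUIRY response size


def product_identification(identify_controller: bytes) -> bytes:
    """Return the first 16 bytes of the Model Number, space padded."""
    model_number = identify_controller[MN_OFFSET:MN_OFFSET + MN_LENGTH]
    return model_number[:PRODUCT_ID_LENGTH].ljust(PRODUCT_ID_LENGTH, b" ")


def build_inquiry_data(identify_controller: bytes) -> bytes:
    """Fill only the PRODUCT IDENTIFICATION field; other fields stay zero here."""
    inquiry = bytearray(INQUIRY_LENGTH)
    end = PRODUCT_ID_OFFSET + PRODUCT_ID_LENGTH
    inquiry[PRODUCT_ID_OFFSET:end] = product_identification(identify_controller)
    return bytes(inquiry)


# Fabricated Identify Controller data with a made-up model name, for illustration only.
identify = bytearray(4096)
identify[MN_OFFSET:MN_OFFSET + MN_LENGTH] = b"EXAMPLE NVME MODEL 1234".ljust(MN_LENGTH, b" ")
inquiry_data = build_inquiry_data(bytes(identify))
print(inquiry_data[PRODUCT_ID_OFFSET:PRODUCT_ID_OFFSET + PRODUCT_ID_LENGTH])
```

A real translator would populate the remaining INQUIRY fields as well; they are left zeroed here to keep the focus on the one field the slide names.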
OS Storage stack : SCSI to NVMe translated
[Figure: storage stack from top to bottom: Application, File System, Logical Volume Manager, SCSI Block disk driver, SCSI to NVMe translator, NVM Express Driver, NVMe SSD.]
Example translations shown in the figure:
• SCSI INQUIRY, REPORT LUNS, READ CAPACITY -> NVMe Identify command
• SCSI READ(6), READ(10), READ(12), READ(16) -> NVMe Read
• NVMe Success Completion -> SCSI GOOD / NO SENSE
• NVMe Invalid Command Opcode -> SCSI CHECK CONDITION / ILLEGAL REQUEST / INVALID COMMAND OPERATION CODE
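The command and status pairs shown for the translator (INQUIRY, REPORT LUNS and READ CAPACITY to Identify; the READ variants to Read; the two status examples) can be sketched as lookup tables. This is a hedged illustration only: the function and table names are hypothetical, and the numeric opcode, status and sense values are not on the slide; they come from this sketch's reading of the SCSI and NVMe specifications.

```python
from typing import NamedTuple

# NVMe opcodes (assumed from the NVMe specification, not from the slide).
NVME_ADMIN_IDENTIFY = 0x06
NVME_IO_READ = 0x02

# SCSI opcode -> (SCSI command name, NVMe queue type, NVMe opcode)
SCSI_TO_NVME = {
    0x12: ("INQUIRY", "admin", NVME_ADMIN_IDENTIFY),
    0xA0: ("REPORT LUNS", "admin", NVME_ADMIN_IDENTIFY),
    0x25: ("READ CAPACITY(10)", "admin", NVME_ADMIN_IDENTIFY),
    0x08: ("READ(6)", "io", NVME_IO_READ),
    0x28: ("READ(10)", "io", NVME_IO_READ),
    0xA8: ("READ(12)", "io", NVME_IO_READ),
    0x88: ("READ(16)", "io", NVME_IO_READ),
}


class ScsiResult(NamedTuple):
    status: int     # SCSI status byte
    sense_key: int  # sense key
    asc: int        # additional sense code


# NVMe generic status code -> SCSI result
NVME_STATUS_TO_SCSI = {
    0x00: ScsiResult(0x00, 0x0, 0x00),  # Success -> GOOD / NO SENSE
    0x01: ScsiResult(0x02, 0x5, 0x20),  # Invalid Command Opcode -> CHECK CONDITION /
                                        # ILLEGAL REQUEST / INVALID COMMAND OPERATION CODE
}


def translate_command(scsi_opcode: int) -> tuple[str, int]:
    """Return (queue type, NVMe opcode) for a supported SCSI opcode."""
    if scsi_opcode not in SCSI_TO_NVME:
        raise ValueError(f"SCSI opcode {scsi_opcode:#04x} is not covered by this sketch")
    _name, queue, nvme_opcode = SCSI_TO_NVME[scsi_opcode]
    return queue, nvme_opcode


def translate_status(nvme_status: int) -> ScsiResult:
    """Map an NVMe status code back to SCSI status, sense key and ASC."""
    if nvme_status not in NVME_STATUS_TO_SCSI:
        raise ValueError(f"NVMe status {nvme_status:#04x} is not covered by this sketch")
    return NVME_STATUS_TO_SCSI[nvme_status]


print(translate_command(0x28))   # READ(10)
print(translate_status(0x01))    # Invalid Command Opcode
```

Only the pairs named on the slide are included; a full translator also has to rewrite the command parameters (LBA, transfer length, and so on), which this sketch leaves out.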
Blind Mode View : NVMe unaware VM
[Figure: VM A and VM B attach to the VIOS, which owns three NVMe SSDs. Labelled components: Translator, vSCSI HOST, Block disk driver, SCSI-NVMe translator, NVMe driver. The virtual adapters carry C and H markers.]
• SCSI to NVMe translation is performed on the storage hypervisor
• The Virtual Machine (VM)’s Operating System (OS) storage stack is unmodified
• Performance consideration : the “storage hypervisor” could use a “hardware accelerated SCSI to NVMe translator”
Virtual Mode : pure NVMe virtual stack
• Motivation
  • Share NVM storage capacity amongst Virtual Machines (VMs)
  • VMs with a pure NVMe stack retain the various NVMe performance characteristics :
    • Low latency
    • Parallelism
  • Easier on-demand scaling (grow/shrink) implementation
  • Suitable for future “NVMe over Fabrics” implementations in data centers
This model is suitable for server side storage virtualization solutions and preserves the NVMe benefits.
Virtual Mode : pure NVMe virtual stack
• Method
To be disclosed during presentation
Virtual Mode View : pure NVMe stack on VM
To be disclosed during presentation
Physical Mode : SR-IOV [8]
• Motivation
  • Single Root I/O Virtualization (SR-IOV) is a specification that allows a PCIe device to appear as multiple separate physical PCIe devices
  • Inherent QoS capabilities and configuration
• Method
  • Partition adapter capability logically into multiple separate PCI functions called “virtual functions”
  • Each “virtual function” could be individually assigned to a Virtual Machine (VM)
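The partition-and-assign method can be pictured with a small bookkeeping model. This is a toy sketch under stated assumptions: it does not touch real PCIe hardware or any hypervisor API, and `PhysicalFunction`, `assign` and `release` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalFunction:
    """Toy model of an SR-IOV adapter split into a fixed number of virtual functions."""
    name: str
    total_vfs: int
    assignments: dict[int, str] = field(default_factory=dict)  # VF index -> VM name

    def assign(self, vm: str) -> int:
        # Hand the lowest free virtual function to the VM; each VF has one owner.
        for vf in range(self.total_vfs):
            if vf not in self.assignments:
                self.assignments[vf] = vm
                return vf
        raise RuntimeError(f"{self.name}: no free virtual function for {vm}")

    def release(self, vf: int) -> None:
        # Return the virtual function to the free pool, e.g. when the VM goes away.
        self.assignments.pop(vf, None)


pf = PhysicalFunction(name="nvme0", total_vfs=2)
vf_a = pf.assign("VM A")
vf_b = pf.assign("VM B")
print(vf_a, vf_b, pf.assignments)
```

The point of the model is the one-to-one ownership: once a virtual function is handed to a VM, no other VM can receive it until it is released.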
NVMe SR-IOV view
[Figure: an NVMe SSD whose NVM Express Controller exposes a Physical Function and two Virtual Functions (VF A and VF B). Above the Hypervisor, VM A and VM B each run an NVMe driver with Admin Queues and I/O Queues.]
NVMe over Fabrics
• Extends NVMe onto data center fabrics such as Ethernet, Fibre Channel and InfiniBand™
  • Distance connectivity to storage systems with NVMe devices
  • Scaling up of the NVMe devices in large solutions
• Two types of fabric transports for NVMe are currently under development :
  • NVMe over Fabrics using RDMA (InfiniBand, RoCE and iWARP)
    • RDMA verbs
  • NVMe over Fabrics using Fibre Channel (FC-NVMe)
    • Backward compatible with Fibre Channel Protocol (FCP) transporting SCSI
NVMe over fabrics view
[Figure: a Server hosting Virtual Machine A, Virtual Machine B and Virtual Machine C connects over an InfiniBand™ fabric, an Ethernet fabric and a Fibre Channel fabric to an NVMe Storage box holding three NVMe SSDs with namespaces NS_X, NS_Y and NS_Z. Both the Server and the Storage box show InfiniBand™, RoCE, iWARP and FC ports.]
NVMe over Fabrics virtualization
• NVMe over Fabrics using FC (NPIV) [9]
• NVMe over Fabrics using RDMA
Other deliberations
• SSD endurance – Drive Writes Per Day (DWPD)
• Virtual Machine migration
• Interrupt driven model or polling
References
[1] NVM Express™ Infrastructure - Exploring Data Center PCIe® Topologies : http://www.nvmexpress.org/wp-content/uploads/NVMe_Infrastructure_final1.pdf
[2] A Comparison of NVMe and AHCI “Don H Walker” : https://sata-io.org/system/files/member-downloads/NVMe%20and%20AHCI_%20_long_.pdf
[3] NVM Express Overview : http://www.nvmexpress.org/nvm-express-overview/
[4] IBM PowerVM Virtual I/O Server overview : http://www.ibm.com/support/knowledgecenter/POWER8/p8hb1/p8hb1_vios_virtualioserveroverview.htm
[5] IBM PowerVM “Virtual Fibre Channel” : http://www.ibm.com/support/knowledgecenter/POWER8/p8hb1/p8hat_vfc.htm
[6] IBM PowerVM “Virtual SCSI” : http://www.ibm.com/support/knowledgecenter/POWER8/p8hb1/p8hb1_vios_concepts_stor.htm
[7] NVM Express: SCSI Translation Reference : http://www.nvmexpress.org/wp-content/uploads/NVM-Express-SCSI-Translation-Reference-1_1-Gold.pdf
[8] I/O Virtualization in Enterprise SSDs : http://www.snia.org/sites/default/files/SDC15_presentations/virt/ZhiminDing_IO_Virtualization_eSSD.pdf
[9] FC-NVMe NVMe over Fabrics “QLOGIC” : http://www.qlogic.com/Resources/Documents/WhitePapers/Adapters/WP_FC-NVMe.pdf
Credits [ IBM Systems ]
• Hemanta Dutta, AIX Storage & IO SW
• Mallesh Lepakshaiah, Virtual IO Server
• Ninad Palsule, AIX Virtual Server Storage
• Sudhir Maddali, AIX Storage Device Drivers
• Venkata Anumula, AIX Storage Device Drivers
Thank You