Linux Foundation Vault 2016
Dan Williams & Tiffany Kasanicky
/dev/pmem0
/dev/ndblk0.0s
Managing Persistent Memory
[Diagram: the Linux persistent memory stack.
System address space (E820, ACPI.NFIT): system memory, PMEM regions, BLK region.
Kernel: LIBNVDIMM bus, LIBNVDIMM regions, PMEM and BLK devices (pmem0, pmem1s, blk1s), BTT (PMEM), BTT (BLK), file systems (ext4, XFS), DAX.
Userspace: ndctl and libndctl (via sysfs and ioctl), IXPDIMM (management stack), PMEM-aware applications, traditional applications.]
Namespaces
# ndctl list --namespaces --type=pmem
{
"dev":"namespace6.0",
"mode":"raw",
"size":33554432,
"uuid":"70a6adce-722e-4ab8-b698-35eaea9750b3",
"blockdev":"pmem6"
}
“Namespace”: persistent memory capacity accessed through a PMEM or BLK disk device
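The listing above is JSON, so it can be consumed by scripts. A minimal sketch in Python, assuming the output has been captured as text; the function name `summarize_namespaces` and the summary fields are illustrative and not part of ndctl. The sample is copied from the slide rather than read from a live system.

```python
import json

# Sample output copied from the slide. In practice this text would come from
# running `ndctl list --namespaces --type=pmem`.
SAMPLE = """
{
  "dev":"namespace6.0",
  "mode":"raw",
  "size":33554432,
  "uuid":"70a6adce-722e-4ab8-b698-35eaea9750b3",
  "blockdev":"pmem6"
}
"""


def summarize_namespaces(text):
    """Return one small summary dict per namespace in the JSON text."""
    data = json.loads(text)
    # Accept either a single object (as on the slide) or a list of objects.
    namespaces = data if isinstance(data, list) else [data]
    summary = []
    for ns in namespaces:
        size = ns.get("size")  # not every listing on the slides has a size
        summary.append({
            "dev": ns["dev"],
            "mode": ns.get("mode"),
            "path": "/dev/" + ns["blockdev"],
            "size_mib": size / 2**20 if size is not None else None,
        })
    return summary


for row in summarize_namespaces(SAMPLE):
    print(row)
```

For the sample, the 33554432-byte namespace works out to 32 MiB, exposed as /dev/pmem6.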
Namespaces
# ndctl list --namespaces --type=blk
{
"dev":"namespace0.0",
"mode":"sector",
"uuid":"5ce6c34a-88b0-469a-86f5-ea8f462a68ca",
"sector_size":4096,
"blockdev":"ndblk0.0s"
}
Why BLK?
Namespace Modes
RAW
• Byte-addressable
• Limited DAX
SECTOR
• Software atomic sector update semantics
• Configurable sector size
• Applicable to PMEM and BLK namespaces
MEMORY
• Enables full DAX (DMA/RDMA/Direct-I/O)
• Only applicable to PMEM namespaces
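The applicability rules on this slide can be written down as a small lookup table. This is only a sketch: the names `MODE_APPLIES_TO`, `FULL_DAX_MODES`, and `modes_for` are hypothetical, and the slide does not say which namespace types raw mode covers, so treating raw as valid for both PMEM and BLK is an assumption.

```python
# Which namespace types each mode applies to, per the slide.
MODE_APPLIES_TO = {
    "raw": {"pmem", "blk"},     # assumption: the slide does not state this
    "sector": {"pmem", "blk"},  # "Applicable to PMEM and BLK namespaces"
    "memory": {"pmem"},         # "Only applicable to PMEM namespaces"
}

# Modes that the slide says enable full DAX (DMA/RDMA/Direct-I/O).
FULL_DAX_MODES = {"memory"}


def modes_for(namespace_type):
    """Return the sorted list of modes usable with a namespace type."""
    if namespace_type not in ("pmem", "blk"):
        raise ValueError(f"unknown namespace type: {namespace_type!r}")
    return sorted(
        mode for mode, types in MODE_APPLIES_TO.items()
        if namespace_type in types
    )


print(modes_for("pmem"))
print(modes_for("blk"))
```

Under these assumptions, a BLK namespace never offers a full-DAX mode, which matches the slide's point that memory mode is PMEM-only.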
“Memory” Mode DAX: Direct I/O
[Diagram, built up over four slides.
Application: a DAX-mapped buffer ‘buf’ (pmem0) and a disk file ‘fd’ (sda).
Core kernel: buf is translated to a struct page, fd to (bdev, sector); both feed a bio.
Block layer + disk driver: the bio becomes an sgl, and the disk driver performs DMA between pmem0 and sda.
The last build adds an X to the diagram.]
“Memory” Mode DAX: Considerations
• struct page array is 64 bytes per 4K page (16GB per 1TB of PMEM)
# ndctl create-namespace --reconfig=namespace9.0 --mode=memory --map=dev --force
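The 16GB-per-1TB figure follows directly from the per-page cost. A short worked check in Python, using binary units (1TB taken as 2^40 bytes); the function name `struct_page_overhead` is illustrative only.

```python
PAGE_SIZE = 4096        # bytes per page (4K)
STRUCT_PAGE_SIZE = 64   # bytes of struct page per page, per the slide

GIB = 1 << 30
TIB = 1 << 40


def struct_page_overhead(capacity_bytes,
                         page_size=PAGE_SIZE,
                         entry_size=STRUCT_PAGE_SIZE):
    """Bytes of struct page array needed to describe capacity_bytes."""
    pages = capacity_bytes // page_size
    return pages * entry_size


overhead = struct_page_overhead(TIB)
print(overhead // GIB, "GiB of struct page per TiB")
print(f"{STRUCT_PAGE_SIZE / PAGE_SIZE:.4%} of capacity")
```

The ratio is 64/4096, about 1.56% of capacity. The `--map=dev` option in the command above selects where that array is stored: on the persistent memory device itself instead of in DRAM.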
ndctl/libndctl < IXPDIMM
ndctl/libndctl: low level generic primitives
IXPDIMM: Coherent / comprehensive management stack
Tiffany Kasanicky
Persistent Memory
[Diagram: access paths to an NVDIMM across user, kernel, and BIOS layers.
Management: management tools, ACPI (NFIT) _DSM, SMBIOS, FW.
Block I/O: application, standard file API (file system) or standard raw device access, NFIT/NVDIMM driver.
Persistent memory: application, standard file API (pmem-aware FS, DAX) or load/store (MMU mappings).]
Components
• Basic Management
End-user provisioning and management via CLI
• Enabling
SFCB/Pegasus CIM provider for remote access and 3rd party integration
C library for programmatic access and abstraction
• Monitoring
Daemon for health monitoring
[Diagram: component layout.
User: cli, cim, api, monitor, core, db, enterprise tools, Pegasus/SFCB, syslog, libndctl.
Kernel: NFIT NVDIMM driver.
BIOS: ACPI (NFIT) _DSM, SMBIOS.]
NVDIMM Management
[Diagram: a CPU with two iMCs and six memory channels (CH 0 to CH 5), each channel with Slot 0 and Slot 1, populated with a mix of DRAM DIMMs and NVDIMMs.]
Instrumentation: FW update, SW versioning, data-at-rest security, FW settings and policies
Performance: Bytes read/written, host and block read/writes
Namespaces: Create/delete/inventory namespaces
Sensors & Settings: Thermal, wear, spare, power, errors
Discovery: DRAM + NVDIMM topology, identifying information, capabilities
Memory Configuration: Volatile and persistent partitions, interleave settings
Discovery
• DRAM/NVDIMM Topology
SMBIOS Type 17 (memory device) data
NVDIMM socket, memory controller, channel population
• Aggregated Memory Resources
• Capabilities
Platform BIOS, NVDIMM, FW, SW
• NVDIMM Information
Identifying – serial number, model number, device ID
Status – manageability, health, security
Provisioning – partitioning, attributes, state
Memory Provisioning
1. Create memory allocation goal
2. Reboot
3. BIOS writes NFIT
4. Driver reads NFIT
5. Create namespace
6. Mount file system
[Diagram: steps 1 to 5 annotated on the stack.
User: management tools.
Kernel: NFIT/NVDIMM driver.
BIOS: _DSM, MRC, NFIT.
NVDIMMs: FW, PM metadata.]
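Steps 5 and 6 of the provisioning flow can be sketched as command lines. The Python below only builds the argument lists and does not run anything. The ndctl invocation mirrors the one shown earlier in the talk; the `mkfs` step and the `-o dax` mount option are assumptions added for illustration and do not appear on the slides, and the function name, device name, and mount point are hypothetical.

```python
import shlex


def provisioning_commands(namespace, blockdev, mountpoint, fstype="ext4"):
    """Return argv lists for 'create namespace' and 'mount file system'.

    Nothing is executed; the caller decides whether and how to run them.
    """
    return [
        # Step 5: reconfigure an existing namespace into memory mode.
        ["ndctl", "create-namespace", f"--reconfig={namespace}",
         "--mode=memory", "--map=dev", "--force"],
        # Assumed extra step: create a file system on the block device.
        [f"mkfs.{fstype}", blockdev],
        # Step 6: mount it, with DAX enabled (assumed mount option).
        ["mount", "-o", "dax", blockdev, mountpoint],
    ]


for argv in provisioning_commands("namespace9.0", "/dev/pmem9", "/mnt/pmem"):
    print(shlex.join(argv))
```

Steps 1 to 4 (allocation goal, reboot, BIOS writing and the driver reading the NFIT) happen through the management tools and firmware, so they are left out of the sketch.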
Diagnostics
• Quick Health Check
• Platform Configuration Check
• Security Check
• FW Consistency and Settings Check
• Persistent Memory Metadata Check
• Address Range Scrub Results
Packages
Component        Package                                      Repository
cli              ixpdimm-cli                                  https://github.com/01org/IXPDIMMSW
cim              libixpdimm-cim                               https://github.com/01org/IXPDIMMSW
core             libixpdimm-core                              https://github.com/01org/IXPDIMMSW
api              libixpdimm-api, libixpdimm-api-devel         https://github.com/01org/IXPDIMMSW
monitor          ixpdimm-monitor                              https://github.com/01org/IXPDIMMSW
cli framework    libintelnvm-cli, libintelnvm-cli-devel       https://github.com/01org/intelnvmclilibrary
i18n framework   libintelnvm-i18n, libintelnvm-i18n-devel     https://github.com/01org/intelnvmi18nlibrary
cim framework    libintelnvm-cim, libintelnvm-cim-devel       https://github.com/01org/intelnvmcimlibrary
Distribution Plan
• Open source 3-clause BSD license
• Hosted on 01.org/github – Intel maintainers
https://01.org/ixpdimm-sw
https://01.org/intel-nvm-cim-library
https://01.org/intel-nvm-cli-library
https://01.org/intel-nvm-i18n-library
• Targeted OS Distributions:
RHEL/Fedora
SLES/openSUSE