isolating namespace and performance in key-value ssds for
TRANSCRIPT
DISCOSLABORATORY
Isolating Namespace and Performance in Key-Value SSDs forMulti-tenant Environments
Donghyun Min and Youngjae KimSogang University, South Korea
1
The 13th ACM Workshop on Hot Topics In Storage and File Systems (HotStorage’ 21, July 27-28)
Key-Value Store (KV-Store)
• Key-Value Store (KV-Store) is a type of NoSQL database.• KV-Store uses simple Key-Value (KV) interface to store/retrieve data.• Host-side KV-Store
• E.g., RocksDB, LevelDB, …
2
Key-Value Solid-State Drive (KVSSD)
• KVSSD runs storage engine of KV-Store on the SSD.
3
Key Value Application
Host KV-Store Storage Engine
File system
Block Layer & Block Device Driver
Block Device(SSD)
Program
Key-Value Solid-State Drive (KVSSD)
• KVSSD runs storage engine of KV-Store on the SSD.
4
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.
Key Value Application
Host KV-Store Storage Engine
File system
Block Layer & Block Device Driver
Block Device(SSD)
Program
Host file system overhead has been
reported[1].
Key-Value Solid-State Drive (KVSSD)
• KVSSD runs storage engine of KV-Store on the SSD.
5
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.
Key Value Application
Host KV-Store Storage Engine
File system
Block Layer & Block Device Driver
Block Device(SSD)
Program
KV Device(KVSSD)
KV Device Driver
High throughputLow latency, WAF, RAF
KV API Library
KV
Host file system overhead has been
reported[1].
Host file system overhead has been
reported[1].
Key-Value Solid-State Drive (KVSSD)
• KVSSD runs storage engine of KV-Store on the SSD.
6
KV Device(KVSSD)
KV Device Driver
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.
High throughputLow latency, WAF, RAF
KV API Library
Key Value Application
Host KV-Store Storage Engine
File system
Block Layer & Block Device Driver
Block Device(SSD)
Program
KV
<Log-Structured Merge-tree (LSM-tree) based KVSSD>
• KVSSD (DATE’ 18),• iLSM-SSD (MASCOTS’ 19),• PinK (USENIX ATC’ 20)
LSM-tree indexer(Append-manner)
Write-optimized
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
7
DRAM
Flash
Lvl. 0 SSTable
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
Value Log
< offset >
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
8
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
Put (key k, value v)
Value Log
< offset >
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
9
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
Put (key k, value v)
Value Log
❶ Value Log Append
< offset >
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
10
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
❷ MemTableUpdate
< key k, value offset > Put (key k, value v)
Value Log
❶ Value Log Append
< offset >
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
11
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush BloomFilter
Meta(Key, Offset)
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
12
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush BloomFilter
Meta(Key, Offset)
Lvl. 1 SSTable
Lvl. 2 SSTable
SSTable
SSTable SSTable
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
13
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush
Lvl. 1 SSTable
SSTable
SSTable
SSTable SSTable
Full
Lvl. 2
MemTable
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
14
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush
Lvl. 1 SSTable SSTable
SSTable SSTable
Full
Overlappedkey range
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
15
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
MemTable
Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush
Lvl. 1 SSTable SSTable
SSTable SSTable
Full
Overlappedkey range
❹1 Read for Compaction
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
16
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
MemTable
Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush
Lvl. 1 SSTable SSTable
SSTable SSTable
Full
Overlappedkey range
❹1 Read for Compaction
NewSSTable
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
17
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
MemTable
Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush
Lvl. 1 SSTable SSTable
SSTable SSTable NewSSTable
Full
Overlappedkey range
❹1 Read for Compaction
NewSSTable
Lvl. 2
❹2 Write for Compaction
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
18
Put (key k, value v)
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
MemTable
Value Log
< offset >
❶ Value Log Append
❷ MemTableUpdate
< key k, value offset >
❸ SSTableFlush
Lvl. 1 SSTable
SSTable SSTable NewSSTable
NewSSTable
❹2 Write for Compaction
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
19
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
MemTable
Value Log
< offset >
Lvl. 1 SSTable
SSTable SSTable NewSSTable
Get (key k)① MemTable
Search
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
20
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0 SSTable
MemTable
Value Log
< offset >
Lvl. 1 SSTable
SSTable SSTable NewSSTable
Get (key k)① MemTable
Search
② SSTableSearch
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
21
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0
MemTable
Value Log
< offset >
Lvl. 1 SSTable
SSTable SSTable NewSSTable
Get (key k)① MemTable
Search
② SSTableSearch BloomFilter
Meta(Key, Offset)
SSTable
BF and key matching
Lvl. 2
Log-Structured Merge-tree (LSM-tree) in KVSSD
• Several KVSSDs[1, 2] are implemented based on key-value separated LSM-tree indexing structure[3].
22
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.[2] PinK: High-speed In-storage Key-Value Store with Bounded Tails, USENIX ATC, 2020.[3] WiscKey: Separating Keys from Values in SSD-Conscious Storage, USENIX FAST, 2016.
DRAM
Flash
Lvl. 0
MemTable
Value Log
< offset >
Lvl. 1 SSTable
SSTable SSTable NewSSTable
Get (key k)① MemTable
Search
② SSTableSearch BloomFilter
Meta(Key, Offset)
SSTable
BF and key matching
< value >
③ Value Retrieval
Lvl. 2
Problems of the current LSM-Tree based KVSSD
• Problem 1: Lack multi-tenancy and namespace isolation support.• Multi-tenancy is an architecture that can host multiple DB instances of tenants on a server.
23
VS.
<Single-tenant vs. Multi-tenant>
Problems of the current LSM-Tree based KVSSD
• Problem 1: Lack multi-tenancy and namespace isolation support.• Multi-tenancy is an architecture that can host multiple DB instances of tenants on a server.
24
Security
Privacy
Performance
VS.
<Single-tenant vs. Multi-tenant>
Each tenant has their own dedicated server[4].
[4] pClock: An Arrival Curve Based Approach for QoS Guarantees in Shared Storage Systems, ACM SIGMETRICS, 2007.
Problems of the current LSM-Tree based KVSSD
• Problem 1: Lack multi-tenancy and namespace isolation support. • To this end, namespace isolation is supported.
25
KV
KVKV
KV KV
KV
KV
KV
Logically separated KV dataShared
storage <Namespace isolation>
Problems of the current LSM-Tree based KVSSD
• Problem 1: Lack multi-tenancy and namespace isolation support.• To this end, namespace isolation is supported.
26
However, current LSM-tree based KVSSDs lack design and implementationfor namespace isolation.
<Namespace isolation>
KV
KVKV
KV KV
KV
KV
KV
Logically separated KV dataShared
storage
Problems of the current LSM-Tree based KVSSD (Cont.)
• Problem 2: Limited per-tenant read performance • Multiple KV data of tenants are still managed by a global-shared single LSM-tree.
27
KV
Upper level
Lower level
<Global-shared LSM-tree>
Problems of the current LSM-Tree based KVSSD (Cont.)
• Problem 2: Limited per-tenant read performance • Multiple KV data of tenants are still managed by a global-shared single LSM-tree.
28<Global-shared LSM-tree>
KV
Upper level
Lower level
Read is too slow.Read is too slow.Read is too slow.
LSM-treesearch order
Problems of the current LSM-Tree based KVSSD (Cont.)
• Problem 2: Limited per-tenant read performance • Multiple KV data of tenants are still managed by a global-shared single LSM-tree.
29
KV
Upper level
Lower level
Read is too slow.Read is too slow.Read is too slow.
Reason 1: Most tenants’ KV data will be indexed at
upper level.LSM-tree
search order
<Global-shared LSM-tree>
Problems of the current LSM-Tree based KVSSD (Cont.)
• Problem 2: Limited per-tenant read performance • Multiple KV data of tenants are still managed by a global-shared single LSM-tree.
30
KV
Upper level
Lower level
Read is too slow.Read is too slow.Read is too slow.
LSM-treesearch order
Reason 2: During LSM-tree search, BF loads are
performed several times.
Reason 1: Most tenants’ KV data will be indexed at
upper level.
<Global-shared LSM-tree>
Problems of the current LSM-Tree based KVSSD (Cont.)
• Problem 2: Limited per-tenant read performance • Multiple KV data of tenants are still managed by a global-shared single LSM-tree.
31<Global-shared single LSM-tree>
KV
Upper level
Lower level
Read is too slow.Read is too slow.Read is too slow.
LSM-treesearch order
Current LSM-tree based KVSSDs have difficulty in providing the promised read performance that storage device can provide.
Reason 2: During LSM-tree search, BF loads are
performed several times.
Reason 1: Most tenants’ KV data will be indexed at
upper level.
Motivation Experiment
• Configuration• iLSM-SSD[1], recent LSM-tree based KVSSD.• Key size: 8B, Value size: 1KB.• # of KV requests issued (per tenant): 1M.
• KV tenant Read Scenarios• (1): When only tenant x’s KV data occupies a LSM-tree.• (2): When LSM-tree is shared by tenant x’s and y’s
own KV data at the same time.
32
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.
Motivation Experiment
• Configuration• iLSM-SSD[1], recent LSM-tree based KVSSD.• Key size: 8B, Value size: 1KB.• # of KV requests issued (per tenant): 1M.
• KV tenant Read Scenarios• (1): When only tenant x’s KV data occupies a LSM-tree.• (2): When LSM-tree is shared by tenant x’s and y’s
own KV data at the same time.
• Result & Analysis (from the tenant x’s perspective)• Response time: (1) << (2).
33
CDF(%)
0
20
40
60
80
100
Latency(us)
1000 2000 3000 4000 5000 6000
<Response time for read>
Long response time
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.
😄 😫
Motivation Experiment
• Configuration• iLSM-SSD[1], recent LSM-tree based KVSSD.• Key size: 8B, Value size: 1KB.• # of KV requests issued (per tenant): 1M.
• KV tenant Read Scenarios• (1): When only tenant x’s KV data occupies a LSM-tree.• (2): When LSM-tree is shared by tenant x’s and y’s
own KV data at the same time.
• Result & Analysis (from the tenant x’s perspective)• Response time: (1) << (2).• Reason 1: 67% of tenant x’s KV data are indexed at L2.• Reason 2: # of BF loads are increased by 77%.
34
CDF(%)
0
20
40
60
80
100
Latency(us)
1000 2000 3000 4000 5000 6000
OneTenantTwoTenants
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
<Response time for read>
Long response time
<Level of accessed data>
[1] iLSM-SSD: An Intelligent LSM-tree Based Key-Value SSD for Data Analytics, MASCOTS, 2019.
2/3 data is indexedat higher level.
😄 😫
😄 😫
Design Goals
• Therefore, we have the following design goals for multi-tenant KVSSD.
(1) Multi-tenant KVSSD supports namespace isolation.
(2) Multi-tenant KVSSD minimizes read performance overhead for performance isolation.
35
Design Goals
• Therefore, we have the following design goals for multi-tenant KVSSD.
(1) Multi-tenant KVSSD supports namespace isolation.
(2) Multi-tenant KVSSD minimizes read performance overhead for performance isolation.
36
We propose a multi-tenant Iso-KVSSD, satisfying these two goals.
Per-namespace dedicated LSM-tree
• Iso-KVSSD employs per-namespace dedicated LSM-tree design.
37
DRAM
Flash
MemTable
OffsetKey
Namespace0x109
Namespace D
SSTableSharedLvl. 0
⋮ ⋮ ⋮ ⋮
SegregatedLvl. 1
Lvl. 2, 3, ..
OffsetKey
Namespace
0xAC20A
0x114890B
0xAF75C
0x1475D
Meta
⋮Flash Memory
Pool
Pooled allocation
0xF23
Namespace A
Get (key k, namespace ns)
LSM-tree DLSM-tree CLSM-tree BLSM-tree A
Namespace B Namespace C Namespace DNamespace A
Put (key k, value v, namespace ns) via NSID region of NVMe command
Per-namespace dedicated LSM-tree
• Iso-KVSSD employs per-namespace dedicated LSM-tree design.
38
DRAM
Flash
MemTable
OffsetKey
Namespace0x109
Namespace D
SSTableSharedLvl. 0
⋮ ⋮ ⋮ ⋮
SegregatedLvl. 1
Lvl. 2, 3, ..
OffsetKey
Namespace
0xAC20A
0x114890B
0xAF75C
0x1475D
Meta
⋮Flash Memory
Pool
Pooled allocation
0xF23
Namespace A
Get (key k, namespace ns)
LSM-tree DLSM-tree CLSM-tree BLSM-tree A
Namespace B Namespace C Namespace DNamespace A
Put (key k, value v, namespace ns) via NSID region of NVMe command
Per-namespace dedicated LSM-tree
• Iso-KVSSD employs per-namespace dedicated LSM-tree design.
39
MemTable
OffsetKey
Namespace0x109
Namespace D
⋮Flash Memory
Pool
0xF23
Namespace A
Get (key k, namespace ns) Put (key k, value v, namespace ns) via NSID region of NVMe command
Pooled allocation
DRAM
FlashSSTableSharedLvl. 0
⋮ ⋮ ⋮ ⋮
SegregatedLvl. 1
Lvl. 2, 3, ..
OffsetKey
Namespace
0xAC20A
0x114890B
0xAF75C
0x1475D
Meta
LSM-tree DLSM-tree CLSM-tree BLSM-tree A
Namespace B Namespace C Namespace DNamespace A
Per-namespace dedicated LSM-tree
• Iso-KVSSD employs per-namespace dedicated LSM-tree design.
40
Only few KV data are indexed at shared regions.
MemTable provisioning per-namespace requires additional DRAM space.
MemTable
OffsetKey
Namespace0x109
Namespace D
⋮Flash Memory
Pool
0xF23
Namespace A
Get (key k, namespace ns) Put (key k, value v, namespace ns) via NSID region of NVMe command
Pooled allocation
DRAM
FlashSSTableSharedLvl. 0
⋮ ⋮ ⋮ ⋮
SegregatedLvl. 1
Lvl. 2, 3, ..
OffsetKey
Namespace
0xAC20A
0x114890B
0xAF75C
0x1475D
Meta
LSM-tree DLSM-tree CLSM-tree BLSM-tree A
Namespace B Namespace C Namespace DNamespace A
Per-namespace dedicated LSM-tree
• Iso-KVSSD employs per-namespace dedicated LSM-tree design.
41
MemTable
OffsetKey
Namespace0x109
Namespace D
⋮Flash Memory
Pool
0xF23
Namespace A
Get (key k, namespace ns) Put (key k, value v, namespace ns) via NSID region of NVMe command
MemTable provisioning per-namespace requires additional DRAM space.
It prevents KV data from being indexed at upper level of LSM-tree.
Pooled allocation
DRAM
FlashSSTableSharedLvl. 0
⋮ ⋮ ⋮ ⋮
SegregatedLvl. 1
Lvl. 2, 3, ..
OffsetKey
Namespace
0xAC20A
0x114890B
0xAF75C
0x1475D
Meta
LSM-tree DLSM-tree CLSM-tree BLSM-tree A
Namespace B Namespace C Namespace DNamespace A
Only few KV data are indexed at shared regions.
Per-namespace dedicated LSM-tree
• Iso-KVSSD employs per-namespace dedicated LSM-tree design.
42
MemTable provisioning per-namespace requires additional DRAM space.
It prevents KV data from being indexed at upper level of LSM-tree.
MemTable
OffsetKey
Namespace0x109
Namespace D
⋮Flash Memory
Pool
0xF23
Namespace A
Get (key k, namespace ns) Put (key k, value v, namespace ns) via NSID region of NVMe command
Pooled allocation
DRAM
FlashSSTableSharedLvl. 0
⋮ ⋮ ⋮ ⋮
SegregatedLvl. 1
Lvl. 2, 3, ..
OffsetKey
Namespace
0xAC20A
0x114890B
0xAF75C
0x1475D
Meta
LSM-tree DLSM-tree CLSM-tree BLSM-tree A
Namespace B Namespace C Namespace DNamespace A
• Iso-KVSSD controls access based on user’s namespace information.• Per-namespace LSM-tree design reduces KV data’s access latency.
Only few KV data are indexed at shared regions.
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
43
SharedLvl. 0
L GD Q
B KP E
⋮ ⋮ ⋮ ⋮
C A Z A BC D
SegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
DRAM
Flash
New SSTableVictim SSTable
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
44
New SSTableVictim SSTable
① Lvl.0 Victim SSTable Load
⋮ ⋮ ⋮ ⋮
CSegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
SharedLvl. 0
L GD Q
B KP E
DRAM
Flash
A Z A BC D
KV data is isolated during existing
background compaction.
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
45
New SSTableVictim SSTable
① Lvl.0 Victim SSTable Load
L GD Q
B KP E
Lvl.0 Victim SSTable
② KeyRange Check
B ~ L G ~ P
D ~ K Q
⋮ ⋮ ⋮ ⋮
CSegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
SharedLvl. 0
L GD Q
B KP E
DRAM
Flash
A Z A BC D
KV data is isolated during existing
background compaction.
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
46
New SSTableVictim SSTable
③ Lvl.1 Victim SSTable Load
C A Z
Lvl.1 Victim SSTable
② KeyRange Check
⋮ ⋮ ⋮ ⋮
CSegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
SharedLvl. 0
L GD Q
B KP E
① Lvl.0 Victim SSTable LoadDRAM
Flash
A Z A BC D
L GD Q
B KP E
Lvl.0 Victim SSTable
B ~ L G ~ P
D ~ K Q
KV data is isolated during existing
background compaction.
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
47
New SSTableVictim SSTable
③ Lvl.1 Victim SSTable Load
① Lvl.0 Victim SSTable Load
BE
CL
A GP Z
D K Q
④Compaction per namespace
⋮ ⋮ ⋮ ⋮
CSegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
SharedLvl. 0
L GD Q
B KP E
DRAM
Flash
A Z A BC D
C A Z
Lvl.1 Victim SSTable
② KeyRange Check
L GD Q
B KP E
Lvl.0 Victim SSTable
B ~ L G ~ P
D ~ K Q
KV data is isolated during existing
background compaction.
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
48
New SSTableVictim SSTable
③ Lvl.1 Victim SSTable Load
① Lvl.0 Victim SSTable Load
④Compaction per namespace
⋮ ⋮ ⋮ ⋮
C B A Z A BA GC DE
QD KSegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
CL P Z
⑤ New SSTableWrite
SharedLvl. 0
L GD Q
B KP E
DRAM
Flash
C A Z
Lvl.1 Victim SSTable
② KeyRange Check
L GD Q
B KP E
Lvl.0 Victim SSTable
B ~ L G ~ P
D ~ K Q
BE
CL
A GP Z
D K QKV data is isolated during existing
background compaction.
Namespace Isolation Mechanism
• Namespace Isolation Mechanism segregates KV data into per-namespace LSM-trees.
49
New SSTableVictim SSTable
③ Lvl.1 Victim SSTable Load
① Lvl.0 Victim SSTable Load
④Compaction per namespace
⋮ ⋮ ⋮ ⋮
C BE
SegregatedLvl. 1
Lvl. 2, 3, ..LSM-tree 4LSM-tree 3LSM-tree 2LSM-tree 1
Namespace 2 Namespace 3 Namespace 4Namespace 1
CL
⑤ New SSTableWrite
SharedLvl. 0
L GD Q
B KP E
DRAM
Flash
C A Z
Lvl.1 Victim SSTable
② KeyRange Check
L GD Q
B KP E
Lvl.0 Victim SSTable
B ~ L G ~ P
D ~ K Q
BE
CL
A GP Z
D K Q
A Z A BA GC D
QD KP Z
Namespace Isolation mechanism substantially segregates KV data without perceivable overhead.
KV data is isolated during existing
background compaction.
Experimental Setup
• Prototyped Iso-KVSSD on FPGA-based Cosmos+ OpenSSD.• 1TB NAND memory, 1GB DDR3 DRAM, ARM Cortex-A9 processors.
• Configuration• Key size: 8B, Value size: 1KB.• # of KV requests issued (per tenant): 1M.
• Workloads• Put() or Get() only synthetic workloads.
• Comparison• Baseline: iLSM-SSD with global-shared LSM-tree.• Iso-KVSSD: iLSM-SSD with per-namespace LSM-tree.
50
Throughput Comparison
51
BaselineIso-KVSSD
Per-TenantThroughput(KIOPS)
0
1000
2000
3000
4000
NumberofKV-tenants
1 2 4 6 8
BaselineIso-KVSSD
Per-TenantThroughput(K
IOPS)
0
100
200
300
400
500
NumberofKV-tenants
1 2 4 6 8
<Throughput Put() only ><Throughput Get() only >
less than 1%overhead
Iso-KVSSD has an average 2.9x higher read throughput than the baseline with negligible write performance overhead.
2.9ximprovement
1.1x
1.9x2.3x
• Level distribution of where KV data is indexed in the LSM-trees.
Impact of Per-namespace LSM-tree: Level Distribution
52
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
#oftenants=1#oftenants=2#oftenants=4#oftenants=6#oftenants=8
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
<Baseline > <Iso-KVSSD>
Per-namespace LSM-tree significantly reduces level depth at which KV data isindexed in the LSM-tree.
• Level distribution of where KV data is indexed in the LSM-trees.
Impact of Per-namespace LSM-tree: Level Distribution
53
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
#oftenants=1#oftenants=2#oftenants=4#oftenants=6#oftenants=8
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
83.8%
26.5%
Up to70.1%(in L2)
<Baseline > <Iso-KVSSD>
Per-namespace LSM-tree significantly reduces level depth at which KV data isindexed in the LSM-tree.
• Level distribution of where KV data is indexed in the LSM-trees.
Impact of Per-namespace LSM-tree: Level Distribution
54
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
#oftenants=1#oftenants=2#oftenants=4#oftenants=6#oftenants=8
CDF(%)
0
20
40
60
80
100
MemT
Lv.0SST
Lv.1SST
Lv.2SST
Lv.3SST
Not reach to L2
<Baseline > <Iso-KVSSD>
Per-namespace LSM-tree significantly reduces level depth at which KV data isindexed in the LSM-tree.
• The number of Bloom filter (BF) loads during LSM-tree search
Impact of Per-namespace LSM-tree: # of Bloom Filter Loads
55
Baseline
Iso-KVSSD
#ofBloomFilterLoad
0
5×106
107
1.5×107
2×107
NumberofKV-tenants
1 2 4 6 8
3.6x fewerBF loads
Per-namespace LSM-tree significantly reduces the number of BF loads duringKV data search process.
Conclusion
• Iso-KVSSD with per-namespace LSM-tree design• Identifies the user’s namespace information for namespace isolation.• Manages the KV data using per-namespace LSM-tree design for performance isolation.
• Provides strict view showing only the KV data corresponding to each user’s namespace.• Offers 2.9x higher per-tenant read throughput and 2.8x lower per-tenant read response time than the
baseline with a global-shared LSM-tree.
56
DISCOSLABORATORY
Isolating Namespace and Performance in Key-Value SSDs forMulti-tenant Environments
Donghyun [email protected]
57