Approximate Indexing with BF-Trees*
A RUM access method
Manos Athanassoulis* (Harvard SEAS), Anastasia Ailamaki (EPFL)
*work done while at EPFL
Tree indexing
… is designed for disks
wide and short trees
… which have large size in order to minimize random accesses
This is not enough!
A brave new (storage) world
memory price varies
read performance varies
update cost varies
RUM: Read vs Update vs Memory
[Figure: triangle with corners Read Optimized, Update Optimized, and Memory/Storage Optimized, with an Access Method positioned among them]
HDD-based access methods
Flash-aware access methods
SSD-based access methods
LA-Tree [PVLDB09], FD-Tree [PVLDB10], μ-Tree [EMSOFT10], SILT [SOSP11], MaSM [SIGMOD11], PIO B-Tree [PVLDB11], Bw-Tree [ICDE13]
Memory Price vs. Reads
[Figure: memory price vs. read performance, ranging from high-performance, expensive memory to low-performance, cheap memory]
rethink indexing!
exchange more reads for lower size
RUM: Read vs Update vs Memory
Approximate Indexing with Bloom filter Tree
Future exploration
let’s see how BF-Tree works
Flash-aware indexing
focuses on internal node organization
lazy updates
immutable data
LA-Tree [PVLDB09], FD-Tree [PVLDB10], μ-Tree [EMSOFT10], SILT [SOSP11], MaSM [SIGMOD11], PIO B-Tree [PVLDB11], Bw-Tree [ICDE13]
what about the fast reads of solid-state storage?
Approximate Tree Indexing
Design choices
use Bloom filters for membership queries per page
tunable tree size
probabilistically tunable random reads
Caveat
works well for datasets with implicit clustering
how common is implicit clustering?
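The per-page membership test named in the design choices can be illustrated with a minimal Bloom filter. This is only a sketch, not the authors' implementation: the class name `BloomFilter`, the SHA-256 double-hashing scheme, and the sizes used below are assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter (illustrative; hashing scheme is an assumption)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive two hash values from one digest and combine them
        # (double hashing) to get num_hashes bit positions.
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key) -> bool:
        # True means "possibly present" (false positives can happen);
        # False means "definitely absent" (no false negatives).
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# One filter per data page: here a page holding the keys 1 and 3.
bf = BloomFilter(num_bits=64, num_hashes=3)
for key in [1, 3]:
    bf.add(key)
```

A smaller `num_bits` shrinks the index but raises the chance that `might_contain` answers "possibly present" for a key the page does not hold, which is the size/read trade-off the slide refers to.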
Implicit Clustering
TPCH data: transaction dates
data organized based on creation time
Electricity consumption – Smart Home Dataset (SHD)
data values correlated with creation time
how can BF-Tree index such data?
Bloom filter Trees
select desired tree size (aka BF per page size)
each partition
has about the same number of unique values
has a partition-wide min and max
has a (variable) number of physical pages
[Figure: partitions Pj-2, Pj-1, Pj, Pj+1 with key ranges [-10,-5), [-5,0), [1,9), [9,11)]
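One way to picture the partitioning described above is a greedy pass over the pages in storage order. The sketch below is an assumption made for illustration: the greedy closing rule, the `Partition` class, and the `target_unique` parameter are hypothetical names, and the slides do not specify how partition boundaries are actually chosen.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    pages: list = field(default_factory=list)   # physical pages, each a list of keys
    unique: set = field(default_factory=set)    # distinct keys seen in the partition

    @property
    def min_key(self):
        return min(self.unique)

    @property
    def max_key(self):
        return max(self.unique)


def partition_pages(pages, target_unique):
    """Group consecutive pages until a partition holds about target_unique keys."""
    partitions = []
    current = Partition()
    for page in pages:
        current.pages.append(page)
        current.unique.update(page)
        if len(current.unique) >= target_unique:
            partitions.append(current)
            current = Partition()
    if current.pages:  # keep a trailing, possibly smaller partition
        partitions.append(current)
    return partitions


# Implicitly clustered data: keys grow with page position, with some repeats.
pages = [[-10, -9], [-8, -6], [-5, -5], [-5, -3], [-2, -1], [1, 2], [3, 5]]
parts = partition_pages(pages, target_unique=4)
summary = [(p.min_key, p.max_key, len(p.pages)) for p in parts]
# summary == [(-10, -6, 2), (-5, -1, 3), (1, 5, 2)]
```

Each partition ends up with four unique values but a different number of pages, because repeated keys fill pages without adding new values.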
for every page of every partition, build a BF with the desired size (and, hence, false positive probability)
build a B+-Tree on top of all partitions using the min/max as keys
[Figure: partitions Pj-2, Pj-1, Pj, Pj+1 with key ranges [-10,-5), [-5,0), [1,9), [9,11), each with one BF per page]
Partition Pj with k pages
All k pages contain values between the partition-wide min and max.
[Figure: partition Pj with min: 1, max: 8; four pages labeled 1,3; 2,5; 3,6; 4,8, each with its own BF]
Search for 3
BF-leaf: k Bloom filters
k pages in the partition Pj
retrieve the candidate pages and search for the desired value
false positives are also possible
[Figure: search for 3 in partition Pj, with a false positive highlighted]
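The search walk-through above can be sketched end to end. Everything here is a simplified stand-in rather than the authors' design: a sorted list with `bisect` replaces the B+-Tree over partition min keys, a deliberately tiny one-hash bitmap (`key % NUM_BITS`) replaces a real Bloom filter so that false positives are easy to reproduce, and the names `BFTreeSketch`, `build_filter`, and `might_contain` are hypothetical.

```python
import bisect

NUM_BITS = 4  # deliberately tiny so that false positives are easy to trigger


def build_filter(page):
    """Toy one-hash filter over integer keys (stand-in for a Bloom filter)."""
    bits = 0
    for key in page:
        bits |= 1 << (key % NUM_BITS)
    return bits


def might_contain(bits, key):
    return bool((bits >> (key % NUM_BITS)) & 1)


class BFTreeSketch:
    def __init__(self, partitions):
        # partitions: non-overlapping, sorted by key range; each is a list of pages
        self.partitions = partitions
        self.mins = [min(min(page) for page in part) for part in partitions]
        self.maxs = [max(max(page) for page in part) for part in partitions]
        self.filters = [[build_filter(page) for page in part] for part in partitions]

    def search(self, key):
        """Return (page indexes in the partition that hold key, pages read)."""
        i = bisect.bisect_right(self.mins, key) - 1  # locate the partition
        if i < 0 or key > self.maxs[i]:
            return [], 0                             # outside every min/max range
        matches, pages_read = [], 0
        for j, bits in enumerate(self.filters[i]):
            if might_contain(bits, key):
                pages_read += 1                      # fetch the candidate page
                if key in self.partitions[i][j]:
                    matches.append(j)                # true hit
                # otherwise the read was wasted on a false positive
        return matches, pages_read


tree = BFTreeSketch([
    [[-10, -7], [-9, -6]],
    [[-5, -2], [-4, -1]],
    [[1, 3], [2, 5], [4, 8], [3, 6]],  # partition Pj from the slides: min 1, max 8
])
hit = tree.search(3)    # ([0, 3], 2): two pages hold 3, two pages read
miss = tree.search(7)   # ([], 2): both reads are false positives
```

With this toy filter, searching for 7 reads two pages and finds nothing, which is the cost a false positive adds; a larger filter would avoid those reads at the price of a bigger index.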
BF-Tree Design
Variable false positive probability (fpp)
BFs have tunable size
tunable size → variable performance
p1 = 0.01%; if the BF size is halved, p2 = 1%
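The two probabilities on this slide are consistent with the standard Bloom filter estimate, under which halving the bits per key turns an fpp of p into the square root of p. The sketch below assumes that textbook formula with an optimal number of hash functions, fpp = exp(-(bits per key) · (ln 2)²); the slides do not state which model was used, and the function names are illustrative.

```python
import math

LN2_SQUARED = math.log(2) ** 2


def bits_per_key(fpp):
    """Bits per key needed for a target fpp (optimal number of hash functions)."""
    return -math.log(fpp) / LN2_SQUARED


def fpp_for_bits(bits):
    """Expected fpp for a given number of bits per key (same assumption)."""
    return math.exp(-bits * LN2_SQUARED)


p1 = 0.0001                    # 0.01%
b1 = bits_per_key(p1)          # about 19.2 bits per key
p2 = fpp_for_bits(b1 / 2)      # half the size gives about 0.01, i.e. 1%
```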
BF-Trees in action
Datasets
1GB synthetic with 256-byte tuples and 8-byte keys
30GB TPCH (SF30)
Smart Home Dataset (SHD)
Workload
Point queries (PK or TPCH date or energy level)
5 storage configurations (index/data)
mem/SSD
mem/HDD
SSD/SSD
SSD/HDD
HDD/HDD
BF-Trees for PK
[Figure: average index probe time for a 1GB relation, varying false positive probability and storage configuration. x-axis: false positive probability, from 1e-15 to 1; y-axis: response time (ms), log scale from 0.1 to 100; series: mem/SSD, mem/HDD, SSD/SSD, SSD/HDD, HDD/HDD, with B+-Tree latency shown for comparison. Annotation: Bigger Tree Size. Tuple size: 256 bytes; key size: 8 bytes]
Data location matters most
Both data/index locations matter
what about the tree size?
BF-Tree vs B+-Tree: Size & Latency
[Figure: average index probe time for a 1GB relation, varying false positive probability and storage configuration. y-axis: response time (ms), log scale from 0.1 to 100; groups: mem/SSD, mem/HDD, SSD/SSD, SSD/HDD, HDD/HDD; solid: B+-Tree, pattern: BF-Tree (best). Annotations: 3.8x smaller size, 12.2x, 19.4x. Tuple size: 256 bytes; key size: 8 bytes]
competitive performance with space savings
BF-Tree for SHD
[Figure: response time (ms), log scale from 0.1 to 100, and capacity gain (axis from 1 to 4) for mem/SSD, mem/HDD, SSD/SSD, SSD/HDD, HDD/HDD; solid: B+-Tree, pattern: BF-Tree (best). Data labels as extracted: 0.25, 6.2, 0.7, 6.2, 22 and 0.31, 6.0, 0.6, 6.2, 16]
TPCH point queries on date
[Figure: BF-Tree response time normalized to B+-Tree (y-axis, 0 to 9) against probe hit rate (0%, 5%, 10%, 25%) for mem/SSD, mem/HDD, SSD/SSD, SSD/HDD, HDD/HDD]
Cardinality: 2k values
BF-Tree is always faster for low hit rate
High hit rate: B+-Tree is faster
Data on HDD → high overhead (unless index is slow)
Index perf. ≈ data perf. → low overhead
Approximate Tree Indexing
tunable size → variable performance
competitive resp. time w/ 4-20x capacity savings
tailored for:
datasets with implicit clustering
workloads with low hit rate
more details in paper:
analytical modeling, range scans
updates, datasets, more comparisons
RUM Tunable Indexing
[Figure: the Read Optimized, Update Optimized, and Memory/Storage Optimized triangle, showing HDD-based, flash-aware, and SSD-based access methods]
http://daslab.seas.harvard.edu/rum-conjecture/
Thanks! Questions?