![Page 1: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/1.jpg)
Code Specialization for Memory Efficient
Hash Tries
Michael Steindorfer, Jurgen Vinju Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
![Page 2: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/2.jpg)
[Bar chart, Map: Generic 100%, Specialized 45%]
![Page 3: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/3.jpg)
[Bar chart, Set: Generic 100%, Specialized 22%]
![Page 4: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/4.jpg)
• memory usage vs runtime
• size of source code or binary
• platform specifics
![Page 5: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/5.jpg)
Hash Tries
![Page 6: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/6.jpg)
Hash Tries
Fast Immutable Data Structures on the JVM
![Page 7: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/7.jpg)
Hash Tries
(Wide) Hash-Prefix Trees with Array Nodes
![Page 8: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/8.jpg)
{32, 2, 4098, 34}
![Page 9: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/9.jpg)
Figure 1 (panels (a) and (b)). Inserting a sequence of numbers in a hash array mapped trie. The small index numbers refer to the position in a sparse array.
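insertion calculates the number's hash code*:
$$\begin{aligned}
\mathrm{hash}(32) &= \ldots\ 000\ 00001\ 00000_2 = 0\ 0\ 0\ 0\ 0\ 1\ 0_{32} \\
\mathrm{hash}(2) &= \ldots\ 000\ 00000\ 00010_2 = 0\ 0\ 0\ 0\ 0\ 0\ 2_{32} \\
\mathrm{hash}(4098) &= \ldots\ 100\ 00000\ 00010_2 = 0\ 0\ 0\ 0\ 4\ 0\ 2_{32} \\
\mathrm{hash}(34) &= \ldots\ 000\ 00001\ 00010_2 = 0\ 0\ 0\ 0\ 0\ 1\ 2_{32}
\end{aligned}$$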
We first separate the hash codes into 5-bit chunks and write each chunk as a decimal value in the range 0 to 31. Then insertion places the values in a 32-ary tree (where each node is encoded as a sparse array), based on the hash code prefixes. The tree structure gets expanded until every prefix can be unambiguously stored.
To continue our example: 32 is inserted at the root node; 2 as well (because they do not share a common prefix). 4098 shares the prefix path →0→2 with value 2, consequently it is placed unambiguously on level 3. 34 shares the prefix path →2 with 2 and 4098, but can be differentiated on level 2 from both.
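The chunking described above can be tried out with a few lines of Java. This is only a sketch: the class and method names (`HashChunks`, `chunk`, `path`) are made up for illustration, and it relies on the text's assumption that integers hash to themselves. Paths are printed root level first, i.e. starting with the least-significant chunk.

```java
import java.util.Arrays;

class HashChunks {
    static final int BITS_PER_CHUNK = 5;                // 2^5 = 32 slots per node
    static final int MASK = (1 << BITS_PER_CHUNK) - 1;  // 0b11111

    /** 5-bit chunk of the hash code that is consumed at the given level (0 = root). */
    static int chunk(int hash, int level) {
        return (hash >>> (level * BITS_PER_CHUNK)) & MASK;
    }

    /** The first `levels` chunks of a hash code, root level first. */
    static int[] path(int hash, int levels) {
        int[] result = new int[levels];
        for (int level = 0; level < levels; level++) {
            result[level] = chunk(hash, level);
        }
        return result;
    }

    public static void main(String[] args) {
        // Identity hash for integers, as assumed in the text.
        for (int key : new int[] {32, 2, 4098, 34}) {
            System.out.println(key + " -> " + Arrays.toString(path(key, 3)));
        }
        // prints:
        // 32 -> [0, 1, 0]
        // 2 -> [2, 0, 0]
        // 4098 -> [2, 0, 4]
        // 34 -> [2, 1, 0]
    }
}
```

The output shows why 2 and 4098 can only be told apart at the third level (chunks 0 versus 4), while 34 already differs from both at the second level.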
Note that a chunk size of 5 bits for 32-bit hash codes results in trees with a maximal depth of $\lceil \log_{32}(2^{32}) \rceil = 7$.
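Spelled out, the depth bound is plain arithmetic on the number of hash bits and the chunk size:

```latex
\[
\left\lceil \log_{32}\!\left(2^{32}\right) \right\rceil
  = \left\lceil \frac{32}{\log_2 32} \right\rceil
  = \left\lceil \frac{32}{5} \right\rceil
  = \lceil 6.4 \rceil
  = 7
\]
```

Six levels consume 30 bits, so the seventh level is indexed by the remaining 2 bits only.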
In contrast, Figure 3 illustrates an array-based hash table. By comparing the visualizations of both data structures we can identify the following list of disadvantages of HAMTs over array-based hash tables (Section ?? provides evidence for our claims):
• Lookup has to follow a tree path of between 1 and 7 nodes to answer containment queries. Memory indirections and indexing into sparse arrays are less efficient than performing a single index-based lookup in a contiguous array.
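To make the cost of such a lookup concrete, here is a minimal sketch of a containment query. It assumes a node layout with an `int bitmap` and a compact `Object[]` that mixes keys and sub-nodes; the class name `TrieLookupSketch`, the `contains` method and the hand-built example trie are illustrative assumptions, and hash-collision handling is left out.

```java
class TrieLookupSketch {
    static final int BITS = 5, MASK = 31;

    /** Generic node: bit i of `bitmap` is set iff logical slot i (0..31) is occupied. */
    static final class TrieNode {
        final int bitmap;
        final Object[] contentAndSubTries;  // keys and sub-nodes, stored without gaps

        TrieNode(int bitmap, Object... content) {
            this.bitmap = bitmap;
            this.contentAndSubTries = content;
        }
    }

    static boolean contains(TrieNode root, Object key) {
        int hash = key.hashCode();
        TrieNode node = root;
        for (int shift = 0; shift < 32; shift += BITS) {
            int bit = 1 << ((hash >>> shift) & MASK);
            if ((node.bitmap & bit) == 0) return false;   // logical slot is empty
            // Index into the compact array = number of occupied slots below this one.
            int index = Integer.bitCount(node.bitmap & (bit - 1));
            Object slot = node.contentAndSubTries[index];
            if (!(slot instanceof TrieNode)) return slot.equals(key);
            node = (TrieNode) slot;                        // one more memory indirection
        }
        return false;  // all hash bits consumed; collision handling omitted in this sketch
    }

    public static void main(String[] args) {
        // Trie for {32, 2, 4098, 34} with identity hashes, built by hand.
        TrieNode level3 = new TrieNode((1 << 0) | (1 << 4), 2, 4098);
        TrieNode level2 = new TrieNode((1 << 0) | (1 << 1), level3, 34);
        TrieNode root   = new TrieNode((1 << 0) | (1 << 2), 32, level2);
        System.out.println(contains(root, 4098));  // prints "true" after visiting three nodes
        System.out.println(contains(root, 66));    // prints "false" (slot 2 is empty on level 2)
    }
}
```

Every loop iteration costs a bitmap test, a population count and a pointer dereference, whereas an array-based hash table would typically need one index computation.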
*We assume that the hash function for integers returns its argument, i.e. the identity.
Copyright © 0000 John Wiley & Sons, Ltd. Softw. Pract. Exper. (0000). Prepared using speauth.cls. DOI: 10.1002/spe
![Page 10: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/10.jpg)
abstract class TrieSet implements java.util.Set {
  TrieNode root; int size;
  class TrieNode { int bitmap; Object[] contentAndSubTries; }
}
Figure 3. Conceptual difference in tree layout between Clojure's and Scala's HAMT implementations. Panels: (a) Scala, (b) Clojure, (c) Clojure.
Figure 4. Footprints of HAMT sets and HAMT maps in 32-bit and 64-bit environments. Defaulting to 5-bit prefix chunks.
subnode pointers. [Michael: Don't talk explicitly about Clojure/Scala, rather about mixed/separated HAMT node designs.]
Division between internal nodes and leaf nodes. Scala's HAMT implementations divide the tree structure into internal nodes and leaf nodes. The internal nodes account for the hash-based prefix tree structure. The leaf nodes encapsulate the data tuple (i.e., a key in the case of a set, a key/value pair
![Page 11: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/11.jpg)
![Page 12: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/12.jpg)
![Page 13: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/13.jpg)
...
class NodeNode extends TrieNode { int bitmap; TrieNode nodeAtIndex0; TrieNode nodeAtIndex1; }
class ElementNode extends TrieNode { int bitmap; Object keyAtIndex0; TrieNode nodeAtIndex1; }
class NodeElement extends TrieNode { int bitmap; TrieNode nodeAtIndex0; Object keyAtIndex1; }
...
class TrieNode { int bitmap; Object[] contentAndSubTries; }
![Page 14: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/14.jpg)
![Page 15: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/15.jpg)
Exponential Number of Specializations
![Page 16: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/16.jpg)
Memory Overhead per Pointer (Set, 32-bit)
Arity Bytes
1-ary 48.0
2-ary 24.0
3-ary 18.7
4-ary 14.0
5-ary 12.8
6-ary 10.7
7-ary 10.3
8-ary 9.0
9-ary 8.9
10-ary 8.0
11-ary 8.0
12-ary 7.3
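The slide does not say how the per-pointer overhead was computed. As a hedged sketch, the following counts the bytes of one generic `TrieNode` plus its `Object[]` and divides by the arity. All layout constants are assumptions, not values taken from the slides: 12-byte object header, 16-byte array header, 4-byte references and 8-byte alignment. Under these assumptions the arithmetic yields 48.0, 24.0 and 18.7 bytes for arities 1 to 3 and 7.3 bytes for arity 12. The name `OverheadPerPointer` is made up for illustration.

```java
class OverheadPerPointer {
    // Assumed JVM object layout; these constants are NOT taken from the slides.
    static final int OBJECT_HEADER = 12;
    static final int ARRAY_HEADER = 16;
    static final int REFERENCE = 4;
    static final int INT = 4;
    static final int ALIGNMENT = 8;

    /** Rounds a size up to the next multiple of the allocation alignment. */
    static int align(int bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Bytes for one generic node (bitmap + array reference) and its array of `arity` slots. */
    static int nodeBytes(int arity) {
        int node = align(OBJECT_HEADER + INT + REFERENCE);
        int array = align(ARRAY_HEADER + arity * REFERENCE);
        return node + array;
    }

    static double bytesPerPointer(int arity) {
        return (double) nodeBytes(arity) / arity;
    }

    public static void main(String[] args) {
        for (int arity = 1; arity <= 12; arity++) {
            System.out.println(arity + "-ary: " + nodeBytes(arity) + " bytes in total, "
                    + bytesPerPointer(arity) + " per pointer");
        }
    }
}
```

The fixed cost of the node object and the array header is spread over very few slots in small nodes, which is why the low arities are so expensive per pointer.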
![Page 17: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/17.jpg)
Frequency by Node Arity
Arity Frequency
0-ary 0%
1-ary 1%
2-ary 63%
3-ary 14%
4-ary 3%
5-ary 1%
6-ary 1%
7-ary 1%
8-ary 1%
9-ary 1%
10-ary 1%
11-ary 1%
12-ary 1%
![Page 18: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/18.jpg)
Arities % of Nodes
≤4 82%
≤8 86%
≤12 90%
![Page 19: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/19.jpg)
Arities Specializations
≤4 31
≤8 511
≤12 8191
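The counts in this table are consistent with a simple reading, which is an interpretation rather than something the slide spells out: a node of arity n has n slots, each slot holds either a key or a sub-node, so there are 2^n layouts per arity. Summing over arities 0 to k gives 2^(k+1) - 1. The class below (hypothetical name `SpecializationCount`) just replays that arithmetic.

```java
class SpecializationCount {
    /** Node layouts for arities 0..maxArity when every slot is either a key or a sub-node. */
    static long withPermutations(int maxArity) {
        long total = 0;
        for (int arity = 0; arity <= maxArity; arity++) {
            total += 1L << arity;  // 2^arity key/sub-node assignments for this arity
        }
        return total;              // equals 2^(maxArity + 1) - 1
    }

    public static void main(String[] args) {
        for (int maxArity : new int[] {4, 8, 12}) {
            System.out.println("<=" + maxArity + ": " + withPermutations(maxArity));
        }
        // prints:
        // <=4: 31
        // <=8: 511
        // <=12: 8191
    }
}
```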
![Page 20: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/20.jpg)
Avoiding Permutations
![Page 21: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/21.jpg)
Arities Specializations
≤4 15 (31)
≤8 45 (511)
≤12 91 (8191)
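The reduced counts match the following reading, again an interpretation of the slide: if only the number of keys versus sub-nodes in a node matters and not their order, a node of arity n has n + 1 layouts (0 to n keys). Summing over arities 0 to k gives (k + 1)(k + 2) / 2. The class name `OrderedSpecializationCount` is hypothetical.

```java
class OrderedSpecializationCount {
    /** Node layouts for arities 0..maxArity when only the key/sub-node split matters. */
    static int withoutPermutations(int maxArity) {
        int total = 0;
        for (int arity = 0; arity <= maxArity; arity++) {
            total += arity + 1;  // 0..arity keys; the remaining slots are sub-nodes
        }
        return total;            // equals (maxArity + 1) * (maxArity + 2) / 2
    }

    public static void main(String[] args) {
        for (int maxArity : new int[] {4, 8, 12}) {
            System.out.println("<=" + maxArity + ": " + withoutPermutations(maxArity));
        }
        // prints:
        // <=4: 15
        // <=8: 45
        // <=12: 91
    }
}
```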
![Page 22: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/22.jpg)
abstract class TrieSet implements java.util.Set {
  TrieNode root; int size;
  interface TrieNode { ... }
  ...
  // TrieNode is an interface here, so the specializations implement it.
  class NodeNode implements TrieNode { int bitmap; TrieNode nodeAtIndex0; TrieNode nodeAtIndex1; }
  class ElementNode implements TrieNode { int bitmap; Object keyAtIndex0; TrieNode nodeAtIndex1; }
  class NodeElement implements TrieNode { int bitmap; TrieNode nodeAtIndex0; Object keyAtIndex1; }
  class ElementElement implements TrieNode { int bitmap; Object keyAtIndex0; Object keyAtIndex1; }
  ...
}
![Page 23: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/23.jpg)
abstract class TrieSet implements java.util.Set {
  TrieNode root; int size;
  interface TrieNode { ... }
  ...
  // The bitmap is replaced by one byte-sized position per slot.
  class NodeNode implements TrieNode { byte pos1; TrieNode nodeAtPos1; byte pos2; TrieNode nodeAtPos2; }
  class ElementNode implements TrieNode { byte pos1; Object keyAtPos1; byte pos2; TrieNode nodeAtPos2; }
  class NodeElement implements TrieNode { byte pos1; TrieNode nodeAtPos1; byte pos2; Object keyAtPos2; }
  class ElementElement implements TrieNode { byte pos1; Object keyAtPos1; byte pos2; Object keyAtPos2; }
  ...
}
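How a lookup might work on such position-based specializations can be sketched as follows. Only the field names are taken from the slide; the `contains` method, its signature, the constructors and the wrapper class `PositionNodeSketch` are assumptions made for this example, and only two of the four two-slot classes are shown.

```java
class PositionNodeSketch {
    interface TrieNode {
        boolean contains(Object key, int hash, int shift);
    }

    /** Two-slot node holding a key at pos1 and a sub-node at pos2. */
    static final class ElementNode implements TrieNode {
        final byte pos1; final Object keyAtPos1;
        final byte pos2; final TrieNode nodeAtPos2;

        ElementNode(int pos1, Object keyAtPos1, int pos2, TrieNode nodeAtPos2) {
            this.pos1 = (byte) pos1; this.keyAtPos1 = keyAtPos1;
            this.pos2 = (byte) pos2; this.nodeAtPos2 = nodeAtPos2;
        }

        public boolean contains(Object key, int hash, int shift) {
            int pos = (hash >>> shift) & 31;  // 5-bit chunk for this level
            if (pos == pos1) return keyAtPos1.equals(key);
            if (pos == pos2) return nodeAtPos2.contains(key, hash, shift + 5);
            return false;
        }
    }

    /** Two-slot node holding keys at both positions. */
    static final class ElementElement implements TrieNode {
        final byte pos1; final Object keyAtPos1;
        final byte pos2; final Object keyAtPos2;

        ElementElement(int pos1, Object keyAtPos1, int pos2, Object keyAtPos2) {
            this.pos1 = (byte) pos1; this.keyAtPos1 = keyAtPos1;
            this.pos2 = (byte) pos2; this.keyAtPos2 = keyAtPos2;
        }

        public boolean contains(Object key, int hash, int shift) {
            int pos = (hash >>> shift) & 31;
            if (pos == pos1) return keyAtPos1.equals(key);
            if (pos == pos2) return keyAtPos2.equals(key);
            return false;
        }
    }

    static boolean contains(TrieNode root, Object key) {
        return root.contains(key, key.hashCode(), 0);
    }

    public static void main(String[] args) {
        // Trie for {32, 2, 34} with identity hashes: 32 sits in the root at position 0,
        // while 2 and 34 share root position 2 and are separated one level down.
        TrieNode level2 = new ElementElement(0, 2, 1, 34);
        TrieNode root = new ElementNode(0, 32, 2, level2);
        System.out.println(contains(root, 34));    // prints "true"
        System.out.println(contains(root, 4098));  // prints "false"
    }
}
```

Compared with the bitmap variant, each slot is found by comparing the chunk against a stored position instead of testing a bit and counting set bits, and no separate array object is allocated.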
![Page 24: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/24.jpg)
Lookup Performance (lower is better)
Variant Map
Generic 100%
Specialized 0-4 130%
Specialized 0-8 138%
Specialized 0-12 138%
![Page 25: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/25.jpg)
Memory Usage (lower is better)
Variant Map Set
Generic 100% 100%
0-4 62% 52%
0-8 46% 23%
0-12 45% 22%
![Page 26: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/26.jpg)
![Page 27: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/27.jpg)
Memory Footprint Compared To Competition (lower is better)
[Bar chart with two series, Map and Set. Specialized 0-8: 1x in both series; Clojure: 2.2x and 1.6x; Scala: 3.75x and 4.9x.]
![Page 28: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/28.jpg)
worst hash distribution ->
good memory performance
![Page 29: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/29.jpg)
best hash distribution ->
worst memory performance
![Page 30: Code Specialization for Memory Efficient Hash Tries · Code Specialization for Memory Efficient Hash Tries Michael Steindorfer, Jurgen Vinju ... based on the hash code prefixes](https://reader030.vdocuments.mx/reader030/viewer/2022040213/5e9d10904f3a966b7856cbde/html5/thumbnails/30.jpg)
best hash distribution ->
worst memory performance
best