No Bit Left Behind: The Limits of Heap Data Compression
Jennifer B. Sartor*, Martin Hirzel†, Kathryn S. McKinley*
*U Texas at Austin, †IBM Watson
Department of Computer Sciences
ISMM 2008
Current State
Managed languages ubiquitous
Embedded devices
Multicore
Need memory efficiency!
[Figure: a single CPU with L1 and L2 caches, beside two CPUs, each with its own L1, sharing one L2]
Memory Efficiency of Managed Languages
COST
8-94% information content in heap in 37 benchmarks [Mitchell & Sevitsky, OOPSLA 07]
Boxed objects
Trailing zeros in arrays
Redundant objects
Extra bit-width
Data structure back-bones
bzip2: 86%
OPPORTUNITY
Memory layout abstraction: (location + size) != identity
Related Work
Ananian & Rinard, LCTES 03: dominant-value field hashing
Appel & Goncalves, Tech Report 93: equal object sharing, constant field elision, bit-width reduction
Chen, Kandemir & Irwin, VEE 05: dominant-value field elision
Chen, et al., OOPSLA 03: zero compression, trailing zero trimming
Cooprider & Regehr, PLDI 07: value set indirection
Marinov & O’Callahan, OOPSLA 03: equal object sharing
Stephenson, Babb & Amarasinghe, PLDI 00: constant field elision, bit-width reduction
Titzer, et al., PLDI 07: value set indirection
Zilles, ISMM 07: bit-width reduction
Limit Study
Quantitatively compare heap data compression
Surveyed literature
Savings equations
Methodology for evaluation
Apples-to-apples comparison
Future work: implementation
Hybrid techniques
Findings: array & hybrid compression
58%
Hybrid Array Compression
x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001
Redundancy: equal array sharing
x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001
Equal Object Sharing
Marinov & O’Callahan, OOPSLA 03; Appel & Goncalves, Tech Report 93
Two objects are equal if they have the same class and all fields have the same value
Strictly-equal: pointer fields identical
Deep: objects' pointer targets are equal
JVM stores only 1 copy in a hashtable
Class C, N objects, D distinct; save:
$$(N - D) \times \mathit{sizeof}(C) - \mathit{hashTableSize}(D, \mathit{pntrSize})$$
14%
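The savings expression for equal object sharing can be sketched in a few lines of Python. This is only an illustration: objects are modeled as tuples of field values, equality is the strict kind, and the hash table cost (one pointer-sized slot per distinct object) is an assumed model, not the one used in the study. The names `equal_sharing_savings`, `sizeof_c` and `pntr_size` are hypothetical.

```python
def equal_sharing_savings(objects, sizeof_c, pntr_size=4):
    """Bytes saved by keeping one copy of each distinct object of one class.

    objects   -- list of field-value tuples, one per instance of class C
    sizeof_c  -- size in bytes of one instance of C
    pntr_size -- assumed pointer size in bytes
    """
    n = len(objects)          # N: all instances
    d = len(set(objects))     # D: distinct instances (strict equality)
    # Assumed cost model: one pointer-sized hash table slot per distinct object.
    hash_table_size = d * pntr_size
    return (n - d) * sizeof_c - hash_table_size


# Four instances, two of them distinct.
points = [(1, 2), (1, 2), (3, 4), (1, 2)]
saved = equal_sharing_savings(points, sizeof_c=16, pntr_size=4)
print(saved)
```

With these made-up sizes, two redundant 16-byte copies are dropped and two table slots are paid for.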
Hybrid Array Compression
x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001
Redundancy: equal array sharing, value set indirection
x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001
Dictionary: x0001 x0058 x0004 x0000
0 0 1 0 2 0 3 0
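The dictionary and index row on this slide can be reproduced with a small Python sketch. It assumes the dictionary is filled in order of first appearance, which is what the slide's example suggests; the function name `indirect` is hypothetical.

```python
def indirect(values):
    """Replace each element by its index into a dictionary of distinct values."""
    dictionary = []   # distinct values, in order of first appearance
    index_of = {}     # value -> position in dictionary
    indices = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        indices.append(index_of[v])
    return dictionary, indices


array = [0x0001, 0x0001, 0x0058, 0x0001, 0x0004, 0x0001, 0x0000, 0x0001]
dictionary, indices = indirect(array)

# Recovery is a plain lookup per element.
restored = [dictionary[i] for i in indices]
print([hex(v) for v in dictionary], indices)
```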
Value Set Indirection & Caching
Cooprider & Regehr; Titzer, et al. PLDI 07
For object fields or array elements with a large range of values
Dictionary (or cache) of the 256 most frequent values; each instance stores small 1-byte indices
If > 256 values: 255 in the dictionary, the 256th index says to store the rest (M) in a hashtable with objectID
14%
$$\sum_{a \in T[]} a.\mathit{length} \times (\mathit{sizeof}(T) - 1) - \mathit{arrayHdrSize} - 256 \times \mathit{sizeof}(T) - \mathit{hashTableSize}'(M, \mathit{sizeof}(T))$$
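A rough Python sketch of the caching variant, under several assumptions: the cache is built per call rather than per type, the overflow table is keyed by a hypothetical `(array_id, position)` pair standing in for the slide's objectID, and the names `cache_encode`, `cache_decode` and `ESCAPE` are illustrative only.

```python
from collections import Counter

ESCAPE = 255  # the 256th index: the value lives in the overflow hash table


def cache_encode(array_id, values, overflow, max_cached=255):
    """Encode values as 1-byte indices into a cache of the most frequent values."""
    cached = [v for v, _ in Counter(values).most_common(max_cached)]
    slot = {v: i for i, v in enumerate(cached)}
    indices = bytearray()
    for pos, v in enumerate(values):
        if v in slot:
            indices.append(slot[v])
        else:
            indices.append(ESCAPE)
            overflow[(array_id, pos)] = v   # the "rest (M)" goes to the hash table
    return cached, indices


def cache_decode(array_id, cached, indices, overflow):
    """Rebuild the original values from the cache, indices and overflow table."""
    return [overflow[(array_id, pos)] if i == ESCAPE else cached[i]
            for pos, i in enumerate(indices)]


overflow = {}
values = list(range(300)) + [7] * 50          # 300 distinct values, 7 is frequent
cached, indices = cache_encode("a0", values, overflow)
decoded = cache_decode("a0", cached, indices, overflow)
```

Every element costs one byte in `indices`; only the values that miss the 255-entry cache pay for a hash table entry.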
Hybrid Array Compression 2
x00A0 x0073 x0002 x0001 x0101 x0000 x0000 x0000
Remove zeros
Trim trailing zeros: x00A0 x0073 x0002 x0001 x0101, 8 5
Bit width reduce: x0A0 x073 x002 x001 x101, 8 5
Zero compress: x0A x73 x2 x001 x101, 8 5, bitmap 10101111 (xAF)
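The first two steps of this hybrid can be sketched in Python on the slide's example array. This is an assumption-laden illustration: it computes the minimal bit width, whereas the slide displays three hex digits per element, and the helper names `trim_trailing_zeros` and `min_bit_width` are hypothetical.

```python
def trim_trailing_zeros(values):
    """Return (prefix without trailing zeros, original length)."""
    end = len(values)
    while end > 0 and values[end - 1] == 0:
        end -= 1
    return values[:end], len(values)


def min_bit_width(values):
    """Fewest bits that hold every (non-negative) element."""
    return max((v.bit_length() for v in values), default=0)


array = [0x00A0, 0x0073, 0x0002, 0x0001, 0x0101, 0x0000, 0x0000, 0x0000]
trimmed, original_length = trim_trailing_zeros(array)
width = min_bit_width(trimmed)

# Recovery pads the kept prefix back to the original length with zeros.
restored = trimmed + [0] * (original_length - len(trimmed))
print(original_length, len(trimmed), width)
```

The two lengths it keeps (8 original, 5 kept) appear to correspond to the "8 5" shown next to each row on the slide.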
Zero-based Object Compression
Chen, et al. OOPSLA 03
Remove bytes that are entirely zero
Per object bit-map: 1 bit per byte
Store only non-zero bytes
45%
Savings:
$$\sum_{o \in \mathit{objects}} \left( \mathit{zeroBytes}(o) - \left\lceil \frac{\mathit{totalBytes}(o)}{8} \right\rceil \right)$$
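A minimal Python sketch of the per-object bitmap scheme and its savings term follows. The bit ordering inside the bitmap and the sample object bytes are assumptions made for illustration; the names `zero_compress`, `zero_decompress` and `object_savings` are hypothetical.

```python
import math


def zero_compress(data: bytes):
    """Return (bitmap, non-zero bytes); bit i of the bitmap is 1 if byte i is non-zero."""
    bitmap = bytearray(math.ceil(len(data) / 8))
    kept = bytearray()
    for i, b in enumerate(data):
        if b != 0:
            bitmap[i // 8] |= 1 << (i % 8)
            kept.append(b)
    return bytes(bitmap), bytes(kept)


def zero_decompress(bitmap: bytes, kept: bytes, total: int) -> bytes:
    """Rebuild the object by placing kept bytes where the bitmap has a 1."""
    out = bytearray(total)
    it = iter(kept)
    for i in range(total):
        if (bitmap[i // 8] >> (i % 8)) & 1:
            out[i] = next(it)
    return bytes(out)


def object_savings(data: bytes) -> int:
    """zeroBytes(o) - ceil(totalBytes(o) / 8) for one object."""
    return data.count(0) - math.ceil(len(data) / 8)


obj = bytes([0x00, 0xA0, 0x00, 0x73, 0x00, 0x02, 0x00, 0x01, 0x01, 0x01] + [0] * 6)
bitmap, kept = zero_compress(obj)
print(object_savings(obj), len(obj) - (len(bitmap) + len(kept)))
```

The per-object savings term equals the original size minus the bitmap and the kept bytes, which is why the two printed numbers agree.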
Methodology
[Figure: program run, with a snapshot at each garbage collection over time t, produces a heap dump series; the analysis representation feeds Model 1 … Model n, which yield the limit savings]
Experimental Details
Jikes Research Virtual Machine
Java-in-Java
DaCapo benchmarks + pseudojbb
20-25 heap snapshots per benchmark
MarkSweep with 2x min heap
Analysis
Per class
Objects and arrays separated
JVM+app vs application (separated in paper)
Per heap snapshot, and over all snapshots
| Technique | Class | Array | GC/Run |
|---|---|---|---|
| Lempel-Ziv compression | X | | GC |
| Strictly-equal object sharing | Obj | Type | GC |
| Deep-equal object sharing | Obj | Type | GC |
| Zero-based object compression | Obj | Inst | GC |
| Trailing zero array trimming | | Inst | GC |
| Bit-width reduction | Fld | Inst | GC/Run |
| Dominant-value field hashing | Fld | | GC |
| Lazy invariant computation | Fld | | GC |
| Value set indirection | Fld | Type | GC |
| Value set caching | Fld | Type | GC |
| Constant field elision | Fld | | Run |
| Dominant-value field elision | Fld | | Run |
[Figure panels: Value Indirection & Cache; Deep Equal Sharing; Zero Compression; Hybrid Compression]
Stability of Savings
[Figure: savings (0% to 25%) for fop, snapshots over time. Series: combArr, maxArr, zeroArr, combCls, maxCls, bitwArr, zeroCls, hashFld, indirFld, cacheArr, strEqArr, bitwFld, charArr, trailArr, strEqCls, lazyFld, cacheFld, indirArr, boolArr]
Conclusions
Limit study compares heap data compression techniques apples-to-apples
Potential to reduce memory inefficiencies in managed languages
Arrays
Hybrids
Future: save space
Challenge: efficient detection & recovery
Thank you!