How big is your data?

Posted on 20-Aug-2015


WHO AM I

Master Developer at Plumbr

We solve memory leaks ... for now

Giving you the exact location of the leak with enough information to fix it

The foundation is based on machine learning

Ongoing effort

Monday, April 1, 13

MORE PLUMBR

Trained on 500,000 memory snapshots

From 3,000 different applications

Finding 88% of the existing leaks.

20,000 monthly unique visitors on our site

400 monthly downloads

1700+ leaks discovered

AGENDA

What’s the deal?

How to measure

Primitives

References

Collections

INTRO

How much space is needed to store 10M integers in a set in Java?

Hint: 4 bytes * 10 000 000 = 40 MB

SHALLOW VS DEEP

You can measure shallow size of the object

Or deep size of the subgraph starting with the object

Or retained size of the subgraph dominated by the object

RETAINED SIZE

r(O1)=O1+O2+O3+O4

r(O2)=O2

r(O3)=O3+O4

r(O4)=O4
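The equations above can be reproduced with a small sketch. The object graph is an assumption (O1 -> O2, O1 -> O3, O3 -> O4, which is one graph consistent with those equations), and the shallow sizes are made-up numbers. Retained size is computed here as "whatever stops being reachable from the root once the object is removed", which is a simple stand-in for a real dominator-tree computation.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RetainedSizes {
    // Hypothetical graph and shallow sizes, chosen only for illustration.
    static final Map<String, List<String>> EDGES = new HashMap<>();
    static final Map<String, Long> SHALLOW = new HashMap<>();
    static {
        EDGES.put("O1", Arrays.asList("O2", "O3"));
        EDGES.put("O2", Collections.<String>emptyList());
        EDGES.put("O3", Collections.singletonList("O4"));
        EDGES.put("O4", Collections.<String>emptyList());
        SHALLOW.put("O1", 16L);
        SHALLOW.put("O2", 24L);
        SHALLOW.put("O3", 32L);
        SHALLOW.put("O4", 40L);
    }

    // Objects reachable from root when 'excluded' (may be null) is treated as deleted.
    static Set<String> reachable(String root, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        if (!root.equals(excluded)) stack.push(root);
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (!seen.add(current)) continue;
            for (String next : EDGES.get(current)) {
                if (!next.equals(excluded)) stack.push(next);
            }
        }
        return seen;
    }

    // Retained size: shallow sizes of everything that becomes unreachable without 'target'.
    static long retained(String root, String target) {
        Set<String> freed = reachable(root, null);
        freed.removeAll(reachable(root, target));
        long sum = 0;
        for (String o : freed) sum += SHALLOW.get(o);
        return sum;
    }

    public static void main(String[] args) {
        for (String o : Arrays.asList("O1", "O2", "O3", "O4")) {
            System.out.println("r(" + o + ") = " + retained("O1", o));
        }
    }
}
```

With these sizes, r(O3) comes out as 32 + 40 = 72, i.e. O3 + O4. If another object also pointed at O4, it would drop out of r(O3), which is the difference between deep and retained size.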

SIZE OF AN OBJECT

An overhead of being an Object

Call it object header

constant on a given JVM

Data

primitives

arrays

pointers to other objects

plus all this from superclasses

HOW TO MEASURE

Manually, based on JLS/JVM spec

Memory measurer

SizeofAgent

http://sourceforge.net/projects/sizeof/

Javaspecialists.eu issue 142

https://github.com/shipilev/java-object-layout
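The agent-based tools in this list build on the standard java.lang.instrument API. A minimal sketch of the idea follows; the class name SizeAgent and the sizeOf helper are hypothetical and not taken from any of the projects above. It only works when the JVM is started with -javaagent pointing at a jar whose manifest names the agent class in Premain-Class.

```java
import java.lang.instrument.Instrumentation;

// When packaged as a real agent, put this in SizeAgent.java and declare it public.
final class SizeAgent {
    private static volatile Instrumentation instrumentation;

    // Called by the JVM before main() when started with -javaagent:<agent jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    // Shallow size of a single object as estimated by the running JVM.
    // Referenced objects are not followed.
    static long sizeOf(Object o) {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            throw new IllegalStateException("agent not loaded; start the JVM with -javaagent");
        }
        return inst.getObjectSize(o);
    }
}
```

Instrumentation.getObjectSize reports an implementation-specific approximation of the shallow size, so deep or retained sizes need an extra walk over the object graph on top of it.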

HANDS ON

Let's measure java.lang.Object

ALIGNMENT

8 byte alignment
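What 8-byte alignment means for object sizes can be sketched as a round-up to the next multiple of 8. The helper name alignUp is illustrative only.

```java
final class Alignment {
    // Rounds size up to the next multiple of alignment.
    // The bit trick assumes alignment is a power of two (8 here).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    public static void main(String[] args) {
        System.out.println(alignUp(12, 8)); // 16
        System.out.println(alignUp(16, 8)); // 16
        System.out.println(alignUp(20, 8)); // 24
        System.out.println(alignUp(28, 8)); // 32
    }
}
```

So an object whose header and fields add up to 20 bytes still occupies 24 bytes on the heap.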

JAVA.LANG.OBJECT

8 bytes @ 32 bit JVM

12 bytes @ 64 bit JVM (padded to 16 by the 8 byte alignment)

PRIMITIVES

Type      Java Language Spec (JLS)
byte      1 byte
short     2 bytes
int       4 bytes
long      8 bytes
char      2 bytes
float     4 bytes
double    8 bytes
boolean   1 bit

HANDS ON

Let's measure primitives

PRIMITIVES

Type      JLS       JVM cost
byte      1 byte    1..8 bytes
short     2 bytes   2..8 bytes
int       4 bytes   4..8 bytes
long      8 bytes   8 bytes
char      2 bytes   2..8 bytes
float     4 bytes   4..8 bytes
double    8 bytes   8 bytes
boolean   1 bit     1..8 bytes

WRAPPER OBJECTS

Type      JLS       JVM cost     Wrapper*
byte      1 byte    1..8 bytes   16 bytes
short     2 bytes   2..8 bytes   16 bytes
int       4 bytes   4..8 bytes   16 bytes
long      8 bytes   8 bytes      24 bytes
char      2 bytes   2..8 bytes   16 bytes
float     4 bytes   4..8 bytes   16 bytes
double    8 bytes   8 bytes      24 bytes
boolean   1 bit     1..8 bytes   16 bytes

* On a 64 bit JVM: the 12 byte object header plus the value, rounded up to a multiple of 8 bytes
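The wrapper column can be derived from the numbers already on the slides: a 12 byte header plus the value, rounded up to 8 byte alignment. The sketch below does that arithmetic. It assumes the value field takes its JLS width, and that a boolean field takes a full byte, which is an assumption about the JVM rather than something the slides state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WrapperSizes {
    static final int HEADER_BYTES = 12; // 64 bit JVM header, as on the java.lang.Object slide
    static final int ALIGNMENT = 8;     // 8 byte alignment

    // Header plus payload, rounded up to the next multiple of ALIGNMENT.
    static int wrapperSize(int payloadBytes) {
        int raw = HEADER_BYTES + payloadBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        Map<String, Integer> payload = new LinkedHashMap<>();
        payload.put("Byte", 1);
        payload.put("Short", 2);
        payload.put("Integer", 4);
        payload.put("Long", 8);
        payload.put("Character", 2);
        payload.put("Float", 4);
        payload.put("Double", 8);
        payload.put("Boolean", 1); // assumed: one byte per boolean field
        for (Map.Entry<String, Integer> e : payload.entrySet()) {
            System.out.println(e.getKey() + ": " + wrapperSize(e.getValue()) + " bytes");
        }
    }
}
```

Only Long and Double cross the 16 byte boundary (12 + 8 = 20, aligned to 24); every other wrapper fits in 16 bytes.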

INTERMISSION

10 000 000 ints as Integer objects =

10 000 000 * 16 = 160 000 000 bytes

160 MB of memory to hold 40 MB worth of data!

OBJECT REFERENCES: QUIZ

What is the size of an instance of this class?

class With2Members { Object x = null; Object y = null; }

16 bytes?

24 bytes?

32 bytes?

QUIZ: ANSWER

32 bit

8 + 4 + 4 = 16

64 bit +CompressedOOPs (Xmx < 32g)

12 + 4 + 4 = 20 (align) → 24

64 bit –CompressedOOPs (Xmx > 32g)

12 + 8 + 8 = 28 (align) → 32

FLYWEIGHTS

Flyweight pattern

<PrimitiveWrapper>.valueOf()

Byte

Short

Integer

Long

Character

String.intern()

Only values that fit in 1 byte are cached (-128..127 for Integer)
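A small sketch of the flyweight behaviour, using only standard library calls. The helper names sameInstance and internedSame are illustrative. For values outside -128..127 the result of the identity check depends on the JVM and its settings, so the comment on that line is deliberately hedged.

```java
final class FlyweightDemo {
    // True when two valueOf calls hand back the same cached Integer object.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    // True when a fresh copy of the string interns to the same object as the original.
    static boolean internedSame(String s) {
        return new String(s).intern() == s.intern();
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));    // true: inside the guaranteed cache range
        System.out.println(sameInstance(-128));   // true: inside the guaranteed cache range
        System.out.println(sameInstance(1000));   // usually false with default settings, not guaranteed
        System.out.println(internedSame("leak")); // true: intern() returns the canonical instance
    }
}
```

This is why filling an Integer[] through valueOf only saves memory when many of the values fall inside the cached range.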

BACK TO COLLECTIONS

Let's run the intro again.

Collection (10 M ints)       Overhead
Pure data                    0
int[]                        ~0
Integer[]                    5x (200M)
Integer[] (valueOf)          <5x (200M)
ArrayList<Integer>(10M)      <5x (200M)
ArrayList<Integer>()         5.15x (205M)
HashSet<Integer>()           13.7x (547M)
HashSet<Integer>(10M)        13.7x (547M)
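One plausible breakdown of the 200M and 547M figures is sketched below. It is an estimate under stated assumptions, not the measurement method used for the table: 4 byte compressed references, 16 byte Integer objects, a 32 byte hash entry (12 byte header plus hash, key, value and next, aligned), and a backing table sized to the smallest power of two that holds 10 M entries at a 0.75 load factor. Array headers and the collection objects themselves are ignored.

```java
final class CollectionFootprint {
    static final long N = 10_000_000L;
    static final long REFERENCE = 4;  // assumed: compressed reference
    static final long INTEGER = 16;   // Integer object size from the wrapper table
    static final long ENTRY = 32;     // assumed: 12 header + 4 hash + 3 references = 28, aligned to 32

    // Integer[]: one reference slot plus one Integer object per element.
    static long integerArray() {
        return N * (REFERENCE + INTEGER);
    }

    // Smallest power-of-two table whose 0.75 load-factor threshold can hold n entries.
    static long tableCapacity(long n) {
        long capacity = 16;
        while (capacity * 3 / 4 < n) capacity <<= 1;
        return capacity;
    }

    // HashSet<Integer>: one entry and one Integer per element, plus the bucket table.
    static long hashSet() {
        return N * (ENTRY + INTEGER) + tableCapacity(N) * REFERENCE;
    }

    public static void main(String[] args) {
        System.out.println(integerArray());   // 200000000
        System.out.println(tableCapacity(N)); // 16777216
        System.out.println(hashSet());        // 547108864
    }
}
```

Under these assumptions the table ends up with the same 16,777,216 slots whether or not the set is presized to 10 M, which would explain why both HashSet rows show the same size.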

TROVE

Collection (10 M ints)       Size
Pure data                    40 000 000 (40M)
TIntArrayList                ~1.05x (42M)
TIntArrayList(10M)           ~0 (40M)
TIntHashSet                  ~3.3x (131M)
TIntHashSet(10M)             ~2.6x (105M)
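Why a primitive list stays so close to the pure data size can be seen in a minimal int[] backed list. This is a hypothetical sketch, not Trove's implementation; the class name IntList and its growth policy are made up for illustration.

```java
import java.util.Arrays;

final class IntList {
    private int[] data;
    private int size;

    IntList(int initialCapacity) {
        data = new int[Math.max(1, initialCapacity)];
    }

    // Appends a value, growing the backing array by roughly 1.5x when it is full.
    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length + (data.length >> 1) + 1);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    // Bytes held by the backing array's elements: 4 per slot, no per-element objects.
    long backingBytes() {
        return 4L * data.length;
    }
}
```

A list presized to 10 M elements holds exactly 40,000,000 bytes of element data, and a list that grows on demand only adds the unused tail of its backing array. There are no Integer objects or references at all.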

MORE

Collection (10 M ints)             Size
Pure data                          40 000 000 (40M)
fastutil IntOpenHashSet            ~2.1x (83M)
org.a.c.c.p. ArrayIntList          ~1.4x (55M)
hppc.IntIntOpenHashMap             ~3.8x (150M)
cern.colt.map.OpenIntIntHashMap    ~6.5x (260M)
