© Richard Jones ISMM 2008, Tucson, AZ 001 A Study of Java Object Demographics Richard Jones Chris Ryder Computing Laboratory University of Kent


A Study of Java Object Demographics

Richard Jones
Chris Ryder

Computing Laboratory
University of Kent


Overview

Motivation & Contribution

Object demographics examples

Data capture

Clustering

Program inputs

Calling context

Related work


Tomorrow Never Dies?

Exploiting program behaviour

Object segregation
• Generations
• Older-First
• Immortal region
• Large object areas

Idea:
• Segregate objects by age, size, type, mortality, etc.
• Collect regions under different policies and mechanisms.

Choice of GC
• Select the best GC for the application a priori.
• Hot-swap running GC.

Idea:
• Different applications have different demographics.
• Respond to phase changes.


Dr. No ‘one size fits all’

Most systems manage objects uniformly
• E.g. allocate all objects in a nursery and collect all nursery objects at the same time, promoting to the same older generation.

Pre-tenuring GC uses a very simple classification
• E.g. short-lived, long-lived, immortal.

Contributions

A detailed study of Java object demographics reveals
• A richer landscape than short/long/immortal.
• Distinct behaviour of application, library and JVM objects.
• Clusters of allocation sites, stable across program inputs.
• A small number of clusters dominate.
• Context is an important predictor for library allocation.


The Living Daylights

_213_javac, speed 100

Compiles JavaLex scanner 4 times.

Allocates [char] for a GNU classpath internal String constructor.

6% of total allocation.

85% of these objects very short-lived.

A few are immortal.

Some survive to the end of the phase.

A few are long-lived.

[Figure: lifetime plot for this site. Labels: Age, ToD, Lifespan; no-go area where ToD < Age.]


Die Another Day

DaCapo hsqldb, default input.

4 sites

17% volume

95% space rental
• [volume x lifetime]

Scarcely any objects are very short-lived.
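The gap between a site's share of volume and its share of space rental can be illustrated with a small calculation. This is a minimal sketch, not the authors' tooling: the `ObjectRecord` type, the site names and the numbers are all hypothetical, and lifetime is assumed to be measured in bytes allocated between an object's birth and death.

```python
from dataclasses import dataclass

@dataclass
class ObjectRecord:
    site: str      # allocation site identifier (hypothetical naming)
    size: int      # object size in bytes
    lifetime: int  # assumed unit: bytes allocated between birth and death

def site_shares(records, sites):
    """Return (volume share, space-rental share) for the chosen sites."""
    total_volume = sum(r.size for r in records)
    # Space rental of an object = volume x lifetime, as on the slide.
    total_rental = sum(r.size * r.lifetime for r in records)
    chosen = [r for r in records if r.site in sites]
    volume = sum(r.size for r in chosen)
    rental = sum(r.size * r.lifetime for r in chosen)
    return volume / total_volume, rental / total_rental

# Invented data: site B allocates little but its objects live very long.
records = [
    ObjectRecord("A", 100, 10),
    ObjectRecord("A", 100, 20),
    ObjectRecord("B", 50, 10_000),
]
volume_share, rental_share = site_shares(records, {"B"})
```

In this made-up data, site B accounts for a fifth of the allocated volume but almost all of the space rental, the same shape of result as the slide reports for the four hsqldb sites.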


Live And Let Die

DaCapo fop, default input

18 sites

19% volume
• 8.29% short-lived
• 9.27% immortal

16% space rental


For Your Eyes Only

MemTrace compiles method to…
• Record allocation sites.
• Modify allocation routines.
  – Tag object header with site & position in calling context tree.
  – Emit allocation record.
• Benefit: same framework as for method specialisation [ISMM07].

MemTrace profiles using…
• Baseline compiler: focus on application objects.
• Forced full collections (64K granularity).
  – GCspy framework to log death events.
  – Exaggerates lifetimes of short-lived objects.


Casino Royale

Aim
• Characterise lifetimes of objects allocated by a site.
• Identify sites with similar lifetimes.

We call the cumulative frequency curve the lifetime distribution function (ldf) of the site.
• Expect collaborating sites to have similar ldf’s.
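As an illustration of the ldf, the sketch below builds the cumulative frequency curve of a site from a list of observed object lifetimes. The function name `ldf` and the sample lifetimes are hypothetical; the units of lifetime are left unspecified.

```python
from bisect import bisect_right

def ldf(lifetimes):
    """Return F, where F(t) is the fraction of the site's objects
    whose lifetime is at most t (the cumulative frequency curve)."""
    ordered = sorted(lifetimes)
    n = len(ordered)
    if n == 0:
        raise ValueError("site has no recorded lifetimes")

    def F(t):
        # bisect_right counts how many lifetimes are <= t.
        return bisect_right(ordered, t) / n

    return F

# Invented lifetimes for one allocation site: mostly short, one long.
site_ldf = ldf([1, 1, 2, 3, 50])
fraction_dead_by_2 = site_ldf(2)
```

Two sites whose objects die at similar ages produce curves that lie close together, which is what makes the ldf a convenient basis for comparing sites.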


From Russia With Love

Compare ldf’s statistically for some confidence n%
• Kolmogorov-Smirnov Two Sample test
• D = the maximum difference between 2 frequency distributions Ei(t)
• p(D is significant) < n?
• Benefit: non-parametric, distribution-free, cheap.
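The two-sample test can be sketched in a few lines. This is an illustrative version only, not the code used in the study: it computes D directly from two lists of lifetimes and compares it with the standard large-sample critical value, rather than computing an exact p-value. The names `ks_statistic` and `indistinguishable` are hypothetical, and `alpha` is assumed to correspond to 1 − n/100 for the slide's confidence n%.

```python
import math

def ks_statistic(xs, ys):
    """D: the largest vertical gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    while i < n and j < m:
        t = min(xs[i], ys[j])
        # Step both CDFs past every observation equal to t.
        while i < n and xs[i] == t:
            i += 1
        while j < m and ys[j] == t:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def indistinguishable(xs, ys, alpha=0.05):
    """True if the test fails to reject 'same distribution' at level alpha.
    Uses the large-sample approximation for the critical value of D."""
    n, m = len(xs), len(ys)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(xs, ys) <= critical

# Invented samples: overlapping lifetimes versus disjoint lifetimes.
d_overlap = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
same = indistinguishable(list(range(100)), list(range(100)))
different = indistinguishable(list(range(100)), list(range(100, 200)))
```

The approximation is poor for very small samples, where the critical value can exceed 1 and nothing is ever rejected; a statistics library with an exact p-value would be the safer choice there.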


Thunderball

Greedy, gravitational clustering
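The slides name the clustering technique without spelling it out, so the following is only one plausible reading and should not be taken as the authors' algorithm. It assumes that "gravitational" means heavier sites (here, simply those with more objects) are considered first and attract lighter ones, and that "greedy" means each site joins the first cluster whose seed site is within a distance threshold. The distance is the Kolmogorov-Smirnov D statistic; every name and number below is hypothetical.

```python
def ks_distance(xs, ys):
    """Kolmogorov-Smirnov D between two samples of lifetimes."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        t = min(xs[i], ys[j])
        while i < n and xs[i] == t:
            i += 1
        while j < m and ys[j] == t:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def greedy_cluster(sites, threshold):
    """sites maps a site name to its list of lifetimes.
    Returns a list of clusters, each a list of site names."""
    # Assumption: a site's 'mass' is its object count; heaviest first.
    order = sorted(sites, key=lambda s: len(sites[s]), reverse=True)
    clusters = []  # each entry: {"seed": name, "members": [names]}
    for name in order:
        for cluster in clusters:
            # Greedy step: join the first cluster that is close enough.
            if ks_distance(sites[name], sites[cluster["seed"]]) <= threshold:
                cluster["members"].append(name)
                break
        else:
            # No existing cluster attracts this site, so it seeds a new one.
            clusters.append({"seed": name, "members": [name]})
    return [c["members"] for c in clusters]

# Invented data: sites a and b behave alike, site c is long-lived.
example_sites = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [1, 2, 3, 4, 5],
    "c": [100, 200, 300, 400],
}
example_clusters = greedy_cluster(example_sites, threshold=0.3)
```

A single pass like this is cheap but order-dependent, which is why the choice of which sites go first matters.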


DaCapo: all allocation

Immortal cluster: always cluster 0


DaCapo: application packages

Immortal cluster: always cluster 0


You Only Live Twice

Does an allocation site generate the same lifetime behaviour regardless of input?

Do allocation sites share the same cluster from one input to another?

• i.e. continue to behave in the same way as each other?
• Compare cluster membership with Adjusted Rand Index

     antlr               jython              pmd                 ps
     Co   Ap   Li   VM   Co   Ap   Li   VM   Co   Ap   Li   VM   Co   Ap   Li   VM
SD   0.9  0.5  1.0  1.0  0.7  1.0  1.0  1.0  0.9  1.0  1.0  1.0  1.0  1.0  1.0  0.9
SL   0.8  0.4  0.9  1.0  0.7  0.7  1.0  1.0  0.7  1.0  1.0  1.0  0.8  0.8  1.0  0.9
DL   1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.8  1.0  1.0  1.0  0.6  0.7  1.0  0.9
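For readers unfamiliar with the Adjusted Rand Index, the sketch below computes it from two cluster assignments of the same allocation sites, one per program input. It follows the standard pair-counting definition; the function name and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """labels_a[i] and labels_b[i] are the cluster ids given to site i
    under two different inputs. Returns 1.0 for identical groupings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same sites")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs of sites to disagree about
    # Pairs of sites placed together in both clusterings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # both clusterings are trivial, e.g. one cluster each
    return (together_both - expected) / (maximum - expected)

# Invented example: same grouping under both inputs, cluster ids renamed.
stable = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Invented example: grouping completely reshuffled between inputs.
unstable = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on which sites share a cluster, it is unaffected by cluster numbering, and a value near 1.0 means sites kept the same companions from one input to the other.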


The World Is Not Enough?

Earlier studies
• Java: site is sufficient [Blackburn et al, OOPSLA01]
• C: more context required [Zorn & Seidl, ASPLOS98]

Calling context
• <site+method0, method1, method2, …>
• Increasing depth of context splits an ldf into 1 or more.

Compare the variance of site ldf’s
• Variance of program = weighted sum of the variances of its ldf’s
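The effect of context depth on variance can be sketched as follows. Several details here are assumptions rather than the study's definitions: the variance of an ldf is taken to be the population variance of the lifetimes behind it, each ldf is weighted by its share of the program's objects, and a context is a tuple whose first element is site+method0 followed by successive callers. All names and lifetimes are invented.

```python
from collections import defaultdict
from statistics import pvariance

def program_variance(objects, depth):
    """objects: iterable of (context, lifetime) pairs, where context is a
    tuple (site+method0, method1, method2, ...). depth 0 uses the site
    alone; each extra level adds one caller and may split an ldf."""
    groups = defaultdict(list)
    for context, lifetime in objects:
        groups[context[:depth + 1]].append(lifetime)
    total = sum(len(lifetimes) for lifetimes in groups.values())
    # Assumed weighting: each ldf counts in proportion to its object count.
    return sum(len(lifetimes) / total * pvariance(lifetimes)
               for lifetimes in groups.values())

# Invented library site reached from two callers with different behaviour.
objects = [
    (("lib.alloc", "app.parse"), 1),
    (("lib.alloc", "app.parse"), 1),
    (("lib.alloc", "app.cache"), 100),
    (("lib.alloc", "app.cache"), 100),
]
by_site = program_variance(objects, depth=0)
by_caller = program_variance(objects, depth=1)
```

In this made-up case the site alone mixes two very different populations, while one level of calling context separates them completely and the variance drops to zero.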


Context (2)

[Figure: variance plots for All, Jikes RVM, Library and Application allocation.]

Variance as a multiple of depth =


A View To A Kill

Related work: choice of GC
• Fitzgerald and Tarditi [ISMM00].
• Hot-swapping: Printezis [JVM01]; Soman, Krintz, Bacon [ISMM04]; Singer, Brown & Watson [ISMM07].
• Thomas [Inf Proc Letters '95] tailors GC to the program.

Demographics
• Dieckmann and Hölzle [ECOOP98] focus on reference densities, proportion of arrays, etc.
• DaCapo [OOPSLA06] characterise benchmarks by heap-related metrics.
• Pretenuring: Cheng, Harper, Lee [PLDI98]; Harris [ISMM00]; Blackburn et al [TOPLAS07]; Marion, Jones, Ryder [ISMM07].
• Merlin [SIGMETRICS02].


Conclusions

No one size of collector fits all.

Programs exhibit only a few distinct object lifetime distributions.

These are richer than short/long/immortal.

A very small number of clusters dominate.

Clusterings are stable across inputs.

Calling context is important for libraries.

http://www.cs.kent.ac.uk/projects/gc/demographics


Questions?