Post on 18-Dec-2015

Garbage Collection Mythbusters

Simon Ritter
Java Technology Evangelist

The Goal

Cover the strengths and weaknesses of garbage collection

What GC does well
And what not so well

Tracing GC: A Refresher Course

• Tracing-based garbage collectors:
  • Discover the live objects
    • All objects transitively reachable from a set of “roots” (“roots”: known live references that exist outside the heap, e.g., thread stacks, virtual machine data)
  • Deduce that the rest are dead
  • Reclaim them
• An “indirect” GC technique
• Examples
  • Mark & Sweep
  • Mark & Compact
  • Copying
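
The discover / deduce / reclaim steps can be sketched in a few lines of Java. This is only an illustration, not how any particular VM implements it: Node and Tracer are hypothetical names, and the "heap" is just a list of objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical heap object: a name plus its outgoing references.
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

final class Tracer {
    // Discover the live objects: everything transitively reachable from the roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (marked.add(n)) {          // first visit: mark it and scan its fields
                worklist.addAll(n.refs);
            }
        }
        return marked;
    }

    // Deduce that the rest are dead: anything in the heap that was not marked.
    static List<Node> sweep(List<Node> heap, Set<Node> marked) {
        List<Node> dead = new ArrayList<>();
        for (Node n : heap) {
            if (!marked.contains(n)) dead.add(n);
        }
        return dead;
    }
}
```

Note that the collector never looks at a dead object directly during marking; it only learns that an object is dead because the trace did not reach it, which is what makes the technique "indirect".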

Tracing GC Example

[Figure: a Runtime Stack and a Heap containing objects A through M]

The Runtime Stack is considered live by default. We start tracing transitively from it and mark objects we reach.

[Figure sequence: the same Runtime Stack and Heap, repeated over several frames as the trace proceeds]

We identified all reachable objects. We can deduce that the rest are unreachable and, therefore, dead.

Alternative 1: In-place Deallocation

Keep track of the free space explicitly (using free lists, a buddy system, a bitmap, etc.).

[Figure: Runtime Stack and Heap containing only the surviving objects A, B, C, E, H, I, K and L]

Alternative 2: Sliding Compaction

Slide all live objects to one end of the heap. All free space is located at the other end of the heap.

[Figure: Runtime Stack and Heap containing the surviving objects A, B, C, E, H, I, K and L]
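
A rough Java sketch of sliding compaction, under heavy simplifying assumptions: the heap is modelled as an array of equally sized slots, a null slot is free, and SlidingCompactor is a hypothetical name. A real compactor must also update every reference to a moved object, which is left out here.

```java
final class SlidingCompactor {
    // Slides every non-null (live) slot toward index 0, preserving order.
    // Returns the index of the first free slot after compaction.
    static int compact(Object[] heap) {
        int free = 0;                       // next destination slot
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                if (i != free) {
                    heap[free] = heap[i];   // slide the live object down
                    heap[i] = null;         // its old slot becomes free
                }
                free++;
            }
        }
        return free;
    }
}
```

After the call, everything below the returned index is live and everything from it upward is one contiguous free region.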

Copying GC Example

[Figure sequence: a Runtime Stack and a heap divided into From-Space and To-Space. The heap initially holds objects A, B, C, D, E, G, H, I, K and L; the reachable ones are copied across step by step, and the final frame shows only A, B, C, E, H, I, K and L.]
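
The copying steps in the example can be sketched as follows. This is an illustrative Cheney-style loop over Java objects, not a real collector: Obj and CopyingCollector are hypothetical names, to-space is a list, and an identity map stands in for the forwarding pointers a real VM would store in object headers.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical heap object: a name plus its outgoing references.
final class Obj {
    final String name;
    final List<Obj> refs = new ArrayList<>();
    Obj(String name) { this.name = name; }
}

final class CopyingCollector {
    // Copies everything reachable from the roots into a fresh to-space and
    // returns it. The roots list is updated in place to point at the copies.
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        Map<Obj, Obj> forwarding = new IdentityHashMap<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, forward(roots.get(i), toSpace, forwarding));
        }
        // To-space doubles as the work queue: scan each copy once.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj copy = toSpace.get(scan);
            for (int f = 0; f < copy.refs.size(); f++) {
                copy.refs.set(f, forward(copy.refs.get(f), toSpace, forwarding));
            }
        }
        return toSpace;
    }

    // Returns the to-space copy of an old object, creating it on first use.
    private static Obj forward(Obj old, List<Obj> toSpace, Map<Obj, Obj> forwarding) {
        Obj copy = forwarding.get(old);
        if (copy == null) {
            copy = new Obj(old.name);
            copy.refs.addAll(old.refs);   // still old references; fixed during the scan
            forwarding.put(old, copy);
            toSpace.add(copy);
        }
        return copy;
    }
}
```

Dead objects are never touched: whatever was not copied is simply abandoned along with from-space.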

Myth 1:

malloc/free always perform better than GC.

Object Relocation

• GC enables object relocation, which in turn enables
  • Compaction: eliminates fragmentation
  • Generational GC: decreases GC overhead
  • Linear Allocation: best allocation performance
    • Fast path: ~10 native instructions, inlined, no sync

[Figure: an allocation area with used space up to top and free space from there to end; the new object is placed at top, and top moves to “new top”.]
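
Linear ("bump the pointer") allocation is simple enough to sketch directly. BumpAllocator is a hypothetical, single-threaded model over a byte array; it illustrates the top / end bookkeeping from the slide and is not the VM's actual fast path.

```java
final class BumpAllocator {
    private final byte[] heap;     // the allocation area; heap.length plays the role of "end"
    private int top = 0;           // first free byte

    BumpAllocator(int size) { heap = new byte[size]; }

    // Returns the offset of the new object, or -1 if it does not fit
    // (a real VM would take a slow path here, typically triggering a GC).
    int allocate(int size) {
        if (size < 0 || size > heap.length - top) return -1;
        int result = top;          // the new object starts at the old top
        top += size;               // "new top"
        return result;
    }

    int top() { return top; }
}
```

The whole fast path is one bounds check and one addition, which is why it needs so few instructions.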

Generational GC is Fast!

• Compare costs (first-order approximation)
  • malloc/free:
    • all_objects * cost_malloc + freed_objects * cost_free
  • Generational GC with copying young generation:
    • all_objects * cost_linear_alloc + surviving_objects * cost_copy
• Consider:
  • cost_linear_alloc much less than cost_malloc
  • surviving_objects often 5% or less of all_objects
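
To see how the two formulas compare, here they are as Java methods. GcCostModel is a hypothetical name and the unit costs in the usage note are made-up numbers chosen only to illustrate the shape of the comparison, not measurements.

```java
final class GcCostModel {
    // all_objects * cost_malloc + freed_objects * cost_free
    static double mallocFreeCost(long allObjects, long freedObjects,
                                 double costMalloc, double costFree) {
        return allObjects * costMalloc + freedObjects * costFree;
    }

    // all_objects * cost_linear_alloc + surviving_objects * cost_copy
    static double generationalCost(long allObjects, long survivingObjects,
                                   double costLinearAlloc, double costCopy) {
        return allObjects * costLinearAlloc + survivingObjects * costCopy;
    }
}
```

Assume, purely for illustration, 1,000,000 objects of which 5% survive, with cost_malloc = cost_free = 50, cost_linear_alloc = 10 and cost_copy = 200 (arbitrary units). Then malloc/free costs 1,000,000 * 50 + 950,000 * 50 = 97,500,000 and the generational GC costs 1,000,000 * 10 + 50,000 * 200 = 20,000,000. Copying a survivor is expensive, but few objects survive.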

GC vs. malloc Study

• Recent publication shows
  • When space is tight
    • malloc/free outperform GC
  • When space is ample
    • GC can match (or better) malloc/free
• GC just as fast
  • if given “breathing room”
• Matthew Hertz and Emery Berger
  • Quantifying the Performance of Garbage Collection vs. Explicit Memory Management, In Proceedings of OOPSLA 2005, October 2005

Object Relocation: Other Benefits

• Compaction: can improve page locality
  • Fewer TLB misses
  • Cluster objects to improve locality
    • Important on NUMA architectures
• Relocation ordering: can improve cache locality
  • Fewer cache misses
• The important points:
  • Allocation and reclamation are fast
  • Relocation can boost application performance

Myth 1:

malloc/free always perform better than GC.

Busted!

Myth 2:

Reference counting would solve all my GC problems.

Reference Counting

• Each object holds a count
  • How many references point to it
  • Increment it when a new reference points to the object
  • Decrement it when a reference to the object is dropped
• When reference count reaches 0
  • Object is unreachable
  • It can be reclaimed
• A “direct” GC technique
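
A minimal Java sketch of these rules. RcObject is a hypothetical class that keeps its own count; "reclaiming" is modelled by setting a flag, and dropping the references held by a reclaimed object cascades to the objects it pointed to.

```java
import java.util.ArrayList;
import java.util.List;

final class RcObject {
    final String name;
    int refCount = 0;                 // how many references point to this object
    boolean reclaimed = false;
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) { this.name = name; }

    // A new reference now points to target.
    static void addRef(RcObject target) { target.refCount++; }

    // A reference to target was dropped. At zero the object is unreachable:
    // reclaim it and drop the references it was holding.
    static void dropRef(RcObject target) {
        if (--target.refCount == 0) {
            target.reclaimed = true;
            for (RcObject child : target.fields) dropRef(child);
            target.fields.clear();
        }
    }

    // Stores a reference from this object to target.
    void link(RcObject target) { fields.add(target); addRef(target); }
}
```

Unlike tracing, the collector learns that a specific object is dead at the moment its count hits 0, which is why the technique is called "direct".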

Reference Counting Example

[Figure sequence: a Runtime Stack and a Heap containing objects A, B, C, E, F, H, I, J, K, L and M, each annotated with its reference count. The captions on the frames are, in order:
  • Delete reference, decrease H's RC
  • Decrease K's & L's RCs, reclaim H
  • Reclaim K
The final frame shows A, B, C, E, F, I, J, L and M remaining.]

Traditional Reference Counting

• Extra space overhead
  • One reference count per object
• Extra time overhead
  • Up to two reference count updates per reference field update
  • Very expensive in a multi-threaded environment
• Non-moving
  • Fragmentation
• Not always incremental or prompt
• Garbage cycles
  • Counts never reach 0
  • Cannot be reclaimed

Reference Counting Example

Objects F and J form a garbage cycle and also retain M.

[Figure: Runtime Stack and Heap containing objects A, B, C, E, F, I, J, L and M with their reference counts]
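
The F / J / M situation can be reproduced with a small Java sketch. Counted and CycleDemo are hypothetical names; the external reference to F stands in for whatever pointed at it before it became garbage.

```java
import java.util.ArrayList;
import java.util.List;

final class Counted {
    final String name;
    int refCount = 0;
    final List<Counted> fields = new ArrayList<>();
    Counted(String name) { this.name = name; }

    // Stores a reference from this object to target.
    void pointTo(Counted target) { fields.add(target); target.refCount++; }

    // Drops one reference to this object; cascades only if the count hits 0.
    void release() {
        if (--refCount == 0) {
            for (Counted child : fields) child.release();
            fields.clear();
        }
    }
}

final class CycleDemo {
    // Builds F <-> J with J -> M, drops the only external reference to F,
    // and returns the counts of F, J and M afterwards.
    static int[] run() {
        Counted f = new Counted("F");
        Counted j = new Counted("J");
        Counted m = new Counted("M");
        f.refCount++;        // external reference, e.g. from the stack
        f.pointTo(j);
        j.pointTo(f);
        j.pointTo(m);
        f.release();         // the external reference goes away
        return new int[] { f.refCount, j.refCount, m.refCount };
    }
}
```

All three counts stay at 1 even though nothing outside the cycle can reach F, J or M, so pure reference counting never reclaims them.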

Advanced Reference Counting

• Two-bit reference counts
  • Most objects pointed to by one or two references
  • When max count (3) is reached
    • Object becomes “sticky”
• Buffer reference updates
  • Apply them in bulk
• Combine with copying GC
• Use a backup GC algorithm
  • Handle cyclic garbage
  • Deal with “sticky” objects
  • Typically, the cyclic GC is a tracing GC
• Complex, and still non-moving
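
A sketch of a two-bit "sticky" count in Java. StickyCount is a hypothetical class; a real implementation would pack the two bits into the object header rather than use an int field.

```java
final class StickyCount {
    static final int MAX = 3;      // largest value that fits in two bits
    private int bits = 0;

    // Saturates at MAX: from then on the object is "sticky".
    void increment() { if (bits < MAX) bits++; }

    // Once sticky, the true count is unknown, so decrements are ignored
    // and only the backup collector can reclaim the object.
    void decrement() { if (bits != MAX && bits > 0) bits--; }

    boolean isSticky() { return bits == MAX; }
    boolean isZero() { return bits == 0; }
    int value() { return bits; }
}
```

The trade-off is less space per object in exchange for giving up on the few objects that attract many references.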

Myth 2:

Reference counting would solve all my GC problems.

Busted!

Myth 3:

GC with explicit deallocation would drastically improve performance.

GC with Explicit Deallocation?

• Philosophically
  • Would compromise safety
• Practically
  • Not all GC algorithms can support it
  • Mark-Compact & Copying GCs
    • Do not maintain free lists
    • Reclaim space by moving live objects
      • Overwrite reclaimed objects
    • No way to reuse space from a single object
      • Unless the object is at the end of the heap

GC with Explicit Deallocation? (ii)

• Explicit deallocation is incompatible with this model
  • Would compromise the very fast allocation path
• GCs have a different reclamation pattern
  • Reclaim objects in bulk
  • Free-space management is optimized for that
• Also applies to static analysis techniques
  • They can prove that an object can be safely deallocated
  • …but there is no mechanism to do the deallocation!

How to deallocate?

• How can we deallocate the dead object when we only maintain top?

[Figure: a linearly allocated area with used space up to top and free space from top to end; a dead object lies inside the used space.]

Myth 3:

GC with explicit deallocation would drastically improve performance.

Busted!

Myth 4:

Finalizers can (and should) be called as soon as objects become unreachable.

Finalizers

• Typical use of Finalizers:
  • Reclaim external resources associated with objects in heap
    • e.g., native GUI components (windows, color maps, etc.)
• Finalizers are called on objects that GC has found to be garbage
• Tracing GC
  • Does not always have liveness information for every object
  • Liveness information up to date only at certain points
    • Immediately after a tracing cycle
  • Must finish a tracing cycle to find finalizable objects

Finalization Reality Check

• Finalizers are not like C++ destructors
  • No guarantees
    • When they will be run
    • Which thread will run them
    • Which will run first, second, … last
    • Or that they will be run at all!
• If you want prompt external resource reclamation
  • Don't rely on finalizers
  • Dispose explicitly instead
  • Use finalization as a safety net
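
One way to follow this advice in current Java is sketched below. It is an assumption on my part rather than something from the talk: it uses AutoCloseable for explicit disposal and java.lang.ref.Cleaner (Java 9 and later) in place of a finalizer as the safety net. NativeWindow and Handle are hypothetical names, and the "external resource" is just a flag.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

final class NativeWindow implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // State the cleanup needs. It must not reference the NativeWindow itself,
    // otherwise the window could never become unreachable.
    private static final class Handle implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);
        @Override public void run() {
            released.set(true);        // free the external resource here
        }
    }

    private final Handle handle = new Handle();
    // Safety net: runs Handle.run() some time after the window becomes
    // unreachable, with no guarantee about when.
    private final Cleaner.Cleanable cleanable = CLEANER.register(this, handle);

    boolean isReleased() { return handle.released.get(); }

    // Prompt, explicit disposal. The cleaning action runs at most once.
    @Override public void close() { cleanable.clean(); }
}
```

Typical use is try-with-resources, for example try (NativeWindow w = new NativeWindow()) { ... }, so the resource is released when the block exits rather than whenever the collector gets around to it.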

Myth 4:

Finalizers can (and should) be called as soon as objects become unreachable.

Busted!

Myth 5:

Garbage collection eliminates all memory leaks.

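Unused Reachable Objects

• Consider the following code:

    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;

    // Image stands for any image type; readImage() is a helper not shown here.
    class ImageMap {
        private final Map<File, Image> map = new HashMap<>();
        public void add(File file, Image img) { map.put(file, img); }
        public Image get(File file) { return map.get(file); }
        public void remove(File file) { map.remove(file); }
    }

    static ImageMap imageMap = new ImageMap();
    …
    File f = new File(imageFileName);
    Image img = readImage(f);
    imageMap.add(f, img);
    f = null;   // the key is dropped, but the map still holds the entry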

GC and Memory Leaks

• Consider the (f, img) tuple in the previous example
  • After we null f, the tuple is unused
    • We cannot retrieve it (don't have the key any more)
    • We cannot remove it (don't have the key any more)
  • So (f, img) will take up space while imageMap is alive
    • … without the application being able to access it
• GC reclaims unreachable objects
  • But not unused objects that are reachable
  • And it cannot know when a reachable object is unused
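
The effect is easy to reproduce. In this sketch LeakDemo is a hypothetical class, and the key is a plain Object (an assumption made here so that no equal key can be recreated later, since Object compares by identity).

```java
import java.util.HashMap;
import java.util.Map;

final class LeakDemo {
    // Long-lived map, playing the role of imageMap.
    static final Map<Object, byte[]> CACHE = new HashMap<>();

    // Adds an entry and then forgets the key, like "f = null" in the example.
    static void addAndForgetKey() {
        Object key = new Object();
        CACHE.put(key, new byte[1024]);
        key = null;            // unused from now on, but still reachable via CACHE
    }

    // The fix is ordinary bookkeeping: remove the entry while the key is still known.
    static void addThenRemove() {
        Object key = new Object();
        CACHE.put(key, new byte[1024]);
        CACHE.remove(key);
    }
}
```

After addAndForgetKey() the entry stays in the map no matter how many collections run, because the map itself keeps both key and value reachable.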

GC and Memory Leaks (ii)

• Effort required to track down such leaks
  • Can't override malloc anymore
• Tools are needed to help
  • Heap population statistics (what is being retained?)
  • Reachability information (why is it being retained?)

Myth 5:

Garbage collection eliminates all memory leaks.

Busted!

Myth 6:

I can get a GC that delivers very high throughput and very low latency.

Throughput vs. Latency

• For most applications, GC overhead is small
  • 2% – 5%
• Throughput GCs
  • Move most work to GC pauses
  • Application threads do as little as possible
  • Least overall GC overhead
• Low-latency GCs
  • Move work out of GC pauses
  • Application threads do more work
    • Bookkeeping for GC more expensive
  • More overall GC overhead
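
A rough way to check the overhead of your own application is to read the standard java.lang.management beans. GcOverhead is a hypothetical helper; the collection times the JVM reports are approximate, and for concurrent collectors they do not necessarily correspond to pause time, so treat the result as an estimate only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    // Fraction of elapsed time attributed to GC.
    static double fraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    // Sum of the accumulated collection time over all collectors in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 if undefined for this collector
            if (t > 0) total += t;
        }
        return total;
    }

    // Overhead of the running JVM since it started.
    static double current() {
        return fraction(totalGcMillis(), ManagementFactory.getRuntimeMXBean().getUptime());
    }
}
```

For example, 50 ms of reported collection time over 1,000 ms of uptime gives a fraction of 0.05, the upper end of the range quoted above.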

Throughput vs. Latency (ii)

• Goals are conflicting
  • GCs are architected differently
  • One GC does not rule them all
  • Must choose the best GC for the job
• Also consider another dimension:
  • Footprint
• Why can't the VM choose the right GC?
  • Impossible to know application priorities
  • Hints may help
    • …but for now, a human must decide

Myth 6:

I can get a GC that delivers very high throughput and very low latency.

Busted!

Myth 7:

I need to disable GC in critical sections of my code.

Why Disable GC?

• Application has a critical deadline
  • Display a video frame
  • Complete a stock trade
  • Adjust nuclear reactor control rods
• GC pause may cause deadline to be missed
  • Jittery video, missed profit, boom!
• So simply ...
  • Disable GC
  • Run without interruptions
  • Voilà! Meet your deadline

Coding Without GC

• GC typically occurs because heap is full / nearly full
  • No GC → no allocation
  • Code in critical section cannot allocate safely
• Possible solution: Allocate in advance
  • Only access pre-allocated objects
  • Prohibit allocations during the critical section
    • How? Throw exception? Code analysis?
• Must know exactly what data is needed
  • Before entering critical section
  • Must audit every change to critical section
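
The "allocate in advance" idea looks roughly like this in Java. BufferPool is a hypothetical, single-threaded sketch: every buffer is created up front, and the critical section only borrows and returns them. It relies on ArrayDeque being array-backed, so push and poll should not allocate as long as no more buffers are returned than were handed out.

```java
import java.util.ArrayDeque;

final class BufferPool {
    private final ArrayDeque<byte[]> free;

    // All allocation happens here, before the critical section starts.
    BufferPool(int count, int bufferSize) {
        free = new ArrayDeque<>(count);
        for (int i = 0; i < count; i++) free.push(new byte[bufferSize]);
    }

    // Hands out a pre-allocated buffer, or null when the pool is exhausted.
    byte[] acquire() { return free.poll(); }

    // Returns a buffer to the pool for reuse.
    void release(byte[] buffer) { free.push(buffer); }

    int available() { return free.size(); }
}
```

The sketch also shows the cost named above: the pool size has to be known before the critical section is entered, and running out leaves the caller with null instead of a fresh object.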

Using Libraries

• Libraries freely allocate objects
  • And with good reason
    • Clear programming model
    • Lack of side-effects good for concurrency
• Can you always avoid using them in critical sections?
  • Concatenate two strings
  • Use the concurrency libraries
  • Add a new element to a collection
  • etc.

A Few More Problems

• Other threads
  • Cannot allocate either
  • Stop them all?
    • What if they hold a lock you need? → deadlock
• Overlapping critical sections
  • Can never do GC!!!
• Abuse
  • Long / unpredictable critical sections
    • blocking I/O, waiting for a lock, etc.
  • Libraries that have critical sections
    • We have seen many libraries that call System.gc()
  • -XX:+DisableCriticalSections?

The Bottom Line

• It might work in very few, limited cases
  • Not a general-purpose solution
    • Too many ways to shoot yourself in the foot
• You should really consider looking at the RTSJ
  • Real-Time Specification for Java
  • But it also has a lot of the same problems we just described

Myth 7:

I need to disable GC in critical sections of my code.

Busted!

Myth 8:

GC settings that worked for my last app will also work for my next app.

What Affects GC Performance?

• Application behavior
  • Allocation rate
    • Higher allocation rate → more frequent GCs
  • Live data size
    • More live data → longer tracing cycles
  • Mutation rate
    • Higher mutation rate → more load on the write barriers, hence more load on incremental GCs
• Hardware
  • Number of cores, clock rate, total RAM, cache sizes
• GC tuning parameters

What Affects GC Performance? (ii)

• Primary factors
  • App behavior, hardware, tuning parameters
• Keep those factors constant
  • GC should have consistent performance
• Change any of them …
  • Examples:
    • Increase object lifetimes → increase GC time and/or frequency
    • Increase average object size → increase copying costs
    • Move to faster hardware → allocate more objects
  • GC performance will change, too

GC Tuning Parameters

• Yet, we often see ...
  • Customers move tuning parameters from one app to another
• Transferring parameters
  • Mixed results at best
  • Sometimes it works
    • Maybe when defaults are really bad!
  • Usually it doesn't
    • Mostly luck
  • Performance often left on the table
• If applications are very similar
  • e.g., version 2 of the same app
  • Use previous tuning parameters only as a starting point

Guaranteed GC Performance?

• In theory: yes
  • Real-Time GCs are available
  • But they require strict bounds on application characteristics
• Modern applications
  • Very large, very complex, very dynamic
  • Virtually impossible to analyze them
    • At least, to get realistic bounds
  • At best, approximations based on testing
• Realistically: no hard real-time guarantees
  • For non-trivial applications
  • Soft real-time at best…

Myth 8:

GC settings that worked for my last app will also work for my next app.

Busted!

Myth 9:

This talk is over.

Confirmed!

The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.