Garbage Collection Mythbusters: Simon Ritter, Java Technology Evangelist
TRANSCRIPT
The Goal
Cover the strengths and weaknesses of garbage collection
What GC does well, and what it does not do so well
Tracing GC: A Refresher Course
• Tracing-based garbage collectors:
  • Discover the live objects
    • All objects transitively reachable from a set of “roots” (“roots”: known live references that exist outside the heap, e.g., thread stacks, virtual machine data)
  • Deduce that the rest are dead
  • Reclaim them
• An “indirect” GC technique
• Examples
  • Mark & Sweep
  • Mark & Compact
  • Copying
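The discover/deduce steps above can be sketched in a few lines of Java. This is a minimal illustration, not how any particular JVM implements marking: the `Node` class, the explicit heap list, and the method names are all hypothetical stand-ins.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical model of a heap object: a name plus outgoing references.
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

final class TracingSketch {
    // Mark phase: everything transitively reachable from the roots is live.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {            // first visit: mark it, then scan its fields
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    // Whatever was not marked is deduced to be dead.
    static List<Node> dead(List<Node> heap, Set<Node> live) {
        List<Node> garbage = new ArrayList<>();
        for (Node n : heap) {
            if (!live.contains(n)) garbage.add(n);
        }
        return garbage;
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        a.refs.add(b);                       // A -> B
        c.refs.add(d);                       // C -> D, but nothing reaches C
        d.refs.add(c);                       // D -> C: a cycle that is still unreachable
        List<Node> heap = List.of(a, b, c, d);
        Set<Node> live = mark(List.of(a));   // the only root points at A
        for (Node n : dead(heap, live)) System.out.println("dead: " + n.name);
    }
}
```

Note that the unreachable cycle C/D is found dead without any special handling, because liveness is decided from the roots rather than from per-object counts.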
Tracing GC Example
The runtime stack is considered live by default. We start tracing transitively from it and mark the objects we reach.
[Figure: heap containing objects A through M, with the runtime stack as the root]
Tracing GC Example
We have identified all reachable objects. We can deduce that the rest are unreachable and, therefore, dead.
[Figure: the same heap of objects A through M, with the runtime stack]
Alternative 1: In-place Deallocation
Keep track of the free space explicitly (using free lists, a buddy system, a bitmap, etc.).
[Figure: heap with the remaining objects A, B, C, E, H, I, K, L left in place, and the runtime stack]
Alternative 2: Sliding Compaction
Slide all live objects to one end of the heap. All free space is located at the other end of the heap.
[Figure: heap with objects A, B, C, E, H, I, K, L, and the runtime stack]
Copying GC Example
[Figure: runtime stack, Heap (From-Space), and Heap (To-Space)]
Copying GC Example
[Figure: runtime stack, Heap (From-Space), and Heap (To-Space), at a later step of the copy]
Object Relocation
• GC enables object relocation, which in turn enables
  • Compaction: eliminates fragmentation
  • Generational GC: decreases GC overhead
  • Linear Allocation: best allocation performance
    • Fast path: ~10 native instructions, inlined, no sync
[Figure: heap region showing used space, top, the new object, new top, free space, and end]
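The linear allocation fast path amounts to bumping a pointer. The sketch below models it over a byte array; `BumpAllocator` and its methods are hypothetical names, and it is single-threaded on the assumption that each thread would own its own region (which is what makes "no sync" possible).

```java
// Hypothetical bump-pointer allocator over a fixed region; illustrative only.
final class BumpAllocator {
    private final byte[] region;   // stands in for a contiguous heap area; its length is "end"
    private int top = 0;           // next free offset; everything below it is used space

    BumpAllocator(int sizeBytes) { region = new byte[sizeBytes]; }

    // Returns the start offset of the new object, or -1 if it does not fit.
    // A real collector would start a GC at that point instead of failing.
    int allocate(int sizeBytes) {
        int newTop = top + sizeBytes;
        if (newTop > region.length) return -1;
        int objectStart = top;
        top = newTop;              // the whole fast path: one add, one compare, one store
        return objectStart;
    }

    int used() { return top; }
    int free() { return region.length - top; }

    public static void main(String[] args) {
        BumpAllocator heap = new BumpAllocator(64);
        System.out.println("first object at " + heap.allocate(16));
        System.out.println("second object at " + heap.allocate(8));
        System.out.println("free bytes: " + heap.free());
    }
}
```

There is deliberately no `free(offset)` method: the only state is `top`, so space can only be recovered in bulk, for example by evacuating the live objects and resetting `top`.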
Generational GC is Fast!
• Compare costs (first-order approximation)
  • malloc/free:
    • all_objects * cost_malloc + freed_objects * cost_free
  • Generational GC with copying young generation:
    • all_objects * cost_linear_alloc + surviving_objects * cost_copy
• Consider:
  • cost_linear_alloc much less than cost_malloc
  • surviving_objects often 5% or less of all_objects
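One way to read the two expressions above is per allocated object. Dividing both by all_objects gives the comparison below; the shorthand s and f is introduced here for illustration and does not appear in the slides.

```latex
% s = surviving_objects / all_objects,   f = freed_objects / all_objects
\begin{align*}
\text{cost}_{\text{malloc/free}} &= \text{cost}_{\text{malloc}} + f \cdot \text{cost}_{\text{free}} \\
\text{cost}_{\text{gen}} &= \text{cost}_{\text{linear\_alloc}} + s \cdot \text{cost}_{\text{copy}} \\
\text{cost}_{\text{gen}} < \text{cost}_{\text{malloc/free}}
  &\iff s < \frac{\text{cost}_{\text{malloc}} + f \cdot \text{cost}_{\text{free}} - \text{cost}_{\text{linear\_alloc}}}{\text{cost}_{\text{copy}}}
\end{align*}
```

So even if copying one object costs far more than allocating one, the generational scheme comes out ahead as long as the surviving fraction s stays below that ratio, which is the situation the "5% or less" bullet describes.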
GC vs. malloc Study
• Recent publication shows
  • When space is tight
    • malloc/free outperform GC
  • When space is ample
    • GC can match (or better) malloc/free
• GC just as fast
  • if given “breathing room”
• Matthew Hertz and Emery Berger
  • Quantifying the Performance of Garbage Collection vs. Explicit Memory Management, in Proceedings of OOPSLA 2005, October 2005
Object Relocation: Other Benefits
• Compaction: can improve page locality
  • Fewer TLB misses
  • Cluster objects to improve locality
    • Important on NUMA architectures
• Relocation ordering: can improve cache locality
  • Fewer cache misses
• The important points:
  • Allocation and reclamation are fast
  • Relocation can boost application performance
Reference Counting
• Each object holds a count
  • How many references point to it
  • Increment it when a new reference points to the object
  • Decrement it when a reference to the object is dropped
• When reference count reaches 0
  • Object is unreachable
  • It can be reclaimed
• A “direct” GC technique
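The increment/decrement rules above can be sketched as follows. `RcObject`, `retain`, `release`, and `pointTo` are hypothetical names chosen for this illustration; the JVM does not manage Java objects this way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reference-counted object, for illustration only.
final class RcObject {
    final String name;
    final List<RcObject> fields = new ArrayList<>();
    int count = 0;
    boolean reclaimed = false;

    RcObject(String name) { this.name = name; }

    // A new reference now points at this object.
    void retain() { count++; }

    // A reference to this object was dropped.
    void release() {
        if (--count == 0) {
            reclaimed = true;                  // count reached 0: unreachable
            for (RcObject child : fields) {
                child.release();               // its outgoing references die with it
            }
            fields.clear();
        }
    }

    // Store a reference to target in one of this object's fields.
    void pointTo(RcObject target) {
        fields.add(target);
        target.retain();
    }
}

final class RcDemo {
    public static void main(String[] args) {
        RcObject h = new RcObject("H"), k = new RcObject("K"), l = new RcObject("L");
        h.retain();      // one reference to H from elsewhere
        k.retain();      // K is also referenced from elsewhere
        h.pointTo(k);    // K's count becomes 2
        h.pointTo(l);    // L's count becomes 1
        h.release();     // delete the only reference to H
        System.out.println("H reclaimed: " + h.reclaimed);
        System.out.println("K count: " + k.count + ", L reclaimed: " + l.reclaimed);
    }
}
```

Dropping the last reference to H reclaims it immediately and cascades to its fields: L is reclaimed as well, while K survives with a count of 1.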
Reference Counting Example
Delete reference, decrease H's RC.
[Figure: heap objects A, B, C, E, F, H, I, J, K, L, M with their reference counts, and the runtime stack]
Reference Counting Example
Decrease K's and L's RCs, reclaim H.
[Figure: the same heap after H's reference count has dropped to 0]
Traditional Reference Counting
• Extra space overhead
  • One reference count per object
• Extra time overhead
  • Up to two reference count updates per reference field update
  • Very expensive in a multi-threaded environment
• Non-moving
  • Fragmentation
• Not always incremental or prompt
  • Garbage cycles
    • Counts never reach 0
    • Cannot be reclaimed
Reference Counting Example
Objects F and J form a garbage cycle and also retain M.
[Figure: heap objects A, B, C, E, F, I, J, L, M with their reference counts, and the runtime stack]
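A small sketch of why such a cycle is never reclaimed. The exact reference pattern between F, J, and M is an assumption made for this example (F and J point at each other, J points at M), and `CycleNode` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal hypothetical counted node, for illustration only.
final class CycleNode {
    final String name;
    final List<CycleNode> fields = new ArrayList<>();
    int count = 0;

    CycleNode(String name) { this.name = name; }

    void pointTo(CycleNode target) { fields.add(target); target.count++; }

    // Drop one incoming reference; cascade only if the count reaches 0.
    void release() {
        if (--count == 0) {
            for (CycleNode child : fields) child.release();
            fields.clear();
        }
    }
}

final class CycleDemo {
    // Returns the counts of F, J, M after the last outside reference is gone.
    static int[] countsAfterDroppingLastExternalRef() {
        CycleNode f = new CycleNode("F"), j = new CycleNode("J"), m = new CycleNode("M");
        f.count++;        // one reference to F from outside the cycle
        f.pointTo(j);     // F -> J
        j.pointTo(f);     // J -> F closes the cycle
        j.pointTo(m);     // J -> M: M is retained by the cycle
        f.release();      // drop the outside reference
        return new int[] { f.count, j.count, m.count };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countsAfterDroppingLastExternalRef()));
    }
}
```

All three counts stay at 1 even though nothing outside the cycle can reach F, J, or M, so a pure reference-counting scheme never frees them.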
Advanced Reference Counting
• Two-bit reference counts
  • Most objects pointed to by one or two references
  • When max count (3) is reached
    • Object becomes “sticky”
• Buffer reference updates
  • Apply them in bulk
• Combine with copying GC
  • Use a backup GC algorithm
    • Handle cyclic garbage
    • Deal with “sticky” objects
    • Typically, the cyclic GC is a tracing GC
• Complex, and still non-moving
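A two-bit "sticky" count can be sketched as a saturating counter. `StickyCount` is a hypothetical class for illustration; a real implementation would pack the two bits into the object header rather than use an `int`.

```java
// Hypothetical two-bit saturating reference count, for illustration only.
final class StickyCount {
    static final int MAX = 3;      // largest value that fits in two bits
    private int bits = 0;

    // Saturates at MAX: once there, the exact count is lost.
    void increment() {
        if (bits < MAX) bits++;
    }

    // Returns true when the count drops to 0 and the object may be reclaimed.
    boolean decrement() {
        if (bits == MAX) return false;   // sticky: leave it for the backup GC
        if (bits == 0) throw new IllegalStateException("count is already 0");
        return --bits == 0;
    }

    boolean isSticky() { return bits == MAX; }

    public static void main(String[] args) {
        StickyCount c = new StickyCount();
        c.increment();
        c.increment();
        c.increment();                   // third reference: the count is now stuck
        System.out.println("sticky: " + c.isSticky());
        System.out.println("reclaimable after decrement: " + c.decrement());
    }
}
```

Once an object is sticky, decrements are ignored, so only the backup (typically tracing) collector can decide whether it is dead.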
GC with Explicit Deallocation?
• Philosophically
  • Would compromise safety
• Practically
  • Not all GC algorithms can support it
  • Mark-Compact & Copying GCs
    • Do not maintain free lists
    • Reclaim space by moving live objects
      • Overwrite reclaimed objects
    • No way to reuse space from a single object
      • Unless the object is at the end of the heap
GC with Explicit Deallocation? (ii)
• Explicit deallocation is incompatible with this model
  • Would compromise the very fast allocation path
• GCs have a different reclamation pattern
  • Reclaim objects in bulk
  • Free-space management is optimized for that
• Also applies to static analysis techniques
  • They can prove that an object can be safely deallocated
  • …but there is no mechanism to do the deallocation!
How to deallocate?
• How can we deallocate the dead object when we only maintain top?
[Figure: heap showing used space, a dead object, top, free space, and end]
Finalizers
• Typical use of finalizers:
  • Reclaim external resources associated with objects in heap
    • e.g., native GUI components (windows, color maps, etc.)
• Finalizers are called on objects that GC has found to be garbage
• Tracing GC
  • Does not always have liveness information for every object
  • Liveness information up to date only at certain points
    • Immediately after a tracing cycle
  • Must finish a tracing cycle to find finalizable objects
Finalization Reality Check
• Finalizers are not like C++ destructors
  • No guarantees
    • When they will be run
    • Which thread will run them
    • Which will run first, second, … last
    • Or that they will be run at all!
• If you want prompt external resource reclamation
  • Don't rely on finalizers
  • Dispose explicitly instead
  • Use finalization as a safety net
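One common way to "dispose explicitly" in Java is `AutoCloseable` with try-with-resources (Java 7 and later). The slides do not prescribe this idiom; `NativeWindow` is a hypothetical wrapper, and the native release is only simulated with a flag.

```java
// Hypothetical wrapper around an external resource such as a native window handle.
final class NativeWindow implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() { return open; }

    void draw() {
        if (!open) throw new IllegalStateException("window already disposed");
        // ... would call into native code here ...
    }

    // Explicit, prompt disposal; safe to call more than once.
    @Override
    public void close() {
        if (open) {
            open = false;
            // ... would release the native handle here ...
        }
    }
}

final class DisposeDemo {
    public static void main(String[] args) {
        NativeWindow leaked;
        try (NativeWindow w = new NativeWindow()) {
            w.draw();
            leaked = w;
        }   // close() runs here, at a known point and on this thread
        System.out.println("still open after the block: " + leaked.isOpen());
    }
}
```

Unlike a finalizer, `close()` runs at a known point, on a known thread, and in a known order when several resources are declared in the same `try`.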
Unused Reachable Objects
• Consider the following code:
class ImageMap {
    // Initialized here so that add() does not fail on a null map
    private final Map<File, Image> map = new HashMap<File, Image>();
    public void add(File file, Image img) { map.put(file, img); }
    public Image get(File file) { return map.get(file); }
    public void remove(File file) { map.remove(file); }
}
static ImageMap imageMap;
…
File f = new File(imageFileName);
Image img = readImage(f);
imageMap.add(f, img);
f = null;
GC and Memory Leaks
• Consider the (f, img) tuple in the previous example
  • After we null f, the tuple is unused
    • We cannot retrieve it (don't have the key any more)
    • We cannot remove it (don't have the key any more)
• So (f, img) will take up space while imageMap is alive
  • … without the application being able to access it
• GC reclaims unreachable objects
  • But not unused objects that are reachable
  • And it cannot know when a reachable object is unused
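The effect is easy to reproduce. In this sketch a plain `Object` key (identity-based `equals()`) and a byte array stand in for the file and image; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class UnusedReachableDemo {
    // Long-lived map, standing in for a static cache such as imageMap.
    static final Map<Object, byte[]> CACHE = new HashMap<>();

    // Adds one entry, drops the only copy of its key, and reports the map size.
    static int addThenForgetKey() {
        Object key = new Object();         // hypothetical key with identity-based equals()
        CACHE.put(key, new byte[1024]);    // stand-in for the image data
        key = null;                        // the application can no longer name this entry
        System.gc();                       // only a request; either way the entry stays
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println("entries retained: " + addThenForgetKey());
        }
    }
}
```

The map still holds a strong reference to every key and value, so the entries remain reachable and the count only grows, no matter how often the collector runs.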
GC and Memory Leaks (ii)
• Effort required to track down such leaks
  • Can't override malloc anymore
• Tools are needed to help
  • Heap population statistics (what is being retained?)
  • Reachability information (why is it being retained?)
Throughput vs. Latency
• For most applications, GC overhead is small
  • 2% – 5%
• Throughput GCs
  • Move most work to GC pauses
  • Application threads do as little as possible
  • Least overall GC overhead
• Low-latency GCs
  • Move work out of GC pauses
  • Application threads do more work
    • Bookkeeping for GC more expensive
  • More overall GC overhead
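A rough way to check the overhead figure for a running application is the standard `java.lang.management` API. This is a sketch under one assumption worth stating: `getCollectionTime()` is an approximate accumulated elapsed time, and for concurrent collectors it is not the same thing as pause time. The `GcOverhead` class name is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {
    // Accumulated collection time in milliseconds across all collectors.
    // A collector may report -1 ("undefined"); such values are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) total += t;
        }
        return total;
    }

    // Collection time as a percentage of JVM uptime so far.
    static double overheadPercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.printf("GC overhead so far: %.2f%%%n", overheadPercent());
    }
}
```

The collector names printed also show which GC the VM actually selected, which is useful when comparing a throughput collector against a low-latency one on the same workload.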
Throughput vs. Latency (ii)
• Goals are conflicting
  • GCs are architected differently
  • One GC does not rule them all
  • Must choose the best GC for the job
• Also consider another dimension:
  • Footprint
• Why can't the VM choose the right GC?
  • Impossible to know application priorities
  • Hints may help
    • …but for now, a human must decide
Why Disable GC?
• Application has a critical deadline
  • Display a video frame
  • Complete a stock trade
  • Adjust nuclear reactor control rods
• GC pause may cause deadline to be missed
  • Jittery video, missed profit, boom!
• So simply ...
  • Disable GC
  • Run without interruptions
  • Voilà! Meet your deadline
Coding Without GC
• GC typically occurs because heap is full / nearly full
  • No GC → no allocation
  • Code in critical section cannot allocate safely
• Possible solution: Allocate in advance
  • Only access pre-allocated objects
  • Prohibit allocations during the critical section
    • How? Throw exception? Code analysis?
• Must know exactly what data is needed
  • Before entering critical section
  • Must audit every change to critical section
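"Allocate in advance" usually means some form of object pool. The sketch below is one hypothetical shape for it (`BufferPool` is not a library class): every buffer is created up front, and the critical section may only borrow and return them.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical fixed-size pool: all buffers are allocated before the critical section.
final class BufferPool {
    private final Deque<byte[]> free;

    BufferPool(int count, int bufferSize) {
        free = new ArrayDeque<>(count);            // sized up front so push() need not grow it
        for (int i = 0; i < count; i++) {
            free.push(new byte[bufferSize]);
        }
    }

    // Borrow a pre-allocated buffer; never allocates a new one.
    byte[] acquire() {
        byte[] buffer = free.poll();
        if (buffer == null) {
            // Note: constructing this exception is itself an allocation.
            throw new IllegalStateException("pool exhausted inside critical section");
        }
        return buffer;
    }

    void release(byte[] buffer) { free.push(buffer); }

    int available() { return free.size(); }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2, 4096);  // sized before the critical section
        byte[] frame = pool.acquire();              // critical section: borrow only
        frame[0] = 1;
        pool.release(frame);
        System.out.println("buffers available: " + pool.available());
    }
}
```

The pool only works if its size is known exactly beforehand, and it does nothing about allocations hidden inside library calls, which is the audit burden the slide points at.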
Using Libraries
• Libraries freely allocate objects
  • And with good reason
    • Clear programming model
    • Lack of side-effects good for concurrency
• Can you always avoid using them in critical sections?
  • Concatenate two strings
  • Use the concurrency libraries
  • Add a new element to a collection
  • etc.
A Few More Problems
• Other threads
  • Cannot allocate either
  • Stop them all?
    • What if they hold a lock you need? → deadlock
  • Overlapping critical sections
    • Can never do GC!!!
• Abuse
  • Long / unpredictable critical sections
    • blocking I/O, waiting for a lock, etc.
  • Libraries that have critical sections
    • We have seen many libraries that call System.gc()
    • -XX:+DisableCriticalSections?
The Bottom Line
• It might work in very few, limited cases
  • Not a general-purpose solution
    • Too many ways to shoot yourself in the foot
• You should really consider looking at the RTSJ
  • Real-Time Specification for Java
  • But it also has a lot of the same problems we just described
What Affects GC Performance?
• Application behavior
  • Allocation rate
    • Higher allocation rate → more frequent GCs
  • Live data size
    • More live data → longer tracing cycles
  • Mutation rate
    • Higher mutation rate → more load on the write barriers, hence more load on incremental GCs
• Hardware
  • Number of cores, clock rate, total RAM, cache sizes
• GC tuning parameters
What Affects GC Performance? (ii)
• Primary factors
  • App behavior, hardware, tuning parameters
• Keep those factors constant
  • GC should have consistent performance
• Change any of them …
  • Examples:
    • Increase object lifetimes → increase GC time and/or frequency
    • Increase average object size → increase copying costs
    • Move to faster hardware → allocate more objects
• GC performance will change, too
GC Tuning Parameters
• Yet, we often see ...
  • Customers move tuning parameters from one app to another
• Transferring parameters
  • Mixed results at best
  • Sometimes it works
    • Maybe when defaults are really bad!
  • Usually it doesn't
    • Mostly luck
  • Performance often left on the table
• If applications are very similar
  • e.g., version 2 of the same app
  • Use previous tuning parameters only as a starting point
Guaranteed GC Performance?
• In theory: yes
  • Real-Time GCs are available
  • But they require strict bounds on application characteristics
• Modern applications
  • Very large, very complex, very dynamic
  • Virtually impossible to analyze them
    • At least, to get realistic bounds
  • At best, approximations based on testing
• Realistically: no hard real-time guarantees
  • For non-trivial applications
  • Soft real-time at best…
The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.