Compilation 2007

Garbage Collection

Michael I. Schwartzbach

BRICS, University of Aarhus


Post on 21-Dec-2015


The Garbage Collector

A garbage collector is part of the runtime system. It reclaims heap-allocated records (objects) that are no longer in use.

A garbage collector should:
• reclaim all unused records
• spend very little time per record
• not cause significant delays
• allow all of memory to be used

These are difficult and conflicting requirements.

Life Without Garbage Collection

Unused records must be explicitly deallocated. This is superior if done correctly, but it is easy to miss some records, and it is dangerous to handle pointers.

Memory leaks in real life (ical v.2.1):

[Figure: memory use of ical v.2.1 in MB (axis from 0 to 35) plotted against running time in hours]

Record Liveness

Which records are still in use? Ideally, those that will be accessed in the future execution of the program. But that is of course undecidable...

Basic conservative approximation:

A record is live if it is reachable from a stack location (local variable or local stack).

Dead records may still point to each other.

A Heap With Live and Dead Records

[Figure: a heap of records holding integer fields and pointer fields; the stack variables p, q, and r point into the heap]

The Mark-and-Sweep Algorithm

Explore pointers starting from all stack locations and mark all the records encountered.

Sweep through all records in the heap and reclaim the unmarked ones.

Unmark all marked records.

Assumptions:
• we know the start and size of each record in memory
• we know which record fields are pointers
• reclaimed records are kept in a freelist

Pseudo Code for Mark-and-Sweep

function DFS(x) {
  if (x is a heap pointer)
    if (x is not marked) {
      mark x;
      for (i=1; i<=|x|; i++)
        DFS(x.fi)
    }
}

function Sweep() {
  p = first address in heap;
  while (p < last address in heap) {
    if (p is marked)
      unmark p;
    else {
      p.f1 = freelist;
      freelist = p;
    }
    p = p + sizeof(p);
  }
}

function Mark() {
  foreach (v in a stack frame)
    DFS(v);
}
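The pseudo code above can be tried out on a toy heap. The following is a minimal sketch in Java, not the implementation used by any real runtime: the names Rec and MarkSweepSketch are made up for illustration, the heap is modelled as a list of objects instead of raw memory, and every field is assumed to be a pointer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type: every field is either null or a pointer to another record.
final class Rec {
    final Rec[] fields;
    boolean marked;

    Rec(int size) { fields = new Rec[size]; }
}

final class MarkSweepSketch {
    final List<Rec> heap = new ArrayList<>();      // every allocated record
    final List<Rec> freelist = new ArrayList<>();  // reclaimed records

    Rec alloc(int size) {
        Rec r = new Rec(size);
        heap.add(r);
        return r;
    }

    // DFS: mark x and everything reachable from it.
    void dfs(Rec x) {
        if (x == null || x.marked) return;
        x.marked = true;
        for (Rec field : x.fields) dfs(field);
    }

    // Mark: start a DFS from every root (the roots stand in for the stack locations).
    void mark(List<Rec> roots) {
        for (Rec v : roots) dfs(v);
    }

    // Sweep: unmark the survivors and move the unmarked records to the freelist.
    void sweep() {
        List<Rec> survivors = new ArrayList<>();
        for (Rec p : heap) {
            if (p.marked) {
                p.marked = false;
                survivors.add(p);
            } else {
                freelist.add(p);
            }
        }
        heap.clear();
        heap.addAll(survivors);
    }

    public static void main(String[] args) {
        MarkSweepSketch gc = new MarkSweepSketch();
        Rec a = gc.alloc(1), b = gc.alloc(0), c = gc.alloc(1), d = gc.alloc(1);
        a.fields[0] = b;   // live: a -> b
        c.fields[0] = d;   // dead cycle: c <-> d
        d.fields[0] = c;
        gc.mark(List.of(a));
        gc.sweep();
        System.out.println(gc.heap.size() + " live, " + gc.freelist.size() + " reclaimed");
    }
}
```

In this run only a is a root, so a and b survive, while c and d are reclaimed even though they still point to each other.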

Marking and Sweeping (1/11 to 11/11)

[Figure sequence: eleven animation frames of the example heap with the stack variables p, q, and r; the reachable records are marked, then the heap is swept, and from frame 6 on the freelist pointer is shown]
Analysis of Mark-and-Sweep

Assume the heap has H words. Assume that R words are reachable. The cost of garbage collection is:

$c_1 R + c_2 H$

Here $c_1 R$ is the cost of marking the reachable words and $c_2 H$ is the cost of sweeping the whole heap.

The cost per reclaimed word is:

$\frac{c_1 R + c_2 H}{H - R}$

If R is close to H, then this is expensive.

Allocation

The freelist must be searched for a record that is large enough to provide the requested memory.

Free records may be sorted by size.

The freelist may become fragmented: containing many small free records but none that is large enough.

Defragmentation joins adjacent free records.
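Fragmentation and the joining of adjacent free records can be seen in a small experiment. The sketch below is only one possible design, under stated assumptions: the class FreeListSketch is hypothetical, free records are kept as {start, size} pairs sorted by address (not by size), allocation is first-fit, and adjacent free records are joined eagerly on release instead of in a separate defragmentation pass.

```java
import java.util.ArrayList;
import java.util.List;

final class FreeListSketch {
    // Free records as {start, size} pairs, kept sorted by start address.
    final List<int[]> free = new ArrayList<>();

    FreeListSketch(int heapWords) {
        free.add(new int[] {0, heapWords});
    }

    // First fit: returns the start address, or -1 if no free record is large enough.
    int alloc(int size) {
        for (int i = 0; i < free.size(); i++) {
            int[] b = free.get(i);
            if (b[1] >= size) {
                int start = b[0];
                if (b[1] == size) {
                    free.remove(i);        // exact fit: the free record disappears
                } else {
                    b[0] += size;          // otherwise split off the front
                    b[1] -= size;
                }
                return start;
            }
        }
        return -1;
    }

    // Return a record to the freelist and join it with adjacent free records.
    void release(int start, int size) {
        int i = 0;
        while (i < free.size() && free.get(i)[0] < start) i++;
        free.add(i, new int[] {start, size});
        if (i + 1 < free.size() && start + size == free.get(i + 1)[0]) {
            free.get(i)[1] += free.get(i + 1)[1];   // join with the following record
            free.remove(i + 1);
        }
        if (i > 0 && free.get(i - 1)[0] + free.get(i - 1)[1] == free.get(i)[0]) {
            free.get(i - 1)[1] += free.get(i)[1];   // join with the preceding record
            free.remove(i);
        }
    }

    public static void main(String[] args) {
        FreeListSketch heap = new FreeListSketch(12);
        int a = heap.alloc(4), b = heap.alloc(4), c = heap.alloc(4);
        heap.release(a, 4);
        heap.release(c, 4);
        System.out.println(heap.alloc(6));  // 8 words are free, but in two pieces of 4
        heap.release(b, 4);
        System.out.println(heap.alloc(6));  // all three pieces have been joined
    }
}
```

The first request for 6 words fails although 8 words are free, which is the fragmentation problem; once the middle record is released and joined with its neighbours, the same request succeeds.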

Pointer Reversal

The DFS recursion stack could have size H. It has at least size log(H). This may be too much (after all, memory is low).

The recursion stack may be cleverly embedded in the fields of the marked records.

This technique makes mark-and-sweep practical.

The Reference Counting Algorithm

Maintain a counter of the total number of references to each record.

For each assignment, update the counters. A record is dead when its counter is zero.

Advantages:
• catches dead records immediately
• does not cause long pauses

Disadvantages:
• cannot detect cycles of dead records
• is rather expensive

Pseudo Code for Reference Counting

function Increment(x) {
  x.count++;
}

function Decrement(x) {
  x.count--;
  if (x.count==0)
    PutOnFreelist(x);
}

function PutOnFreelist(x) {
  Decrement(x.f1);
  x.f1 = freelist;
  freelist = x;
}

function RemoveFromFreelist(x) {
  for (i=2; i<=|x|; i++)
    Decrement(x.fi);
}
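To see both the immediate reclamation and the problem with cycles, the counters can be simulated by hand. This is a simplified sketch and not a translation of the pseudo code above: the names RcRec and RefCountSketch are invented, each record has a single pointer field, and records are freed eagerly when the counter reaches zero instead of deferring the work through the freelist.

```java
// Hypothetical record with a reference count and a single pointer field.
final class RcRec {
    int count;
    RcRec next;
    boolean freed;
}

final class RefCountSketch {
    static void increment(RcRec x) {
        if (x != null) x.count++;
    }

    // Eager variant (an assumption): free the record and release its field at once.
    static void decrement(RcRec x) {
        if (x == null) return;
        x.count--;
        if (x.count == 0) {
            x.freed = true;
            RcRec child = x.next;
            x.next = null;
            decrement(child);
        }
    }

    // The assignment holder.next = target, including the counter updates.
    static void assign(RcRec holder, RcRec target) {
        increment(target);            // increment first, in case target is the old value
        RcRec old = holder.next;
        holder.next = target;
        decrement(old);
    }

    public static void main(String[] args) {
        // A chain a -> b where a is held by one stack variable.
        RcRec a = new RcRec(), b = new RcRec();
        increment(a);
        assign(a, b);
        decrement(a);                 // the stack variable goes away
        System.out.println("chain freed: " + (a.freed && b.freed));

        // A cycle c <-> d where both were held by stack variables.
        RcRec c = new RcRec(), d = new RcRec();
        increment(c);
        increment(d);
        assign(c, d);
        assign(d, c);
        decrement(c);
        decrement(d);
        System.out.println("cycle freed: " + (c.freed || d.freed));
    }
}
```

The chain is freed as soon as its last outside reference disappears, whereas the two records of the cycle keep each other's counter at one and are never reclaimed.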

The Stop-and-Copy Algorithm

Divide the heap space into two parts. Only use one part at a time. When it runs full, copy live records to the other part of the heap space. Then switch the roles of the two parts.

Advantages:
• fast allocation (no freelist)
• avoids fragmentation

Disadvantage:
• wastes half your memory

Before and After Stop-and-Copy

[Figure: the two halves of the heap before and after a collection, with records numbered 3 to 8, the labels from-space and to-space (swapped afterwards), and the pointers next and limit]

Pseudo Code for Stop-and-Copy

function Forward(x) {
  if (x ∈ from-space) {
    if (x.f1 ∈ to-space)
      return x.f1;
    else {
      for (i=1; i<=|x|; i++)
        next.fi = x.fi;
      x.f1 = next;
      next = next + sizeof(x);
      return x.f1;
    }
  }
  else return x;
}

function Copy() {
  scan = next = start of to-space;
  foreach (v in a stack frame)
    v = Forward(v);
  while (scan < next) {
    for (i=1; i<=|scan|; i++)
      scan.fi = Forward(scan.fi);
    scan = scan + sizeof(scan);
  }
}
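The Forward and Copy functions above can be exercised on an array that plays the role of memory. The sketch below makes several assumptions that are not in the slides: the class name CopyingSketch, a record layout where the first word holds the number of fields, the restriction that every record has at least one field and that all fields are pointers, and -1 as the encoding of the null pointer.

```java
final class CopyingSketch {
    static final int NULL = -1;   // assumption: -1 encodes the null pointer
    final int[] mem;              // both halves of the heap in one array
    final int half;               // size of each half in words
    int fromStart, toStart;       // where from-space and to-space begin
    int next;                     // allocation pointer

    CopyingSketch(int semispaceWords) {
        half = semispaceWords;
        mem = new int[2 * half];
        fromStart = 0;
        toStart = half;
        next = 0;
    }

    // Assumed layout: mem[x] = number of fields n >= 1, mem[x+1..x+n] = pointer fields.
    // No overflow check: a real allocator would trigger a collection here.
    int alloc(int n) {
        int x = next;
        mem[x] = n;
        for (int i = 1; i <= n; i++) mem[x + i] = NULL;
        next += n + 1;
        return x;
    }

    boolean inFrom(int p) { return p >= fromStart && p < fromStart + half; }
    boolean inTo(int p)   { return p >= toStart && p < toStart + half; }

    int forward(int x) {
        if (x == NULL || !inFrom(x)) return x;
        if (inTo(mem[x + 1])) return mem[x + 1];   // already copied: f1 is the forwarding pointer
        int n = mem[x];
        for (int i = 0; i <= n; i++) mem[next + i] = mem[x + i];  // size word and all fields
        mem[x + 1] = next;
        next += n + 1;
        return mem[x + 1];
    }

    void collect(int[] roots) {
        int scan = next = toStart;
        for (int i = 0; i < roots.length; i++) roots[i] = forward(roots[i]);
        while (scan < next) {
            int n = mem[scan];
            for (int i = 1; i <= n; i++) mem[scan + i] = forward(mem[scan + i]);
            scan += n + 1;
        }
        int t = fromStart;        // switch the roles of the two halves
        fromStart = toStart;
        toStart = t;
    }

    public static void main(String[] args) {
        CopyingSketch gc = new CopyingSketch(16);
        int a = gc.alloc(2), b = gc.alloc(1), g = gc.alloc(1), c = gc.alloc(1);
        gc.mem[a + 1] = b;        // a -> b
        gc.mem[a + 2] = c;        // a -> c
        gc.mem[c + 1] = a;        // c -> a (a live cycle)
        gc.mem[g + 1] = b;        // g is garbage: nothing points to it
        int[] roots = {a};
        gc.collect(roots);
        System.out.println("a moved to " + roots[0] + ", live words: " + (gc.next - gc.fromStart));
    }
}
```

After the collection the three live records occupy 7 contiguous words at the start of the new from-space, and the unreachable record g has simply been left behind.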

Stopping and Copying (1/13 to 13/13)

[Figure sequence: thirteen animation frames of the example heap with the stack variables p, q, and r; the live records are copied one by one from from-space into to-space, and in the final frame the two spaces have switched roles and only the copied records remain]
Analysis of Stop-and-Copy

Assume the heap has H words. Assume that R words are reachable. The cost of garbage collection is:

$c_3 R$

The cost per reclaimed word is:

$\frac{c_3 R}{H/2 - R}$

This approaches zero as H grows, so the cost per reclaimed word can be made arbitrarily small by enlarging the heap.
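The two cost formulas are easy to compare numerically. The helper class below is a hypothetical illustration; the constants passed in main (10, 3, and 10 for the three c values, and the heap sizes) are arbitrary assumptions chosen only to make the trend visible, not measurements.

```java
final class GcCostSketch {
    // Mark-and-sweep: (c1*R + c2*H) / (H - R)
    static double markSweepCostPerWord(double c1, double c2, double h, double r) {
        return (c1 * r + c2 * h) / (h - r);
    }

    // Stop-and-copy: c3*R / (H/2 - R)
    static double stopCopyCostPerWord(double c3, double h, double r) {
        return (c3 * r) / (h / 2 - r);
    }

    public static void main(String[] args) {
        double r = 100;   // reachable words (assumed)
        for (double h : new double[] {1100, 2200, 4400}) {
            System.out.println("H=" + h
                + " mark-and-sweep=" + markSweepCostPerWord(10, 3, h, r)
                + " stop-and-copy=" + stopCopyCostPerWord(10, h, r));
        }
    }
}
```

With R fixed, the stop-and-copy figure keeps shrinking as H grows, while the mark-and-sweep figure levels off near the sweeping constant, since every word of the heap is still visited by the sweep.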

Recognizing Records and Pointers

Earlier assumptions:
• we know the start and size of each record in memory
• we know which record fields are pointers

For object-oriented languages, each record already contains a pointer to a class descriptor.

For general languages, we must sacrifice a few bytes per record.

For the stack frame:
• use a bit per stack location
• use a table per program point

Conservative Garbage Collection

For mark-and-sweep, we may use a conservative approximation to recognize pointers.

A word is a pointer if it looks like one (its value is an address in the range of the heap space).

This will recognize too many pointers. Thus, too many records will be marked as live.

This does not work for stop-and-copy, which moves records and must update every pointer to them: a word that only looks like a pointer cannot safely be overwritten.
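The "looks like a pointer" test amounts to a range check. A minimal sketch, assuming a heap that occupies the half-open address range from heapStart to heapEnd; the class and method names are hypothetical and the addresses are made up.

```java
final class ConservativeScanSketch {
    // A word looks like a pointer if its value lies within the heap's address range.
    static boolean looksLikePointer(long word, long heapStart, long heapEnd) {
        return word >= heapStart && word < heapEnd;
    }

    // Count the stack words that a conservative collector would treat as roots.
    static int countCandidates(long[] stackWords, long heapStart, long heapEnd) {
        int n = 0;
        for (long w : stackWords) {
            if (looksLikePointer(w, heapStart, heapEnd)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        long heapStart = 0x1000L, heapEnd = 0x2000L;
        // 0x1500 is meant to be a plain integer here, but it cannot be told apart
        // from a pointer, so the record at that address would be kept alive.
        long[] stack = {42L, 0x1008L, 0x1FFFL, 0x2000L, 0x1500L};
        System.out.println(countCandidates(stack, heapStart, heapEnd));
    }
}
```

Three of the five words fall inside the heap range, including the integer that only happens to have such a value, which is how too many records end up marked as live.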

Triggering Garbage Collection

A collection must be triggered when there is no more free heap space.

But this may cause a long pause in the execution. Collections may be triggered by heuristics:
• after a certain number of records have been allocated
• when only a certain fraction of the heap is free
• after a certain period of time
• when the program is not busy

Generational Collection

Observation: the young die quickly! The collector should focus on young records.

Divide the heap into generations: G_0, G_1, G_2, ...

All records in G_i are younger than records in G_{i+1}.

Collect G_0 often, G_1 less often, and so on.

Promote a record from G_i to G_{i+1} when it survives several collections.

Collecting a Generation

How to collect the G_0 generation:
• roots are no longer just stack locations, but also pointers from G_1, G_2, ...
• it could be expensive to find those pointers
• fortunately they are rare, so we can remember them

Ways to remember pointers:
• maintain a set of all updated records
• mark pages of memory that contain updated records (using hardware or software)

Incremental Collection

A garbage collector creates (long) pauses. This is bad for real-time programs.

An incremental collector runs concurrently with the program (in a separate thread).

It must now handle simultaneous heap updates.

The Tricoloring Algorithm

Records are colored black, grey, or white:
• black: visited and all children visited
• grey: visited, but not all children visited
• white: not visited

The program may update the heap as it pleases, but must maintain an invariant:

no black record points to a white record

Pseudo Code for Tricoloring

function Tricolor() {
  color all records white;
  color all roots grey;
  while (more grey records) {
    x = a grey record;
    for (i=1; i<=|x|; i++)
      if (x.fi is white)
        color x.fi grey;
    color x black;
  }
  reclaim all white records;
}
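The tricoloring loop can be written with an explicit worklist of grey records. This is a sketch under assumptions, not a concurrent collector: the names Color, TriNode, and TricolorSketch are invented, the program is stopped while the loop runs, and a record is only colored grey if it is still white, so that records which are already grey or black are not visited again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

enum Color { WHITE, GREY, BLACK }

// Hypothetical record: a color plus a list of pointer fields.
final class TriNode {
    Color color = Color.WHITE;
    final List<TriNode> fields = new ArrayList<>();
}

final class TricolorSketch {
    // Color a white record grey and remember it on the worklist.
    static void shade(TriNode n, Deque<TriNode> grey) {
        if (n != null && n.color == Color.WHITE) {
            n.color = Color.GREY;
            grey.push(n);
        }
    }

    // Returns the records that are still white at the end, i.e. the ones to reclaim.
    static List<TriNode> collect(List<TriNode> heap, List<TriNode> roots) {
        Deque<TriNode> grey = new ArrayDeque<>();
        for (TriNode n : heap) n.color = Color.WHITE;
        for (TriNode r : roots) shade(r, grey);
        while (!grey.isEmpty()) {
            TriNode x = grey.pop();
            for (TriNode child : x.fields) shade(child, grey);
            x.color = Color.BLACK;
        }
        List<TriNode> reclaimed = new ArrayList<>();
        for (TriNode n : heap) {
            if (n.color == Color.WHITE) reclaimed.add(n);
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        TriNode a = new TriNode(), b = new TriNode(), c = new TriNode();
        a.fields.add(b);
        b.fields.add(a);   // live cycle a <-> b
        c.fields.add(a);   // c points to a, but nothing points to c
        List<TriNode> dead = collect(List.of(a, b, c), List.of(a));
        System.out.println(dead.size() + " record(s) to reclaim");
    }
}
```

The cycle between a and b does not cause repeated visits because a is no longer white when b is scanned, and c is the only record left white.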

Maintaining the Invariant

Write barriers:

x.fi = y;   becomes   black2grey(x).fi = y;

Read barriers:

x.fi = y;   becomes   x.fi = white2grey(y);

Requires synchronization between the running program and the collector.
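The effect of the black2grey write barrier can be shown by interleaving collector steps with a program update by hand. The sketch below is single-threaded and therefore sidesteps the synchronization issue entirely; the names Tint, IncRec, and IncrementalSketch are hypothetical, and step() stands for one small unit of collector work.

```java
import java.util.ArrayDeque;
import java.util.Deque;

enum Tint { WHITE, GREY, BLACK }

// Hypothetical record with two pointer fields.
final class IncRec {
    Tint color = Tint.WHITE;
    final IncRec[] fields = new IncRec[2];
}

final class IncrementalSketch {
    final Deque<IncRec> grey = new ArrayDeque<>();

    void shade(IncRec r) {
        if (r != null && r.color == Tint.WHITE) {
            r.color = Tint.GREY;
            grey.push(r);
        }
    }

    // One unit of collector work: scan a single grey record.
    // Returns false when no grey records remain.
    boolean step() {
        if (grey.isEmpty()) return false;
        IncRec x = grey.pop();
        for (IncRec f : x.fields) shade(f);
        x.color = Tint.BLACK;
        return true;
    }

    // Write barrier for x.fields[i] = y: turn a black x grey again before the store.
    void write(IncRec x, int i, IncRec y) {
        if (x.color == Tint.BLACK) {
            x.color = Tint.GREY;
            grey.push(x);
        }
        x.fields[i] = y;
    }

    public static void main(String[] args) {
        IncrementalSketch gc = new IncrementalSketch();
        IncRec a = new IncRec(), b = new IncRec(), c = new IncRec();
        a.fields[0] = b;
        gc.shade(a);            // a is the only root
        gc.step();              // the collector scans a: a is black, b is grey
        gc.write(a, 1, c);      // the program stores a white record into a black one
        while (gc.step()) { }   // the collector finishes
        System.out.println("c is " + c.color);
    }
}
```

Because the barrier makes a grey again, the collector rescans it and reaches c. With a plain store a.fields[1] = c instead, a would stay black, c would stay white, and a live record would be reclaimed.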

Garbage Collection in Java

Sun's HotSpot VM uses by default:
• two generations: "nursery" and "old objects"
• the nursery is collected using stop-and-copy
• the old objects are collected using mark-and-sweep in a version that also compacts the live records

For real-time applications:
• use option -Xincgc
• a more sophisticated incremental algorithm
• 10% slower
• but with shorter pauses

Finalizers

If an object has a finalize() method, it will be invoked before the object is reclaimed by the garbage collector.

But there is no guarantee how soon this happens.

This method may actually resurrect the object. Typically, the garbage collector needs an extra pass to find out if the dead really stay dead.

Interacting With the Garbage Collector

Trigger the garbage collector manually:
• System.gc();

The java.lang.ref package allows variations of the pointer concept:
• SoftReference
• WeakReference

Soft References

The garbage collector may reclaim an object that has soft references but no ordinary (strong) references.

This is typically used for caching:

SoftReference<Image> sr = null;
...
// get() returns null if the collector has already reclaimed the image
Image img = (sr == null) ? null : sr.get();
if (img == null) {
  img = getImage("huge.gif");
  sr = new SoftReference<>(img);
}
display(img);
img = null;

Weak References

The garbage collector will reclaim an object that has weak references but no strong or soft references.

This is used in java.util.WeakHashMap, where keys are automatically removed when they are no longer in use.
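A small experiment with java.util.WeakHashMap illustrates the behaviour. The class WeakMapSketch and its method names are hypothetical. Note that System.gc() is only a request, so the second method may report either 0 or 1 depending on whether the collector has actually run and cleared the key.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class WeakMapSketch {
    // While the program still holds a strong reference to the key, the entry stays.
    static int sizeWhileKeyIsHeld() {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "metadata");
        int size = cache.size();
        // The key is used again here, so it is still strongly reachable above.
        return cache.containsKey(key) ? size : -1;
    }

    // After the last strong reference is dropped, the entry may disappear.
    static int sizeAfterDroppingKey() {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "metadata");
        key = null;      // only the weak reference inside the map remains
        System.gc();     // a hint, not a guarantee
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("while held: " + sizeWhileKeyIsHeld());
        System.out.println("after dropping the key: " + sizeAfterDroppingKey());
    }
}
```

This is why a WeakHashMap suits data that should live only as long as its key is in use elsewhere, and why code must never rely on the exact moment at which an entry vanishes.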