Topic 17: Garbage Collection

Compiler Design

Prof. Hanjun Kim

CoreLab (Compiler Research Lab)

POSTECH

Garbage Collection

• Garbage
  • A value that will not be used in any subsequent computation by a program

• Garbage Collection
  • An operation that makes the space occupied by garbage data available for reuse

• Is GC important?
  • Many modern programming languages allow programmers to allocate new storage dynamically
    • New records, arrays, tuples, objects, closures, etc.
  • They need facilities for reclaiming and recycling the storage used by programs

• Who will determine which objects are garbage?

A solution

• Explicit Memory Management
  • User library manages memory; programmer decides when and where to allocate and deallocate
    • void *malloc(size_t n)
    • void free(void *addr)
  • Library calls the OS for more pages when necessary

• Advantage: people are smart

• Disadvantage: people are dumb, and they really don’t want to bother with such details if they can avoid it
  • Always worrying about dangling pointers and memory leaks is a huge software engineering burden

Another Solution

• Automatic Memory Management
  • How do we decide which objects are garbage?
    • Can’t do it exactly
    • Therefore, we approximate conservatively
  • Normal solution: an object is garbage when it becomes unreachable from the roots
    • The roots = registers, stack, global static data
    • If there is no path from the roots to an object, it cannot be used later in the computation, so we can safely recycle its memory
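The reachability test can be sketched in a few lines of Python. This is only an illustration: the heap is modeled as a dictionary from object names to the names they point to, and the names `reachable`, `heap`, and `roots` are hypothetical. Note the conservative side of the approximation: a reachable object is kept even if the program never uses it again.

```python
def reachable(roots, edges):
    """Return the set of objects reachable from the roots.

    `edges` maps each object name to the names it points to.
    """
    seen = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in seen:
            continue
        seen.add(obj)
        worklist.extend(edges.get(obj, []))
    return seen


# Hypothetical heap: a -> b -> c, and d <-> e is a cycle no root points to.
heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}
roots = ["a"]  # e.g. a pointer held in a register or on the stack

live = reachable(roots, heap)
garbage = set(heap) - live
print(sorted(live))     # ['a', 'b', 'c']
print(sorted(garbage))  # ['d', 'e']
```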

Object Graph

[Figure: object graph with roots r1, r2, and the stack]

• How should we test reachability?

Algorithms

• Reference Counting

• Mark and Sweep

• Copying Collection
  • Basic
  • Cheney’s algorithm

• Generational Algorithm

• Incremental Algorithm
  • Baker’s algorithm

Reference Counting

• Each object has a reference count

• Reference count
  • Number of references to the object
  • Initially, 1

• If the reference count becomes 0, the object is garbage

Reference Counting

• Instruction: obj = p

• Algorithm
  • Before the instruction
    • Decrease the reference count of the object obj currently refers to
    • If the count == 0, put that object on the free list
  • After the instruction
    • Increase the reference count of the object obj now refers to

• Changed code

    obj.count--;
    if (obj.count == 0) putOnFreeList(obj);
    obj = p;
    obj.count++;
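A minimal Python sketch of the instrumented assignment, under some assumptions that are not in the slides: counts start at 0 and become 1 on the first assignment (equivalent to the initial count of 1 for a freshly allocated object stored in a variable), the new target is incremented before the old one is decremented so that re-assigning the same object does not free it by accident, and freeing an object also releases the objects it points to. The names `Obj`, `inc`, `dec`, and `assign` are hypothetical.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.count = 0      # number of references to this object
        self.fields = {}    # field name -> Obj or None

free_list = []

def inc(obj):
    if obj is not None:
        obj.count += 1

def dec(obj):
    if obj is None:
        return
    obj.count -= 1
    if obj.count == 0:
        # The object is garbage: release what it points to, then recycle it.
        for child in obj.fields.values():
            dec(child)
        obj.fields.clear()
        free_list.append(obj)

def assign(holder, field, p):
    """Instrumented version of `holder.field = p`."""
    inc(p)                          # count the new reference first
    dec(holder.fields.get(field))   # then drop the old one
    holder.fields[field] = p

root = Obj("root")
root.count = 1             # assumed: held by a register or stack slot
a, b = Obj("a"), Obj("b")
assign(root, "x", a)       # a.count == 1
assign(a, "next", b)       # b.count == 1
assign(root, "x", None)    # a dies, which in turn releases b
print([o.name for o in free_list])  # ['b', 'a']
```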

[Figure, three slides: the object graph with roots r1, r2, and the stack, each object labeled with its reference count]

Reference Counting

• Pros
  • Simple!

• Cons
  • Very expensive!
    • Counts must be updated on every assignment
  • Cycles of garbage cannot be reclaimed!
    • Need to check reachability
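The cycle problem is easy to reproduce. In this hypothetical sketch (`Node` and `set_next` are illustrative names), two objects point at each other; after the only root reference is dropped, both are unreachable, yet neither count ever reaches 0.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.next = None

def set_next(node, target):
    """Instrumented `node.next = target` with count updates."""
    if target is not None:
        target.count += 1
    old = node.next
    if old is not None:
        old.count -= 1
    node.next = target

a, b = Node("a"), Node("b")
a.count += 1          # a root variable refers to a
set_next(a, b)        # a -> b
set_next(b, a)        # b -> a closes the cycle
a.count -= 1          # the root drops its reference to a

# No root reaches a or b any more, yet neither count is 0.
print(a.count, b.count)  # 1 1
```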

Mark and Sweep

• Marking
  • Assume that all objects are unreached
  • Mark all the nodes reachable from the roots with a depth-first search

• Pseudo code

    function DFS(x)
      if x is a pointer into the heap
        if x is not marked
          mark x
          for each field f of x
            DFS(x.f)

    function marking()
      for each root v
        DFS(v)
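The marking pseudo code translates almost line for line into Python. This is a sketch under assumptions: a heap record is a hypothetical `Record` object with a mark bit and a list of fields, and "x is a pointer into the heap" is approximated by an `isinstance` check. The recursion mirrors the slide; a deep structure could exhaust the call stack, so a real collector would likely use an explicit stack instead.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.fields = []   # pointers to other records, or plain values

def dfs(x):
    if isinstance(x, Record) and not x.marked:   # "pointer into the heap"
        x.marked = True
        for f in x.fields:
            dfs(f)

def marking(roots):
    for v in roots:
        dfs(v)

a, b, c, d = (Record(n) for n in "abcd")
a.fields = [b, 42]   # 42 is a non-pointer field and is ignored
b.fields = [a]       # cycle a <-> b, handled by the mark check
c.fields = [d]       # c and d are not reachable from the root
marking([a])
print([r.name for r in (a, b, c, d) if r.marked])  # ['a', 'b']
```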

Mark and Sweep

• Sweeping
  • Place all the unreached objects onto the free list

• Pseudo code

    function sweeping()
      p = first address in the heap
      while p < last address in the heap
        if p is marked
          unmark p
        else
          addToFreelist(p)
        p = p + sizeof(record at p)

• Fragmentation Problem
  • When a program allocates a record of size n, there may be many free blocks smaller than n but none of size n or larger
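A small Python sketch of the sweep, followed by a check that exhibits the fragmentation problem. The heap layout (addresses and sizes) is made up for illustration, and iterating over a list in address order stands in for advancing p by the record size.

```python
class Record:
    def __init__(self, addr, size, marked):
        self.addr = addr
        self.size = size
        self.marked = marked

def sweeping(heap):
    """Walk the heap in address order; return the free list as (addr, size) pairs."""
    free_list = []
    for p in heap:                 # stands in for p = p + sizeof(record at p)
        if p.marked:
            p.marked = False       # reset for the next collection
        else:
            free_list.append((p.addr, p.size))
    return free_list

# Hypothetical heap after marking: live and dead records alternate.
heap = [Record(0, 8, True), Record(8, 16, False),
        Record(24, 8, True), Record(32, 16, False)]
free_list = sweeping(heap)
print(free_list)                                  # [(8, 16), (32, 16)]

# Fragmentation: 32 bytes are free in total, but no single block holds 24.
n = 24
print(sum(size for _, size in free_list) >= n)    # True
print(any(size >= n for _, size in free_list))    # False
```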

[Figure: mark and sweep on the object graph with roots r1, r2, and the stack]

Copying Collection

• Basic idea: use 2 heaps
  • One used by the program (active heap)
  • The other unused until GC time

• GC:
  • Start at the root set and traverse the reachable data
  • Copy reachable data from the active heap (from-space) to the other heap (to-space)
  • Dead objects are left behind in from-space
  • Heaps switch roles

[Figure: the roots, from-space, and to-space]

Cheney’s algorithm

• Copying collection based on breadth-first search

• Pseudo code

    function Cheney()
      scan = next = beginning of to-space
      for each root r
        r = forward(r)   // copies the record r points to and increases next
      while scan < next
        for each field f of the record at scan
          scan.f = forward(scan.f)
        scan = scan + sizeof(record at scan)
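The pseudo code relies on a forward function that the slides do not spell out. The Python sketch below fills it in under stated assumptions: to-space is a list whose length plays the role of next, scan is an index into it, and each from-space record carries a hypothetical `forwarded_to` field as its forwarding pointer. Only from-space pointers are ever passed to `forward`, since every field of a copy is forwarded exactly once.

```python
class Record:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = fields or []   # pointers (Record) or plain values
        self.forwarded_to = None     # forwarding pointer, set once copied

def cheney(roots):
    to_space = []                    # len(to_space) plays the role of `next`

    def forward(p):
        if not isinstance(p, Record):
            return p                 # not a heap pointer: leave unchanged
        if p.forwarded_to is None:   # first visit: copy into to-space
            copy = Record(p.name, list(p.fields))
            p.forwarded_to = copy
            to_space.append(copy)    # advances `next`
        return p.forwarded_to

    new_roots = [forward(r) for r in roots]
    scan = 0
    while scan < len(to_space):      # while scan < next
        rec = to_space[scan]
        rec.fields = [forward(f) for f in rec.fields]
        scan += 1                    # move past the record just scanned
    return new_roots, to_space

a = Record("a"); b = Record("b"); c = Record("c"); dead = Record("dead")
a.fields = [b, c]
b.fields = [c, 7]
c.fields = [a]          # cycle back to a
dead.fields = [a]       # unreachable, so it is never copied

new_roots, to_space = cheney([a])
print([r.name for r in to_space])   # ['a', 'b', 'c']
```

The copies appear in to-space in breadth-first order, and the cycle through c ends at the copy of a rather than at the original.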

Cheney’s algorithm

[Figures: each slide shows the root and the scan and next pointers]

• Before GC

• Forward roots

• Forward records between scan and next (shown over three slides)

• Done when next = scan

Generational GC

• Observation
  • If an object has been reachable for a long time, it is likely to remain so
  • Most objects die young

• Conclusion
  • Do GC for the young objects frequently
  • Avoid scanning the old objects

• Generational GC
  • Divide the heap into partitions P0, P1, …
  • Each partition holds older objects than the one before it

Generational GC

• Create new objects in P0

• When P0 fills,
  • Garbage collect P0 only
  • Move the reachable objects to P1

• When P1 fills,
  • Garbage collect P0 and P1
  • Move the reachable objects to P1 and P2, respectively
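The promotion policy above can be sketched as follows. This is a simplified model, not a full collector: partitions are Python lists, the capacity of 2 is arbitrary, and the sketch assumes the roots point directly at the objects being collected, with no pointers from older partitions into younger ones (real generational collectors have to track such pointers, for example with a remembered set). `collect` and `allocate` are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.fields = []

def collect(P, upto, roots):
    """Collect P[0..upto]; survivors of P[i] are promoted to P[i+1]."""
    collected = {o for part in P[:upto + 1] for o in part}
    live, work = set(), list(roots)
    while work:
        o = work.pop()
        if o in collected and o not in live:
            live.add(o)
            work.extend(o.fields)
    for i in range(upto, -1, -1):    # oldest first, so nothing is promoted twice
        P[i + 1].extend(o for o in P[i] if o in live)
        P[i] = []

def allocate(P, name, roots, capacity=2):
    if len(P[0]) >= capacity:        # P0 is full: collect the young objects only
        collect(P, 0, roots)
    obj = Obj(name)
    P[0].append(obj)
    return obj

P = [[], [], []]
roots = []
a = allocate(P, "a", roots); roots.append(a)
allocate(P, "t1", roots)             # temporary, never rooted
b = allocate(P, "b", roots); roots.append(b)   # P0 full: a promoted, t1 dropped
allocate(P, "t2", roots)
c = allocate(P, "c", roots); roots.append(c)   # P0 full: b promoted, t2 dropped
after_minor = [[o.name for o in part] for part in P]
print(after_minor)                   # [['c'], ['a', 'b'], []]

roots.remove(a)                      # a becomes garbage while sitting in P1
collect(P, 1, roots)                 # as if P1 had filled: collect P0 and P1
after_major = [[o.name for o in part] for part in P]
print(after_major)                   # [[], ['c'], ['b']]
```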

Incremental GC

• Observation
  • GC sometimes interrupts the program for long periods
  • The long response time may cause serious problems, especially for interactive or real-time programs

• Solution
  • Incremental (Concurrent) GC
  • Run GC in parallel with mutation (program execution)

Baker’s algorithm

• Based on Cheney’s copying collection

• When GC is initiated,
  • Change the roles of from-space and to-space
  • Forward all the roots
  • Resume mutation

• When the mutator allocates memory,
  • Scan a few pointers
    • scan advances toward next
  • Return memory in the to-space

• When the mutator fetches a pointer that points into from-space,
  • Forward the pointer to to-space
  • Extra fetch code = 20% performance penalty
  • But no long pauses ==> better response time
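The three hooks above (start of GC, allocation, fetch) can be sketched in Python. This is a toy model under several assumptions: from-space is implicit (any record without the hypothetical `in_to_space` flag), so the flip itself is not shown; records allocated during the collection are kept in a separate list instead of at the far end of to-space; and `k` (the number of records scanned per allocation) is an arbitrary choice. All class and method names are illustrative.

```python
class Record:
    def __init__(self, name, fields=None):
        self.name = name
        self.fields = list(fields or [])
        self.forwarded_to = None     # set when copied out of from-space
        self.in_to_space = False

class BakerHeap:
    def __init__(self):
        self.to_space = []           # copied records; len() plays `next`
        self.scan = 0
        self.allocated = []          # records allocated during the collection

    def forward(self, p):
        if not isinstance(p, Record) or p.in_to_space:
            return p                 # plain value, or already a to-space pointer
        if p.forwarded_to is None:
            copy = Record(p.name, p.fields)
            copy.in_to_space = True
            p.forwarded_to = copy
            self.to_space.append(copy)
        return p.forwarded_to

    def start_gc(self, roots):
        """Forward only the roots, then let the mutator resume."""
        return [self.forward(r) for r in roots]

    def scan_some(self, k):
        """Scan up to k records, moving `scan` toward `next`."""
        while k > 0 and self.scan < len(self.to_space):
            rec = self.to_space[self.scan]
            rec.fields = [self.forward(f) for f in rec.fields]
            self.scan += 1
            k -= 1

    def allocate(self, name, fields=(), k=1):
        self.scan_some(k)            # a little GC work per allocation
        rec = Record(name, fields)
        rec.in_to_space = True
        self.allocated.append(rec)
        return rec

    def fetch(self, rec, i):
        """Read barrier: a loaded from-space pointer is forwarded first."""
        rec.fields[i] = self.forward(rec.fields[i])
        return rec.fields[i]

    def done(self):
        return self.scan == len(self.to_space)

c = Record("c"); b = Record("b", [c]); a = Record("a", [b])
heap = BakerHeap()
(a2,) = heap.start_gc([a])           # only the root is copied so far
b2 = heap.fetch(a2, 0)               # the fetch forwards b on demand
heap.allocate("new", [b2])           # scans the copy of a
heap.allocate("new2")                # scans the copy of b, which copies c
still_running = not heap.done()
heap.allocate("new3")                # scans the copy of c; scan reaches next
print([r.name for r in heap.to_space], heap.done())  # ['a', 'b', 'c'] True
```

The mutator only ever holds to-space pointers: roots are forwarded up front, fetched pointers pass through the barrier, and new records are created in to-space.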