cs61-exercises/fundamentals5x directory contains a program called
nogcbench that benchmarks a dynamic memory allocator,
nogcbench program allocates an array, called
chunks, which can hold pointers to K chunks of memory. Then, N times, it frees a randomly-chosen chunk and allocates another one. Finally, it frees all chunks and the
chunks array and exits.
It also contains a
gcbench program that does the same thing—except that it never calls free. If you run
gcbench on Linux, it will quickly run out of memory and crash.
Your job is to finish the conservative garbage collector in
m61gc.c so that
gcbench completes all 1000000 allocations and exits with no memory leaks—even though
gcbench never explicitly calls free.
All the places you’ll modify are in
Mark-sweep garbage collection
CS:APP3e section 9.10 describes garbage collection in more detail.
The job of a garbage collector is to reclaim memory that hasn’t been explicitly freed, but that the program will never access again. The GC might run, for example, when the program runs out of memory. C is generally not a garbage-collected language, but there do exist garbage collectors for C, and you’ll write one today!
We can distinguish between precise and conservative garbage collectors. A precise GC precisely detects all garbage: if an allocation cannot be referenced, it will be freed. Precise GC requires a strongly-typed language. Conservative GCs might get confused, but they only get confused in a safe direction (that’s why we call them “conservative”). They might think that an unreferenceable allocation is live (i.e., not garbage), but they will never think that a live allocation is unreferenceable.
At a high level, a GC acts like this.
- The collector examines all of memory and finds all active allocations.
- The collector frees all inactive allocations.
The first step is the hardest. How can the GC find which allocations are active? Your GC, like many C GCs, will look for patterns in memory that look like pointers into the heap. A heap allocation is live iff a pointer to that allocation is stored somewhere in memory. So where to start? Your GC will look for pointers in the stack and in global variables.
But remember that heap pointers might also be stored on the heap! For instance, in the
gcbench program, pointers to chunks are stored in the
chunks array, which is itself dynamically allocated. So if we find a live allocation, we need to then recursively check the contents of that allocation for more pointers.
m61_gcto unmark all active allocations. Use the
markedmember of the
mark_allocations. If a valid pointer is found in the indicated range of memory, mark that allocation—and additionally check its contents for more pointers. Hint: Check out
m61_gcto free inactive allocations.
When you complete this,
./gcbench should complete without crashing and ending with either
0 allocations or some other small number!
When you first complete your GC, it will likely be pretty slow. Speed it up as much as you can!
Q1. The first change I see in
// unmark all active allocations // YOUR CODE HERE
However, at this point, I don't know which allocations are active, so how can I do this?
A1. This is an initialization step: the state of your
allocs is as you left it when your last run of the garbage collector finished -- that is all the allocations that were active last time have been left in your table. Before beginning a new GC pass, you want to clear all the marked bits so that you'll be able to tell which allocations get marked this time.
This one’s pretty long, mostly because we talk through and debug some common errors.
- 7:29—starting the exercise
- 8:40–11:52—some common errors in the “unmark all allocations” loop
- 17:56–20:55—some common errors in
- 23:52—a non-crashing solution!
- 23:52–27:42—performance optimization: avoid double-marking
- 28:30–36:50—precision fix: free all unmarked allocations