Problem set 1: Debugging memory allocator

Goal

Most systems programming requires the explicit allocation and deallocation of dynamic memory. This lets us write fast code for modern machines, since we have full control over memory; and, sadly, it lets us write programs that crash due to memory problems. But we can also build tools that help us debug memory allocation problems.

In this problem set, you will write a debugging memory allocator. Specifically, your debugging allocator will:

You will also augment your debugging allocator with heavy hitter reporting that tells a programmer where most of the dynamically allocated memory is allocated.

The handout code for this problem set is in the pset1 directory. This contains a skeleton memory allocator and many test programs. Your code will go in m61.cc.

Mechanics

This problem set serves several purposes for us and you.

Collaboration

This problem set is subject to the CS 61 collaboration policy. We recommend that students implement the debugging allocator (everything except heavy hitters) using unshared code (i.e., discussion and debugging with other students is fine with citation, but implement the code individually). Other divisions of labor are acceptable; ask staff if you have a concern.

Getting the code

Problem sets are released through GitHub Classroom. Our GitHub organization is “cs61”. Problem sets will be released using the cs61/cs61-f19-psets repository.

Please click this link to create your own private clone of the problem sets repository. (The link will ask you to choose a name for your team; choose whatever you like.) You’ll clone that private repository onto your computer, do work there, and then push your work upstream to the GitHub-hosted repository for us to grade.

Here’s how it should work.

  1. Click the link.
  2. Log in to GitHub, or create an account.
  3. Provide a team name.
  4. The link should automagically clone the repository. For instance, if your team name was kohler-dumb, you should get a repository called cs61/cs61-f19-psets-kohler-dumb.
  5. Use the "Clone or download" button to clone your repository.

Working with a partner? No problem. Find out their GitHub username, and then add them to your team as a collaborator using Settings > Collaborators & teams.

Infrastructure help

Planning, coding, and testing

Like much systems programming, the problem sets in this class reward incremental work and test-driven development. Our handout code comes with many tests, and our make check targets run your code against all those tests in order. It’s a good idea to work on the tests in order: get the simple stuff working, then gradually add more complex functionality.

We encourage each student to complete this work independently (coding without a partner). Our solution involves tens of lines of code (not hundreds), which is modest, but doing the work yourself will teach you a lot.

The make check commands will print out differences between actual and expected outputs for each test. To print differences for a single test, try make check-N, as in make check-1. If you want to see the full output of a test, without worrying about comparisons, run that test on the command line; for example:

$ make test001
$ ./test001

Timing

A graduated submission process will encourage you to get started on the pset early. Only the last submission will be graded.

  1. First, every individual student must commit and upload an initial checkin by midnight on Friday 9/13 (one day later for extension). Try to do this work on your own. Your m61 should pass at least tests test001.cc through test012.cc, but remember, this submission will not be graded.
  2. Second, every student must commit and upload a final version, including heavy hitters, by Friday 9/20 (one day later for extension). Only the grade from this version will count. If you worked with a partner, note them in the appropriate field on the grading server and make sure that some part of your submission is unique.

In a team of a college student and an extension student (which we allow), the extension deadline counts.

It is important to get started early—this problem set is not trivial even though you don’t have to write that much code to make it work.

Turnin

You will turn in your code using Git. In the pset1/ subdirectory, you MUST include two text files called README.md and AUTHORS.md. The README.md file will include any extra credit attempted and comments for graders. The AUTHORS.md file should list your names as well as any other collaborators and citations. (Just list “None” if you have none, but we’d be surprised if you had no citations for the heavy hitter part of the pset.)

Your submission will not be graded until you configure the grading server. Log in and enter your GitHub username and repository URL. Click “pset1” to initiate a fetch of your repository from code.seas. Verify that all tests that you expect to pass continue to pass on the grading server.

Grading breakdown:

Context

C-style memory allocation uses two basic functions, malloc and free.

void* malloc(size_t size)
Allocate size bytes of memory and returns a pointer to it. This memory is not initialized (it can contain anything). Returns nullptr if the allocation failed (because size was too big, or memory is exhausted, or for whatever other reason).

void free(void* ptr)
Free a single block of memory previously allocated by malloc.

The rules of malloc and free are simple: Allocate once, then free once.

These errors are called invalid frees. The third error is also called a double free.

Some notes on boundary cases:

A secondary memory allocation function, calloc, allocates memory for an array of objects and clears it to zero.

void* calloc(size_t nmemb, size_t sz) {
    void* ptr = malloc(sz * nmemb);
    if (ptr != nullptr) {
        memset(ptr, 0, sz * nmemb);     // clear memory to 0
    }
    return ptr;
}

(NB: There’s actually a bug there! One of our tests would find it.)

You will work on our replacements for these functions, which are called m61_malloc, m61_free, and m61_calloc. Our versions of these functions simply call basic versions, base_malloc and base_free. The m61 functions take extra filename and line number arguments; you’ll use these to track where memory was allocated and to report where errors occur. Our header file, m61.hh, uses macros so that calls in the test programs supply these arguments automatically.

A note on undefined behavior

Debugging allocators interact with undefined behavior. As we tell you in class, undefined behavior is a major no-no, because any program that invokes undefined behavior has no meaning. As far as the C++ language standard is concerned, once undefined behavior occurs, a program may do absolutely anything, such as force demons to fly out of your nose. Many of our tests explicitly invoke undefined behavior, and thus have no meaning. But helpful debuggers catch common bugs, and undefined-behavior bugs with malloc and free are common, so your helpful debugging allocator must produce specific warnings for these cases! Is that even possible?

Yes! Debugging allocators work by making certain undefined behaviors well-defined. For instance, when a debugging allocator is in force, a program with a simple double free will reliably print a specific error message and exit before any undefined behavior occurs.

To accomplish this magic trick, debugging allocators make use of properties of their underlying allocators. You should use our “base” allocator, base_malloc and base_free, which is defined in basealloc.cc. This allocator behaves like malloc and free, but has the following additional properties:

This makes it much easier to write a debugging allocator with base_malloc/free than with C’s default malloc/free. For instance, the following program is well-defined:

int main(int argc, char* argv[]) {
    int* x = base_malloc(sizeof(int));
    *x = 10;
    base_free(x);
    assert(*x == 10); // will always succeed: base_free doesn’t overwrite freed memory
}

In contrast, this program, which uses the system malloc and free, always exhibits undefined behavior (and the assertion will likely fail).

int main(int argc, char* argv[]) {
    int* x = malloc(sizeof(int));
    *x = 10;
    free(x);
    assert(*x == 10); // use-after-free undefined behavior
}

This is because undefined behavior is imposed by language-defined interfaces, such as the system malloc and free. While base_malloc and base_free are used in the same way as malloc and free, they are not language-defined, so they can offer slightly more relaxed rules.

Using the base_ allocator functions, your debugging allocator should run tests 1 through 26 completely safely, with no sanitizer warnings. (Use make SAN=1 check-1-26 to check.) Even though some of the test programs, such as test020, appear to invoke undefined behavior, your debugging allocator library will catch the problems before the undefined behavior actually occurs.

Note that even for base_malloc and base_free, double frees, invalid frees, and wild writes can cause truly undefined behavior. The following program is just as bad as a version using the system allocator.

int main(int argc, char* argv[]) {
    int* x = base_malloc(sizeof(int));
    base_free(x);
    base_free(x); // UNDEFINED BEHAVIOR DEATH
}

Debugging allocator (70%)

This section describes the main functionality required for the debugging allocator, ordered from simpler to more complex, and from low-numbered to high-numbered tests.

Statistics

Implement the following function:

void m61_get_statistics(m61_statistics* stats)
Fill in the stats structure with overall statistics about memory allocations so far.

The m61_statistics structure is defined like this:

struct m61_statistics {
    unsigned long long nactive;           // number of active allocations [#malloc - #free]
    unsigned long long active_size;       // number of bytes in active allocations
    unsigned long long ntotal;            // number of allocations, total
    unsigned long long total_size;        // number of bytes in allocations, total
    unsigned long long nfail;             // number of failed allocation attempts
    unsigned long long fail_size;         // number of bytes in failed allocation attempts
    uintptr_t heap_min;                   // smallest address in any region ever allocated
    uintptr_t heap_max;                   // largest address in any region ever allocated
};

Most of these statistics are easy to track, and you should tackle them first. You can pass tests 1-5 and 7–10 without per-allocation metadata.

The hard test is active_size, since to track it, your m61_free(ptr) implementation must find the number of bytes allocated for ptr. The easiest, and probably best, way to do this is for your m61_malloc code to request more space than the user when it calls base_malloc. The initial portion of that space will store metadata about the allocation, including the allocated size, using a structure you define yourself. Your m61_malloc will initialize this metadata, and then return a pointer to the payload, which is the portion of the allocation following the metadata. Your m61_free code will take the payload pointer as input, and then use address arithmetic to calculate the pointer to the corresponding metadata (possible because the metadata has fixed size). From that metadata it can read the size of the allocation. See CS:APP3e Figure 9.35 “Format of a simple heap block” for an example of this type of layout.

(But there are other techniques too. You could create a hash table that mapped pointer values to sizes. m61_malloc would add an entry, and m61_free would check this table and then remove the entry. You might try this first.)

Run make check to test your work. Test programs test001.cc through test012.cc test your overall statistics functionality. Open one of these programs and look at its code. You will notice some comments at the end of the file, such as this:

//! alloc count: active          0   total          0   fail          0
//! alloc size:  active          0   total          0   fail          0

These lines define the expected output for the test. The make check command checks your actual output against the expected output and reports any discrepancies. (It does this by calling compare.pl.)

Integer overflow protection

Your debugging malloc library should support the calloc secondary allocation function. It also needs to be robust against integer overflow attacks. (See, for example CS:APP3e Aside "Security vulnerability in the XDR library", in section 2.3, p100.)

Our handout code’s m61_calloc function is not quite right. Fix this function and fix any other integer overflow errors you find.

Test programs test013.cc through test015.cc check your work.

Invalid free and double-free detection

m61_free(ptr, file, line) should print an error message and then call C’s abort() function when ptr does not point to active dynamically-allocated memory.

Some things to watch:

MEMORY BUG: test016.cc:8: invalid free of pointer 0xffffffffffffffe0, not in heap

Test programs test016.cc through test024.cc check your work.

Boundary write error detection

A boundary error is when a program reads or writes memory beyond the actual dimensions of an allocated memory block. An example boundary write error is to write the 11th entry in an array of size 10:

int* array = (int*) malloc(10 * sizeof(int));
...
for (int i = 0; i <= 10 /* WHOOPS */; ++i) {
    array[i] = calculate(i);
}

These kinds of errors are relatively common in practice. (Other errors can happen, such as writing to totally random locations in memory or writing to memory before the beginning of an allocated block, rather than after its end; but after-the-end boundary writes seem most common.)

A debugging memory allocator can’t detect boundary read errors, but it can detect many boundary write errors. Your m61_free(ptr, file, line) should print an error message and call abort() if it detects that the memory block associated with ptr suffered a boundary write error.

No debugging allocator can reliably detect all boundary write errors. For example, consider this:

int* array = (int*) malloc(10 * sizeof(int));
int secret = array[10];     // save boundary value
array[10] = 1384139431;     // boundary write error
array[10] = secret;         // restore old value! dmalloc can’t tell there was an error!

Or this:

int* array = (int*) malloc(10 * sizeof(int));
array[200000] = 0;          // a boundary write error, but very far from the boundary!

We’re just expecting your code to catch common simple cases. You should definitely catch the case where the user writes one or more zero bytes directly after the allocated block.

Test programs test025.cc through test027.cc check your work.

Memory leak reporting

A memory leak happens when code allocates a block of memory but forgets to free it. Memory leaks are not as serious as other memory errors, particularly in short-running programs. They don’t cause a crash. (The operating system always reclaims all of a program’s memory when the program exits.) But in long-running programs, such as your browser, memory leaks have serious effect and are important to avoid.

Write an m61_print_leak_report() function that, when called, prints a report about every allocated object in the system. This report should list every object that has been malloced but not freed. Print the report to standard output (not standard error). A report should look like this:

LEAK CHECK: test033.cc:23: allocated object 0x9b811e0 with size 19
LEAK CHECK: test033.cc:21: allocated object 0x9b81170 with size 17
LEAK CHECK: test033.cc:20: allocated object 0x9b81140 with size 16
LEAK CHECK: test033.cc:19: allocated object 0x9b81110 with size 15
LEAK CHECK: test033.cc:18: allocated object 0x9b810e0 with size 14
LEAK CHECK: test033.cc:16: allocated object 0x9b81080 with size 12
LEAK CHECK: test033.cc:15: allocated object 0x9b81050 with size 11

A programmer would use this leak checker by calling m61_print_leak_report() before exiting the program, after cleaning up all the memory they could using free calls. Any missing frees would show up in the leak report.

To implement a leak checker, you’ll need to keep track of every active allocated block of memory. It’s easiest to do this by adding more information to the block metadata. You will use the file and line arguments to m61_malloc().

Note: You may assume that the file argument to these functions has static storage duration. This means you don’t need to copy the string’s contents into your block metadata—it is safe to use the string pointer.

Test programs test028.cc through test030.cc check your work.

Advanced reports and checking

Test programs test031.cc, test032.cc, and test033.cc require you to update your reporting and error detection code to print better information and defend against more diabolically invalid frees. You will need to read the test code and understand what is being tested to defend against it.

Update your invalid free message. After determining that a pointer is invalid, your code should check whether the pointer is inside a different allocated block. This will use the same structures you created for the leak checker. If the invalid pointer is inside another block, print out that block, like so:

MEMORY BUG: test031.cc:10: invalid free of pointer 0x833306c, not allocated
 test031.cc:9: 0x833306c is 100 bytes inside a 2001 byte region allocated here

And make sure your invalid free detector can handle the diabolical situations in the other tests. What situations? Check the test code to find out!

Performance and C++ integration

Finally, test programs test034.cc and up test other situations. test034 calls m61_malloc 500,000 times, supplying tens of thousands of different filename/line-number pairs. Your solution should run test034 in a second or less. Our solution, with heavy hitters, runs test034 in about 0.2 seconds on a modern Mac laptop. Some of the other tests repeat earlier tests, but using C++-style memory allocation instead of C-style memory allocation.

Style (10%)

Your code should be clean, clear, correct, and consistent. The most important style guideline is consistency. Our handout code follows a certain format (indentation amount, comment usage, variable names). We won’t grade you down if you choose a different format. But pick a format. Don’t write code that changes style from line to line.

For this problem set, additional coding guidelines will be incorporated into your grade.

Heavy hitter reports (20%)

Memory allocation is expensive, and you can speed up a program a lot by optimizing how that program uses malloc and by optimizing malloc itself. (Did you know that both Google and Facebook have malloc specialists? Here’s Google’s tcmalloc, and Facebook liked jemalloc so much that they hired Jason Evans.)

But before optimizing a program, we must collect data. Programmer intuition is frequently wrong: programmers tend to assume the slowest code is either the code they found most difficult to write or the last thing they worked on. This recommends a memory allocation profiler—a tool that tracks and reports potential memory allocation problems.

Your job is to design and implement a particular kind of profiling, heavy hitter reports, for your memory allocator. This has two parts. You will:

  1. Track the heaviest users of malloc() by code location (file and line). A “heavy” location is a location that is responsible for allocating many payload bytes.

  2. Generate a readable report that summarizes this information, using this exact format (each location on one line, sorted in descending order by percentage):

HEAVY HITTER: hhtest.cc:49: 1643786191 bytes (~50.1%)
HEAVY HITTER: hhtest.cc:48: 817311692 bytes (~25.0%)
HEAVY HITTER: hhtest.cc:47: 403156951 bytes (~12.4%)

Rule 1: If a program makes lots of allocations, and a single line of code is responsible for 20% or more of the total payload bytes allocated by a program, then your heavy-hitter report should mention that line of code (possibly among others).

Rule 2: Your design should handle both large numbers of allocations and large numbers of allocation sites. In particular, you should be able to handle a program that calls malloc() at tens of thousands of different file-line pairs (such as test034).

Rule 3: Your report should include some information that helps the user decide which lines are likely to be the heaviest hitters, such as exact or estimated byte counts, or even just by ordering the hitters.

Note that you should only count payload bytes: don’t include allocator overhead such as metadata space.

How should you implement this? That’s up to you, but here are some pointers.

We provide a test program for you to test heavy hitter reports, hhtest.cc. This program contains 40 different allocators that allocate regions of different sizes. Its first argument, the skew, varies the relative probabilities that each allocator is run. Before exiting, it calls m61_print_heavy_hitters(), a function defined in m61.cc.

Running ./hhtest 0 will call every allocator with equal probability. But allocator #39 allocates twice as much data as any other. So when we run our simple heavy hitter detector against ./hhtest 0, it reports:

HEAVY HITTER: hhtest.cc:49: 1643786191 bytes (~50.1%)
HEAVY HITTER: hhtest.cc:48: 817311692 bytes (~25.0%)
HEAVY HITTER: hhtest.cc:47: 403156951 bytes (~12.4%)

If we run ./hhtest 1, however, then the first allocator (hhtest.cc:10) is called twice as often as the next allocator, which is called twice as often as the next allocator, and so forth. There is almost no chance that allocator #39 is called at all. The report for ./hhtest 1 is:

HEAVY HITTER: hhtest.cc:10: 499043 bytes (~50.0%)
HEAVY HITTER: hhtest.cc:11: 249136 bytes (~25.0%)
HEAVY HITTER: hhtest.cc:12: 123995 bytes (~12.5%)

At some intermediate skews, though, and there may be no heavy hitters at all. Our code reports nothing when run against ./hhtest 0.4.

Negative skews call the large allocators more frequently. ./hhtest -0.4:

HEAVY HITTER: hhtest.cc:49: 15862542908 bytes (~62.1%)
HEAVY HITTER: hhtest.cc:48: 6004585020 bytes (~23.5%)

Try ./hhtest --help to get a full description of hhtest’s arguments. You should test with many different arguments; for instance, make sure you try different allocation “phases.” A great software engineer would also create tests of her own; we encourage you to do this!

This idea can be taken quite far. Google, for example, links a heavy-hitter detector with many important servers. It is possible (within Google) to connect to many servers and generate a graph of its current heavy hitter allocation sites, including their calling functions and relationships among functions. Here’s a little example (scroll down the page); here’s a bigger one. Have fun!!

Debugging allocators have a long history. dmalloc is one of the early ones; here’s a list of some others. Modern compilers integrate both optional allocation debugging features and some debugging features that are on by default. For instance, Mac OS and Linux’s memory allocators can detect some boundary write errors, double frees, and so forth. Recent clang and gcc -fsanitize=memory arguments can catch even more problems.

Feel free to look at the code and documentation for other allocators to get ideas, but make sure you cite them if you do.

Extra credit

We encourage you to learn more and flex your coding muscles by doing some extra credit work! We will grade extra credit attempts, but we aren’t publishing any formula about how extra credit feeds into final grades, and extra credit is in no sense required to get an A.

Reallocation

Easy. Implement m61_realloc. This function is based on the C standard library’s realloc, and should behave like this:

/// m61_realloc(ptr, sz, file, line)
///    Reallocate the dynamic memory pointed to by `ptr` to hold at least
///    `sz` bytes, returning a pointer to the new block. If `ptr` is
///    `nullptr`, behaves like `m61_malloc(sz, file, line)`. If `sz` is 0,
///    behaves like `m61_free(ptr, file, line)`. The allocation request
///    was at location `file`:`line`.

void* m61_realloc(void* ptr, size_t sz, const char* file, long line);

Frequent allocations

Moderately easy. Make your heavy-hitter detector find frequent allocations as well as heavy allocations. Allocating a single byte a billion times is expensive, even if the byte is immediately freed each time! So augment your heavy hitter detector to report call sites that are responsible for 20% or more of the total number of allocations the program made, even if those call sites aren’t responsible for many total bytes relative to other call sites. For good coding style, your solution should involve some code reuse.

Sanitizer

Moderately difficult. Modern debugging memory allocators are tightly integrated with C and C++ compilers: the compilers generate data structures that the debugging allocators use at runtime to detect mistakes. This has many advantages, including efficiency and precision.

Research a debugging memory allocator integrated with a compiler and runtime of your choice, such as GCC/Clang’s AddressSanitizer, and write a brief document describing how it works. You can do this by reading information online (with citation) and/or by running your own experiments.

Line number lookup

Difficult. Most of our test programs use macros that redefine malloc and free to supply filename and line number information. But C++ discourages the use of fancy macros, and C++-style allocation, such as the m61_allocator class used in test035 and up, has no place for macros to go. A C++ debugging memory allocator will use return address information to identify call sites.

Change your debugging allocator so that m61_allocator’s allocate and deallocate functions pass information based on __builtin_return_address to the m61 functions. Then, make your allocator transform this information numbers into real filename–line number pairs just before printing. For instance, you can call out to an external program, such as Linux’s addr2line. But do not perform that transformation until printing, since the transformation is expensive.

Use ./test037 to check your work. Before your fixes, the LEAK CHECK lines will start with ?:1; afterwards, they should start with something like .../test037.cc:NN.

Notes

This lab was created for CS 61.