Problem set 1: Debugging memory allocator

Assigned Fri 9/1
Ungraded intermediate checkin due Sun 9/10 at 11:59:59pm (see Timing, below, for more; every student should turn in an ungraded individual checkin)
Due Fri 9/15 at 11:59:59pm (Sat 9/16 at 11:59:59pm for extension students)
This problem set may be completed in pairs, although we encourage students to try the early parts individually.
We recommend using a Linux virtual machine (the CS61 Appliance) for programming problem sets. See the Infrastructure page for instructions on installing a hypervisor and VM. While some psets will require that you use the CS61 Appliance, you can likely do this problem set on a Mac OS X machine or Linux machine natively. (You must have development tools installed, e.g., Xcode, including the “Command Line Tools” for OSX; gcc for Linux. On OSX, we recommend installing Homebrew or MacPorts so you can easily add software as needed.)

Goal

C programmers (that would be us) allocate and free memory explicitly. This means we can write fast code for modern machines, because we have full control over memory. The bad news is that it's all too easy to write programs that crash due to memory problems. But wait: as systems programmers, we can build tools to help us debug memory allocation problems. For instance, in this problem set, you will transform a simple memory allocator (e.g., implementation of malloc and friends) into a debugging memory allocator.

Task

1. Transform the malloc library we give you into a debugging malloc library that:

tracks memory usage,
catches common programming errors (e.g., use after free, double free),
detects writing off the end of dynamically allocated memory (e.g., writing 65 bytes into a 64-byte piece of memory), and
catches less common, somewhat devious, programming errors.

2. Augment your debugging malloc library with heavy hitter reporting that tells a programmer where most of the dynamically allocated memory is allocated.

The handout code for this problem set is in the pset1 directory. This contains a skeleton memory allocator and many test programs. Your code will go in m61.c. Once you have created a working copy of the problem set code, run make check or make check-all to see how many tests your implementation passes (hint: before you've done anything, you will fail all the tests). make check stops at the first error; make check-all runs all tests.

Getting the code

We’re releasing problem sets through GitHub Classroom. Our GitHub organization is “cs61”.

Problem sets will be released using the cs61/cs61-psets-f17 repository.

Please click this link to create your own private clone of the problem sets repository. (The link will ask you to choose a name for your team; choose whatever you like.) You’ll clone that private repository onto your computer, do work there, and then push your work upstream to the GitHub-hosted repository for us to grade.

9/1 note: We’ve been experiencing intermittent problems with the GitHub Classroom link. Often the first submission appears to fail, even though the import worked and correctly imported the problem set code. After submitting the form, go to https://github.com/cs61 to see if the import succeeded. If you still experience problems, go ahead and “fork” the cs61-psets-f17 repository normally using the public GitHub “Fork” button; we’ll fix things up later in office hours. But try the link first.

Here’s how it should work.

Click the link.
Log in to GitHub, or create an account.
Provide a team name.
The link should automagically clone the repository. For instance, if your team name was kohler-dumb, you should get a repository called cs61/cs61-f17-psets-kohler-dumb.
Use the “Clone or download” button to clone your repository.

Working with a partner? No problem. Find out their GitHub username, and then add them to your team as a collaborator using Settings > Collaborators & teams.

Infrastructure help

Context

C memory allocation uses two basic functions, malloc and free.

void* malloc(size_t size) Allocate size bytes of memory and returns a pointer to it. This memory is not initialized (it can contain anything). Returns NULL if the allocation failed (because size was too big, or memory is exhausted, or for whatever other reason).

void free(void* ptr) Free a single block of memory previously allocated by malloc.

The rules of malloc and free are simple: Allocate once, then free once.

Dynamically-allocated memory remains active until it explicitly freed with a call to free.
A successful call to malloc(sz) returns a pointer ptr to “new” dynamically allocated memory. This means that the sz bytes of data starting at address ptr are guaranteed not to overlap with the program's code, its global variables, its stack variables, or with any other active dynamically allocated memory.
The pointer argument in free(ptr) must either equal NULL or a pointer to active dynamically-allocated memory. In particular:
- It is not OK to call free(ptr) if ptr points to the program’s code, or into its global variables, or into the stack.
- It is not OK to call free(ptr) unless ptr was returned by a previous call to malloc.
- It is not OK to call free(ptr) if ptr is currently inactive (i.e., free(ptr) was previously called with the same pointer argument, and the ptr memory block was not reused by another malloc()).

These errors are called invalid frees. The third error is also called a double free.

Some notes on boundary cases:

malloc(0) may return either NULL or a non-NULL pointer. If ptr = malloc(0) is not NULL, then ptr does not overlap with any other allocation and can be passed to free().
free(NULL) is allowed. It does nothing.
malloc(sz) returns memory whose alignment works for any object. (We’ll discuss alignment in class; for a preview, see CS:APP3e §3.9.3.) On 64-bit x86 machines, this means that the address value returned by malloc() must be evenly divisible by 16. You should do this too.

Two secondary memory allocation functions are also commonly used in C, calloc and realloc. The calloc function allocates memory and clears it to zero. The realloc function can allocate, free, or resize memory depending on its arguments. These functions work like this:

 void* calloc(size_t nmemb, size_t sz) {
     void* ptr = malloc(sz * nmemb);
     if (ptr != NULL)
         memset(ptr, 0, sz * nmemb);     // clear memory to 0
     return ptr;
 }

 void* realloc(void* ptr, size_t sz) {
     void* new_ptr = NULL;
     if (sz != 0)
         new_ptr = malloc(sz);
     if (ptr != NULL && new_ptr != NULL) {
         size_t old_sz = `**`size of memory block allocated at ptr`**`;
         if (old_sz < sz)
             memcpy(new_ptr, ptr, old_sz);
         else
             memcpy(new_ptr, ptr, sz);
     }
     free(ptr);
     return new_ptr;
 }

(NB: There’s actually a bug in that implementation of calloc! One of our tests would find it.)

You will work on our replacements for these functions, which are called m61_malloc, m61_free, m61_calloc, and m61_realloc. Our versions of these functions simply call basic versions, base_malloc and base_free. Note that the m61 functions take extra arguments that the system versions don’t, namely a filename and a line number. Our header file, m61.h, uses macros so that calls in the test programs supply these arguments automatically. You’ll use filenames and line numbers to track where memory was allocated and to report where errors occur.

Overview

Your debugging allocator must support the following functions. Our handout code contains tests for all this functionality (though we may run other tests when grading). From easier to harder:

Overall statistics—how many allocations/frees, how many bytes have been allocated/freed, etc.
Secondary allocation functions (calloc and realloc) and integer overflow protection.
Invalid free detection.
Writing past the beginning/end of an allocation.
Reporting memory that has been allocated, but not freed.
Advanced reports and checking.

We encourage each student to complete this work independently (coding without a partner). Our solution involves tens of lines of code (not hundreds), which is modest, but doing the work yourself will teach you a lot.

In addition to the debugging allocator, you must design and implement another useful tool, heavy hitter reports. We want you to design your solution, implement it, and test it. We expect you to work with a partner.

The grading breakdown:

70% tested debugging allocator functions (the test programs we hand out, plus others). If running make check reports *** All tests succeeded! you've probably got all these points.
10% coding style.
20% design and implementation of heavy hitter reports.

Timing

This problem set serves several purposes for us and you.

It gets you back into C programming, if you are a little rusty.
It teaches you about byte wrangling, sizes, and the memory representations of information. This is great preparation for the rest of class.
It introduces you to some common memory errors and shows how careful code can defend against them.
You’ll learn how malloc works!
It shows you some new algorithms.
It will be fun!

We are instituting a graduated submission process to ensure that you get started on the pset early. Only the last submission will be graded.

First, every individual student must commit and upload an initial checkin by midnight on Sunday 9/10 (one day later for extension). Try to do this work on your own. Your m61 should pass at least tests test001.c through test012.c (but remember, this submission will not be graded for completeness).
Second, every student or pair of students must commit and upload a final version, including heavy hitters, by Friday 9/15 (one day later for extension). Only the grade from this version will count.

In a team of a college student and an extension student (which we allow), the extension deadline counts.

It is important to get started early—this problem set is not trivial even though you don’t have to write that much code to make it work.

A note on undefined behavior

Debugging allocators have a nuanced relationship with undefined behavior. As we tell you in class, undefined behavior is a major no-no, because any program that invokes undefined behavior has no meaning. As far as the C language standard is concerned, once undefined behavior occurs, a program may do absolutely anything, such as force demons to fly out of your nose. Many of our tests (such as 17–30) explicitly invoke undefined behavior, and thus have no meaning. Yet your code must produce specific warnings for these cases! What gives?

Well, helpful debuggers catch common bugs, and bugs with malloc and free are disturbingly common. For this reason, debugging allocators take certain undefined behaviors and define them. For instance, when a debugging allocator is in force, a program like test020.c with a simple double free has defined behavior, namely crashing with a specific error message.

When writing a debugging allocator, it’s important to understand the properties of the underlying allocator. We have provided you with a very simple base memory allocator in basealloc.c. This allocator has the following properties:

Memory is allocated with base_malloc and freed with base_free.
Memory freed by base_free may be returned by a later base_malloc.
But base_free never overwrites freed memory or returns freed memory to the operating system. (This simple constraint makes it much easier to write a debugging allocator with base_malloc/free than with C’s default malloc/free.)

Thus, the following program is well-defined:

int main(int argc, char* argv[]) {
    int* x = base_malloc(sizeof(int));
    *x = 10;
    base_free(x);
    assert(*x == 10); // will always succeed
}

But double-frees and invalid frees are truly undefined, and the following program still has no meaning.

int main(int argc, char* argv[]) {
    int* x = base_malloc(sizeof(int));
    base_free(x);
    base_free(x); // ERROR ERROR ERROR
}

Debugging allocator (70%)

Statistics

Implement the following function:

void m61_getstatistics(struct m61_statistics* stats) Fill in the stats structure with overall statistics about memory allocations so far.

The m61_statistics structure is defined like this:

struct m61_statistics {
    unsigned long long nactive;           // number of active allocations [#malloc - #free]
    unsigned long long active_size;       // number of bytes in active allocations
    unsigned long long ntotal;            // number of allocations, total
    unsigned long long total_size;        // number of bytes in allocations, total
    unsigned long long nfail;             // number of failed allocation attempts
    unsigned long long fail_size;         // number of bytes in failed allocation attempts
    char* heap_min;                       // smallest address in any region ever allocated
    char* heap_max;                       // largest address in any region ever allocated
};

Most of these statistics are easy to track, and you should tackle them first. You can pass tests 1–10 without per-allocation metadata. The hard one is active_size: to track it, your free(ptr) implementation must find the number of bytes allocated for ptr.

The easiest, and probably best, way to do this is for your malloc code to allocate more space than the user requested. The first part of that space is used to store metadata about the allocation, including the allocated size. This metadata will not be a struct m61_statistics, it’ll be a structure you define yourself. Your malloc will initialize this metadata, and then return a pointer to the payload, which is the portion of the allocation following the metadata. Your free code will take the payload pointer as input, and then use address arithmetic to calculate the pointer to the corresponding metadata. This is possible because the metadata has fixed size. From that metadata it can read the size of the allocation. See CS:APP3e Figure 9.35 “Format of a simple heap block” for an example of this type of layout.

If you don’t like this idea, you could create a list or hash table

size_for_pointer` that mapped pointer values to sizes. Your `malloc

code would add an entry to this data structure. Your free code would check this table and then remove the entry.

Other aspects of the problem set will require you to add more stuff to the metadata.

Run make check to test your work. Test programs test001.c through test012.c test your overall statistics functionality. Open one of these programs and look at its code. You will notice some comments at the end of the file, such as this:

//! malloc count: active          0   total          0   fail          0
//! malloc size:  active          0   total          0   fail          0

These lines define the expected output for the test. The make check command checks your actual output against the expected output and reports any discrepancies. (It does this by calling compare.pl.)

Secondary allocation functions, integer overflow protection

Your debugging malloc library should support the secondary allocation functions calloc and realloc. It also needs to be robust against integer overflow attacks. (See, for example CS:APP3e Aside “Security vulnerability in the XDR library”, in section 2.3, p91.)

Our handout code’s m61_calloc and m61_realloc functions are close, but they don’t quite work. Fix them, and fix any other integer overflow errors you find.

Test programs test013.c through test016.c check your work.

Invalid free and double-free detection

m61_free(ptr, file, line) should print an error message and then call C’s abort() function when ptr does not point to active dynamically-allocated memory.

Some things to watch:

Be careful of calls like free((void*) 0x16), where the ptr argument is not NULL but it also doesn't point to heap memory. Your debugging malloc library should not crash when passed such a pointer. It should print an error message and exit in an orderly way. Test program test017.c checks this.
The test programs define the desired error message format. Here's our error message for test017:

MEMORY BUG: test017.c:9: invalid free of pointer 0xffffffffffffffe0, not in heap

Different error situations require different error messages. See test programs test017.c through test021.c.
Your code should print out the file name and line number of the problematic call to free().

Test programs test017.c through test027.c check your work.

Boundary write error detection

A boundary error is when a program reads or writes memory beyond the actual dimensions of an allocated memory block. An example boundary write error is to write the 11th entry in an array of size 10:

int* array = (int*) malloc(10 * sizeof(int));
...
for (int i = 0; i <= 10 /* WHOOPS */; ++i) {
    array[i] = calculate(i);
}

These kinds of errors are relatively common in practice. (Other errors can happen, such as writing to totally random locations in memory or writing to memory before the beginning of an allocated block, rather than after its end; but after-the-end boundary writes seem most common.)

A debugging memory allocator can't detect boundary read errors, but it can detect many boundary write errors. Your m61_free(ptr, file, line) should print an error message and call abort() if it detects that the memory block associated with ptr suffered a boundary write error.

No debugging malloc software can reliably detect all boundary write errors. For example, consider this:

int* array = (int*) malloc(10 * sizeof(int));
int secret = array[10];     // save boundary value
array[10] = 1384139431;     // boundary write error
array[10] = secret;         // restore old value! dmalloc can't tell there was an error!

Or this:

int* array = (int*) malloc(10 * sizeof(int));
array[200000] = 0;          // a boundary write error, but very far from the boundary!

We’re just expecting your code to catch common simple cases. You should definitely catch the case where the user writes one or more zero bytes directly after the allocated block.

Test programs test028.c through test030.c check your work.

Memory leak reporting

A memory leak happens when code allocates a block of memory but forgets to free it. Memory leaks are not as serious as other memory errors, particularly in short-running programs. They don't cause a crash directly. (The operating system always reclaims all of a program's memory when the program exits.) But in long-running programs, such as your browser, memory leaks have serious effect and are important to avoid.

Write a m61_printleakreport() function that, when called, prints a report about every allocated object in the system. This report should list every object that has been **malloc()**ed but not **free()**d. Print the report to standard output (not standard error). A report should look like this:

LEAK CHECK: test033.c:23: allocated object 0x9b811e0 with size 19
LEAK CHECK: test033.c:21: allocated object 0x9b81170 with size 17
LEAK CHECK: test033.c:20: allocated object 0x9b81140 with size 16
LEAK CHECK: test033.c:19: allocated object 0x9b81110 with size 15
LEAK CHECK: test033.c:18: allocated object 0x9b810e0 with size 14
LEAK CHECK: test033.c:16: allocated object 0x9b81080 with size 12
LEAK CHECK: test033.c:15: allocated object 0x9b81050 with size 11

A programmer would use this leak checker by calling m61_printleakreport() before exiting the program, after cleaning up all the memory they could using free() calls. Any missing **free()**s would show up in the leak report.

To implement a leak checker, you'll need to keep track of every active allocated block of memory. It's easiest to do this by adding more information to the block metadata. You will use the file and line arguments to m61_malloc()/m61_realloc()/m61_calloc().

Note: You may assume that the file argument to these functions has static storage duration. This means you don't need to copy the string’s contents into your block metadata—it is safe to use the string pointer.

Test programs test031.c through test033.c check your work.

Advanced reports and checking

Test programs test034.c, test035.c, and test036.c require you to update your reporting and error detection code to print better information and defend against more diabolically invalid frees. You will need to read the test code and understand what is being tested to defend against it.

Update your invalid free message. After determining that a pointer is invalid, your code should check whether the pointer is inside a different allocated block. This will use the same structures you created for the leak checker. If the invalid pointer is inside another block, print out that block, like so:

MEMORY BUG: test034.c:10: invalid free of pointer 0x833306c, not allocated
 test034.c:9: 0x833306c is 100 bytes inside a 2001 byte region allocated here

And make sure your invalid free detector can handle the diabolical situations in the other tests. What situations? Check the test code to find out!

Style (10%)

Your code should be clean, clear, correct, and consistent. The most important style guideline is consistency. Our handout code follows a certain format (indentation amount, comment usage, variable names). We won't grade you down if you choose a different format. But pick a format. Don't write code that changes style from line to line.

For this problem set, additional coding guidelines will be incorporated into your grade.

Document your metadata. Explain what additional data structures you used to track debugging and heavy-hitter information.
Your metadata should be compact. The debugging allocator should introduce at most tens of bytes of overhead per allocation, not hundreds or thousands.
Your malloc and free functions should be relatively fast. For full credit, you should add only O(1) additional work to a successful (i.e., non-erroneous) malloc or free call. (Your heavy-hitter reports implementation might be more expensive than the other parts.)
Your implementation should not introduce memory errors or undefined behaviors of its own! Check your code by running tests under valgrind or using make SANITIZE=1 check-all. Our solutions succeed cleanly under SANITIZE=1 for tests 1–29, except for memory leaks from client code. (Later tests don’t succeed cleanly only because in tests 30, 35, and 36, the test code directly invokes undefined behavior that upsets the sanitizer.)

Heavy hitter reports (20%)

Memory allocation is one of the more expensive things a program can do. It is possible to make a program run much faster by optimizing how that program uses malloc and by optimizing malloc itself. (Did you know that both Google and Facebook have malloc specialists? Here’s Google’s tcmalloc, and Facebook liked jemalloc so much that they hired Jason Evans.)

But before optimizing a program, we must measure that program’s performance. Programmer intuition is frequently wrong: programmers tend to assume the slowest code is either the code they found most difficult to write or the last thing they worked on. Thus, before optimizing anything, you want to have data to guide your optimization. In this case, it useful to have a memory allocation profiler—a tool that tracks and reports potential memory allocation problems.

Your job is to design and implement a particular kind of profiling, heavy hitter reports, for your memory allocator. This has two parts. You will:

Track the heaviest users of malloc() by code location (file and line). A “heavy” location is a location that is responsible for allocating many bytes.
Generate a readable report that summarizes this information.

Rule 1: If a program makes lots of allocations, and a single line of code is responsible for 20% or more of the total bytes allocated by a program, then your heavy-hitter report should mention that line of code (possibly among others).

Rule 2: Your design should handle both large numbers of allocations and large numbers of allocation sites. In particular, you should be able to handle a program that calls malloc() at 10,000 different file-line pairs.

Rule 3: Your report should include some information that helps the user decide which lines are likely to be the heaviest hitters, such as exact or estimated byte counts, or even just by ordering the hitters.

How should you implement this? That’s up to you, but here are some pointers.

Sampling is acceptable. It would be OK, for example, to sample 1/100th of all allocations and report information for only the sampled allocations. This can cut down the amount of data you need to store.
- You could sample exactly every Nth allocation, but random sampling is usually better, since it avoids synchronization effects. (For instance, if the program cycled among 4 different allocation sites, then sampling every 20th allocation would miss 75% of the allocation sites!) For random sampling you'll need a source of randomness. Use random() or drand48().
Clever, yet easy, algorithms developed quite recently can help you catch all heavy hitters with O(1) space and simple data structures!
- A Simple Algorithm for Finding Frequent Elements in Streams and Bags, Karp, Shenker, and Papadimitriou
- Frequency Estimation of Internet Packet Streams with Limited Space, Demaine, López-Ortiz, and Munro. The paper's context doesn't matter; the relevant algorithms, “Algorithm MAJORITY” and “Algorithm FREQUENT,” appear on pages 6-7, where they are simply and concisely presented. (You want FREQUENT, but MAJORITY is helpful for understanding.)
- YOU DO NOT NEED TO USE THESE ALGORITHMS! But why not take a look? They’re surprisingly simple.

We provide a test program for you to test heavy hitter reports, hhtest.c. This program contains 40 different allocators that allocate regions of different sizes. Its first argument, the skew, varies the relative probabilities that each allocator is run.

You should change hhtest.c to call your heavy hitter printer just before main exits.

Running ./hhtest 0 will call every allocator with equal probability. But allocator #39 (which is at httest.c:47) allocates twice as much data as any other. So when we run our dirt-simple heavy hitter detector against ./hhtest 0, it reports:

HEAVY HITTER: hhtest.c:49: 1643786191 bytes (~50.1%)
HEAVY HITTER: hhtest.c:48: 817311692 bytes (~25.0%)
HEAVY HITTER: hhtest.c:47: 403156951 bytes (~12.4%)

(Your detector doesn’t need to follow this output format.)

If we run ./hhtest 1, however, then the first allocator (hhtest.c:10) is called twice as often as the next allocator, which is called twice as often as the next allocator, and so forth. There is almost no chance that allocator #39 is called at all. The report for ./hhtest 1 is:

HEAVY HITTER: hhtest.c:10: 499043 bytes (~50.0%)
HEAVY HITTER: hhtest.c:11: 249136 bytes (~25.0%)
HEAVY HITTER: hhtest.c:12: 123995 bytes (~12.5%)

At some intermediate skews, though, and there may be no heavy hitters at all. Our code reports nothing when run against ./hhtest 0.4.

Negative skews call the large allocators more frequently. ./hhtest -0.4:

HEAVY HITTER: hhtest.c:49: 15862542908 bytes (~62.1%)
HEAVY HITTER: hhtest.c:48: 6004585020 bytes (~23.5%)

Try ./hhtest --help to get a full description of hhtest’s arguments. You should test with many different arguments; for instance, make sure you try different allocation “phases.” A great software engineer would also create tests of her own; we encourage you to do this!

This idea can be taken quite far. Google, for example, links a heavy-hitter detector with many important servers. It is possible (within Google) to connect to many servers and generate a graph of its current heavy hitter allocation sites, including their calling functions and relationships among functions. Here's a little example (scroll down the page); here's a bigger one. Have fun!!

Extra credit

We encourage you to learn more and flex your coding muscles by doing some extra credit work! We will grade extra credit attempts, but we aren’t publishing any formula about how extra credit feeds into final grades, and extra credit is in no sense required to get an A.

Implement a particularly awesome heavy-hitter detector. For example, make your heavy-hitter detector find frequent allocations as well as heavy allocations. Allocating a single byte a billion times is expensive, even if the byte is immediately freed each time! So augment your heavy hitter detector to report call sites that are responsible for 20% or more of the total number of allocations the program made, even if those call sites aren’t responsible for many total bytes relative to other call sites.
Extend your debugging allocator. For instance, include a full heap checker, which should cross-check heap data structures to see if any wild writes have corrupted internal heap state. Check with us if you're not sure whether your extension is worthwhile.

Turnin

You will turn in your code using Git. In the pset1/ subdirectory, you MUST include two text files called README.md and AUTHORS.md. The README.md file will include any extra credit attempted and comments for graders. The AUTHORS.md file should list your names as well as any other collaborators and citations. (Just list "None" if you have none, but we'd be surprised if you had no citations for the heavy hitter part of the pset.)

Your submission will not be graded until you configure the grading server. Log in and enter your code.seas username and repository URL. Click "pset1" to initiate a fetch of your repository from code.seas. Verify that all tests that you expect to pass continue to pass on the grading server.

Notes

This lab was originally created for CS 61.