Assignment 1: Debugging memory allocator

Assigned Fri 9/14
Due Fri 9/28 at 11:59pm
This assignment may be completed in pairs, although we encourage students to try at least the early parts individually.
We'll be using virtual machines for programming problem sets. See the Infrastructure page to see how to obtain and install a VM.

Goal

C allows users to allocate and free memory explicitly. This means (as we saw in Lecture 1) that we can write fast code for modern machines, because we have full control over memory. But, depressingly more often, it means that us programmers can easily crash our programs.

Explicit memory management is thus one of C’s glories and one of its biggest disasters.

C gives pain; it can also take pain away. We have the power to build tools to help us debug memory allocation problems. For instance, we can write a debugging memory allocator.

Task

Write a debugging malloc library that catches memory errors in our handout programs (and other programs), and can help debug memory usage.

The handout code for this pset is in the cs61-psets/pset1 directory. This contains a skeleton memory allocator, which just calls malloc and free, and many test programs.

Your code for this problem set will go in m61.c. Run make check or make check-all to check your work. make check stops at the first error; make check-all runs all tests. Run make check-TESTNUMBER to check a specific test.

Context

C memory allocation uses two basic functions, malloc and free.

void *malloc(size_t size) Allocate size bytes of memory and return a pointer to the dynamically allocated memory. This memory is not initialized (it can contain anything). Return NULL if the allocation failed (because size was too big, or memory is exhausted, or for whatever other reason).

void free(void *ptr) Free a single block of memory previously allocated by malloc.

The rules of malloc and free are simple: Allocate once, then free once.

Dynamically-allocated memory remains active until it explicitly freed with a call to free.
A successful call to malloc(sz) returns a pointer ptr to “new” dynamically allocated memory. This means that the sz bytes of data starting at address ptr are guaranteed not to overlap with the program's code, its global variables, its stack variables, or with any other active dynamically allocated memory.
The pointer argument in free(ptr) must either equal NULL or a pointer to active dynamically-allocated memory. In particular:
- It is not OK to call free(x) if x points to the program's code, or into its global variables, or into the stack.
- It is not OK to call free(x) unless x was returned by a previous call to malloc.
- It is not OK to call free(x) if x is currently inactive (i.e., free(x) was previously called with the same pointer argument, and the x memory block has not been reused by an intervening malloc()).

The first two errors are called invalid frees. The third error is called a double free.

Some notes on boundary cases:

malloc(0) may return either NULL or a non-NULL pointer. If ptr = malloc(0) is not NULL, then ptr does not overlap with any other allocation and can be passed to free().
free(NULL) is allowed. It does nothing.
malloc(sz) returns memory whose alignment works for any object. (We’ll discuss alignment in class; for a preview, see CS:APP2e §3.9.3.) On 32-bit x86 machines, this means that the address value returned by malloc() must be evenly divisible by 4—even if sz < 4. But in practice, 32-bit malloc implementations return memory that is 8-byte aligned, meaning that the returned address is evenly divisible by 8. (On x86-64/IA-64 machines, the maximum data alignment is 8, but malloc implementations return 16-byte aligned memory.)

Two secondary memory allocation functions are also commonly used in C, calloc and realloc. The calloc function allocates memory and clears it to zero. The realloc function can allocate, free, or resize memory depending on its arguments. We can write these functions like so:

 void *calloc(size_t nmemb, size_t sz) {
     void *ptr = malloc(sz * nmemb);
     if (ptr != NULL)
         memset(ptr, 0, sz * nmemb);     // clear memory to 0
     return ptr;
 }

 void *realloc(void *ptr, size_t sz) {
     void *new_ptr = NULL;
     if (sz != 0)
         new_ptr = malloc(sz);
     if (ptr != NULL && new_ptr != NULL) {
         size_t old_sz = `**`size of memory block allocated at ptr`**`;
         if (old_sz < sz)
             memcpy(new_ptr, ptr, old_sz);
         else
             memcpy(new_ptr, ptr, sz);
     }
     free(ptr);
     return new_ptr;
 }

Assignment

Your debugging allocator must support the following functionality. Our handout code contains tests for all this functionality (though we may run other tests when grading).

Overall statistics.
Secondary allocation functions and integer overflow protection.
Invalid free and double-free detection.
Boundary write error detection.
Memory leak reporting.
Advanced reports and checking.

This functionality is ordered from easier to harder. We encourage students to try at least this functionality on their own. Our solution involves tens of lines of code (not hundreds), which is modest, but doing the work yourself will teach you a lot.

In addition to the debugging allocator, you must design and implement another useful function. We want you to design your solution, implement it, and test it.

Heavy hitter reports.

The grading breakdown:

65% tested debugging allocator functionality (the test programs we hand out, plus others). If running make check reports *** All tests succeeded! you've probably got all these points.
15% coding style.
20% design and implementation of heavy hitter reports.

Debugging allocator (65%)

Overall statistics

Implement the following function:

void m61_getstatistics(struct m61_statistics *stats) Fill in the stats structure with overall statistics about memory allocations so far.

The m61_statistics structure is defined like this:

struct m61_statistics {
    unsigned long long active_count;      // number of active allocations [#malloc - #free]
    unsigned long long active_size;       // number of bytes in active allocations
    unsigned long long total_count;       // number of allocations, total
    unsigned long long total_size;        // number of bytes in allocations, total
    unsigned long long fail_count;        // number of failed allocation attempts
    unsigned long long fail_size;         // number of bytes in failed allocation attempts
};

Many of these statistics are quite easy to track. The hard one is active_size, which requires that your free(ptr) implementation determine the number of bytes allocated for ptr.

There are several ways you could accomplish this goal.

You could create a hash table size_for_pointer that mapped pointer values to sizes. Your malloc code would add an entry to the hash table. Your free code would check this table and then remove the entry.
You could require the user to pass in the size of the allocated area. But this would change the interface to free—something we don’t allow.¹

The easiest, and probably best, way is for your malloc code to allocate more space than the user requested. The first part of that space is used to store metadata about the allocated chunk, including its size. Your malloc will initialize this metadata, and then return a pointer to the payload, which is the space following the metadata. This calculation involves simple address arithmetic. Your free code then will take the payload pointer as input, and then use address arithmetic to calculate the pointer to the metadata. From that metadata it can read the size. See CS:APP2e Figure 9.35 “Format of a simple heap block” for an example of this type of layout.

Other aspects of the problem set will require you to add more stuff to the metadata.

Run make check to test your work. Test programs test001.c through test006.c test your overall statistics functionality. Open one of these programs and look at its code. You will notice some comments at the end of the file, such as this:

//! malloc count: active          0   total          0   fail          0
//! malloc size:  active          0   total          0   fail          0

These lines define the expected output for the test. The compare.pl script checks your actual output against the expected output and reports any discrepancies.

Secondary allocation functions, integer overflow protection

Your debugging malloc library should support the secondary allocation functions calloc and realloc. It also needs to be robust against integer overflow attacks. (See, for example CS:APP2e Aside “Security vulnerability in the XDR library”, in section 2.3, p91.)

It’s easiest support calloc and realloc by writing versions of the code above, so that your m61_calloc and m61_realloc functions call your m61_malloc and m61_free functions directly. But be careful of overflow!

Test programs test007.c through test010.c check your work.

Invalid free and double-free detection

m61_free(ptr, file, line) should print an error message and then call C’s abort() function when it can tell that ptr does not point to active dynamically-allocated memory.

Some things to watch:

Be careful of calls like free((void *) 0x1), where the ptr argument is not NULL but it also doesn't point to valid memory. Your debugging malloc library should not crash when passed such a pointer. It should print an error message and exit in an orderly way. Test program test012.c checks this. (You may assume that the system malloc() allocates memory in contiguous ranges of addresses.)
The test programs define the desired error message format. Here's our error message for test011:

MEMORY BUG: test011.c:9: invalid free of pointer 0x7fff683fab8c, not in heap

Different error situations require different error messages. See test programs test011.c through test014.c.
Your code should print out the file name and line number of the problematic call to free(). Check out m61.h to see how this information is passed to your code.

Test programs test011.c through test019.c check your work.

Boundary write error detection

A boundary error is when a program reads or writes memory beyond the actual dimensions of an allocated memory block. An example boundary write error is to write the 11th entry in an array of size 10:

int *array = (int *) malloc(10 * sizeof(int));
...
for (int i = 0; i <= 10 /* WHOOPS */; ++i)
    array[i] = calculate(i);

These kinds of errors are relatively common in practice. (Other errors can happen, such as writing to totally random locations in memory, or writing to memory before the beginning of an allocated block, rather than after its end; but after-the-end boundary writes seem most common.)

A debugging memory allocator can't detect boundary read errors, but it can detect many boundary write errors. Your m61_free(ptr, file, line) should print an error message and call abort() if it detects that the memory block associated with ptr suffered a boundary write error.

No debugging malloc could possibly detect all boundary write errors. For example, consider this:

int *array = (int *) malloc(10 * sizeof(int));
int secret = array[10];     // save boundary value
array[10] = 1384139431;     // boundary write error
array[10] = secret;         // restore old value! dmalloc can't tell there was an error!

Or this:

int *array = (int *) malloc(10 * sizeof(int));
array[200000] = 0;          // a boundary write error, but very far from the boundary!

We’re just expecting your code to catch common simple cases. You should definitely catch the case where the user writes one or more zero bytes directly after the allocated block.

Test programs test020.c, test027.c, test028.c check your work. (Tests 27 and 28 were not available in the initial pset release.)

Memory leak reporting

A memory leak happens when code allocates a block of memory but forgets to free it. Memory leaks are not as serious as other memory errors, particularly in short-running programs. They don't cause a crash directly. (The operating system always reclaims all of a program's memory when the program exits.) But in long-running programs, such as your browser!, memory leaks have serious effect and are important to avoid.

Write a m61_printleakreport() function that, when called, prints a report about every allocated object in the system. This report should list every object that has been **malloc()**ed but not **free()**d. Print the report to standard output (not standard error). A report should look like this:

LEAK CHECK: test023.c:23: allocated object 0x9b811e0 with size 19
LEAK CHECK: test023.c:21: allocated object 0x9b81170 with size 17
LEAK CHECK: test023.c:20: allocated object 0x9b81140 with size 16
LEAK CHECK: test023.c:19: allocated object 0x9b81110 with size 15
LEAK CHECK: test023.c:18: allocated object 0x9b810e0 with size 14
LEAK CHECK: test023.c:16: allocated object 0x9b81080 with size 12
LEAK CHECK: test023.c:15: allocated object 0x9b81050 with size 11

A programmer would use this leak checker by calling m61_printleakreport() before exiting the program, after cleaning up all the memory they could using free() calls. Any missing **free()**s would show up in the leak report.

To implement a leak checker, you'll need to keep track of every active allocated block of memory. It's easiest to do this by adding more information to the block metadata. You will use the file and line arguments to m61_malloc()/m61_realloc()/m61_calloc().

Note: You may assume that the file argument to these functions has static storage duration. This means you don't need to copy the string’s contents into your block metadata—it is safe to use the string pointer.

Test programs test021.c through test023.c check your work.

Advanced reports and checking

Finally, test programs test024.c, test025.c, and test026.c require you to update your reporting and error detection code. You'll print better information and defend against more diabolically invalid frees.

Update your invalid free message. After determining that a pointer is invalid, your code should check whether the pointer is inside a different allocated block. This will use the same structures you created for the leak checker. If the invalid pointer is inside another block, print out that block, like so:

MEMORY BUG: test024.c:10: invalid free of pointer 0x833306c, not allocated
 test024.c:9: 0x833306c is 100 bytes inside a 2001 byte region allocated here

And make sure your invalid free detector can handle the diabolical situations in the other tests. What situations? Check the test code to find out!

Style (15%)

Your code should be clean, clear, correct, and consistent. The most important style guideline is consistency. Our handout code follows a certain format (indentation amount, comment usage, variable names). We won't grade you down if you choose a different format. But pick a format. Don't write code that changes style from line to line.

For this problem set, additional coding guidelines will be incorporated into your grade.

Your metadata choice should be well explained and compact. The debugging allocator should introduce at most tens of bytes of overhead per allocation, not hundreds or thousands.
Your malloc and free functions should be relatively fast. Ideally, you should add only O(1) additional work to a successful (i.e., non-erroneous) malloc or free call, but this is not a strict requirement. (Your heavy-hitter reports implementation might be more expensive than the other parts.)

Heavy hitter reports (20%)

Memory allocation is one of the more expensive things a program can do. It is possible to make a program run much faster by optimizing how that program uses malloc.

(Did you know that both Google and Facebook have malloc specialists? Here's Google's tcmalloc, and Facebook liked jemalloc so much that they hired Jason Evans.)

But before optimizing a program, we must measure that program's performance. Many large programs therefore include a memory allocation profiler to track and report potential memory allocation problems.

Your job is to design and implement heavy hitter reports for your memory allocator. This has two parts. You will:

Track the most frequent and most expensive (“heavy”) users of malloc() by code location (file and line).
Generate a readable report that summarizes this information.

Large memory allocations are expensive (such as allocating a billion bytes). Frequent memory allocations are expensive too (such as allocating a single byte, a billion times—even if the byte is freed each time). Of course, frequent large memory allocations are worst of all (such as allocating a billion bytes a billion times). Your system should track both large and frequent memory allocations.

Rule 1: If a program makes lots of allocations, and a single line of code is responsible for either 25% or more of the total bytes allocated by a program, or 25% or more of the total allocations the program made, or both, then your heavy-hitter report should mention that line of code (possibly among others).

Rule 2: Your design should be able to handle both large numbers of allocations and large numbers of allocation sites. In particular, you shouldn't crash or slow down too much just because a program calls malloc() at 10,000 different file-line pairs.

Rule 3: Your report should include some information that helps the user decide which lines are likely to be the heaviest hitters, such as exact or estimated counts of allocations and/or bytes, or even just by ordering the hitters.

How should you implement this? That's up to you, but we're happy to give you pointers.

Sampling is acceptable. It would be OK, for example, to sample 1/100th of all allocations, and report information for only the sampled allocations. This can cut down the amount of data you need to store.
- You could sample exactly every Nth allocation, but random sampling is usually better, since it avoids synchronization effects. (For instance, if the program cycled among 4 different allocation sites, then sampling every 20th allocation would miss 75% of the allocation sites!) For random sampling you'll need a source of randomness. Use random() or drand48().
Clever, yet easy, algorithms developed quite recently can help you catch all heavy hitters with O(1) space and simple data structures!
- A Simple Algorithm for Finding Frequent Elements in Streams and Bags, Karp, Shenker, and Papadimitriou
- Frequency Estimation of Internet Packet Streams with Limited Space, Demaine, López-Ortiz, and Munro. The paper's context doesn't matter; the relevant algorithms, “Algorithm MAJORITY” and “Algorithm FREQUENT,” appear on pages 6-7, where they are very simply and concisely presented. (You want FREQUENT, but MAJORITY is helpful for understanding.)
- YOU DO NOT NEED TO USE THESE ALGORITHMS! But why not take a look? You might be surprised at how useful some research can be.

We provide a very simple test program for you to test heavy hitter reports, hhtest.c. This program contains 40 different allocators which allocate regions of different sizes. Its first argument, the skew, varies the relative probabilities among the allocators.

For example, if you run hhtest 0, then every allocator is called with equal probability. There are no heavy hitters in terms of allocation counts. However, allocator #39 allocates twice as much data as any other. When we run our dirt-simple heavy hitter detector against hhtest 0 (after adding a call to our heavy-hitter printer to hhtest.c), it reports:

HEAVY HITTER: hhtest.c:47: 1643786191 bytes (~50.1%)
HEAVY HITTER: hhtest.c:46: 817311692 bytes (~25.0%)
HEAVY HITTER: hhtest.c:45: 403156951 bytes (~12.4%)

(Your detector doesn’t need to follow this output format.)

If we run hhtest 1, however, then the first allocator (hhtest.c:7) is called twice as often as the next allocator, which is called twice as often as the next allocator, and so forth. There is almost no chance that allocator #39 is called at all. The report for hhtest 1 is:

HEAVY HITTER: hhtest.c:8: 499043 count (~50.0%)
HEAVY HITTER: hhtest.c:9: 249136 count (~25.0%)
HEAVY HITTER: hhtest.c:10: 123995 count (~12.5%)
HEAVY HITTER: hhtest.c:8: 499043 bytes (~50.0%)
HEAVY HITTER: hhtest.c:9: 249136 bytes (~25.0%)
HEAVY HITTER: hhtest.c:10: 123995 bytes (~12.5%)

But try some intermediate skews and you’ll see the heavy-hitter detector report both frequently-called allocation sites and large-size allocation sites. Here’s a report for hhtest 0.3:

HEAVY HITTER: hhtest.c:8: 130207 count (~18.8%)
HEAVY HITTER: hhtest.c:47: 2911492 bytes (~31.4%)
HEAVY HITTER: hhtest.c:46: 2227040 bytes (~24.7%)

Negative skews call the large allocators more frequently. hhtest -0.4:

HEAVY HITTER: hhtest.c:47: 206194 count (~24.2%)
HEAVY HITTER: hhtest.c:46: 147420 count (~18.3%)
HEAVY HITTER: hhtest.c:45: 103605 count (~13.9%)
HEAVY HITTER: hhtest.c:47: 15862542908 bytes (~62.1%)
HEAVY HITTER: hhtest.c:46: 6004585020 bytes (~23.5%)

This idea can be taken quite far. Google, for example, links a heavy-hitter detector with many important servers. It is possible (within Google) to connect to many servers and generate a graph of its current heavy hitter allocation sites, including their calling functions and relationships among functions. Here's a little example (scroll down the page); here's a bigger one. Have fun!!

Extra credit opportunities

Implement a particularly awesome heavy-hitter detector.
Extend your debugging allocator. For instance, include a full heap checker, which should cross-check heap data structures to see if any wild writes have corrupted internal heap state. Check with us if you're not sure whether your extension is worthwhile.

Turnin

You will turn in your code using Git. In the pset1/ subdirectory, you MUST include a text file called README.txt. This file should list:

Your name and HUID.
Your partner’s name and HUID, if any.
Other collaborators and citations.
Comments for the graders, if any.

Notes

This lab was created for CS 61, Fall 2012.

Your m61_free function does take filename and line number arguments, unlike normal free. But we use macros to supply these arguments automatically. Check m61.h to see how. ↩︎