Datarep Section 2: Memory bugs

C and C++ grant programmers fine-grained, low-level control over memory, including the ability to manufacture pointers to objects from integers. This is powerful but dangerous. It lets us write hexdump, for example, and examine memory as a program runs. It also lets programmers make mistakes, and those mistakes are central to the “disastrously central role that [C] has played in our ongoing computer security nightmare”. This section is about recognizing those mistakes.

Preread: Live object rule & Kinds of memory error

Live object rule

Most memory errors boil down to violations of this requirement:

Live object rule. Every memory access made by a C++ program must be to a live object, meaning an object within its lifetime (having storage that has been allocated and has not yet been released).

Different kinds of object storage have different lifetimes. Lifetimes generally correlate to the segments in which objects are stored.

Static lifetime (global variables in code and data segments): These objects live for as long as the program.
Automatic lifetime (local variables in the stack segment): These objects live for as long as their containing block. They are allocated at their point of declaration and released when the program exits the corresponding block.
Dynamic lifetime (dynamically-allocated variables in the heap segment): These objects have user-defined lifetime. They are allocated by an allocation function (malloc or new) and released by a deallocation function (free or delete).

This program violates the live object rule four times: once for an automatic-lifetime object, once for a dynamic-lifetime object, once for a static-lifetime object, and once for no object at all.

int global[10];


int main() {
    int* ptr;

    {
        int local;
        ptr = &local;
    }
    *ptr = 2;     // VIOLATION: `local`’s lifetime has ended


    ptr = new int;
    delete ptr;
    *ptr = 2;     // VIOLATION: the dynamically allocated int’s lifetime has ended


    ptr = &global[10];
    *ptr = 2;     // VIOLATION: no object at `global[10]`, past bounds of array


    ptr = (int*) 0x90127592174982;
    *ptr = 2;     // almost certainly VIOLATION: points to no object
}

Kinds of memory error

Memory errors are common enough in C and C++ that some frequently-occurring classes of error have specific names.

Invalid access: A read or write access that doesn’t access an object. Many kinds of invalid access have their own names, including:
1. Null pointer dereference: A read or write access to a null pointer. (The null pointer never points to a live object.)
2. Boundary read/write error: A read or write access immediately past the beginning or end of an object.
3. Out-of-bounds access: A read or write access past the boundaries of an array.
4. Stack buffer overflow: A write access past the end of an array with automatic lifetime (a local variable).
5. Unaligned access: A read or write access to a pointer that’s not properly aligned for its type. (Every live object has a properly-aligned address.)
6. Wild read/write: A read or write access to a pointer that’s nowhere near a live object.
7. Aliasing violation: An access using a pointer that doesn’t correspond to the object type stored at that address (for instance, accessing a std::string using an int* pointer).
Use-after-free: A read or write access to an object after its lifetime has ended. (This term is commonly applied when a program accesses a dynamically-allocated object after it was explicitly freed, but it is also possible to access a local object after it’s deallocated.)
Invalid free: Passing a pointer to free (or delete) that was not allocated by malloc (or new).
Double free: Freeing dynamically-allocated memory that has already been freed.
Modify const: A write access to an object that is read-only (was originally declared const).

Invalid accesses and use-after-free errors violate the live object rule. Invalid and double frees violate the rules for dynamic memory allocation, but don’t violate the live object rules in and of themselves.

Sometimes a program that causes a memory error will crash right away; for instance, dereferencing a null pointer almost always crashes the program. This is good! We want errors to crash the program—a reliable crash is far easier to debug. Sometimes, though, the consequences of an error are more insidious. An error might cause no problem on some OSes, and cause a huge security hole in others. Sanitizers (make SAN=1) will catch many errors, but not all.

Memory safe languages (offline)

Many other programming languages are memory-safe, meaning that in those languages, it is impossible to write a program that violates the live object rule. The key technology that makes memory safety possible is automatic garbage collection. In an automatically-garbage-collected language like Java, Python, or Javascript, every object is dynamically allocated, but the programmer never needs to call an explicit deallocation function. Instead, the language runtime library will periodically search the program’s memory space for objects that can no longer be referenced, and free any it finds.

Automatic garbage collection is wonderful, but heap-allocating every object can be expensive, so many languages use compiler analysis to determine which heap-allocated objects can actually be placed on the stack.

Simple examples

In the next parts of section, we’ll ask you to analyze a bunch of programs—perhaps in groups—and then gather back together and discuss what you’ve learned. For each program, please answer:

Does the program contain a memory error? Does it violate the live object rule?
What kinds of memory errors does it contain, if any?

Code for the examples is in the cs61-sections/datareps2 directory, but try to find an answer on your own before running the compiler.

Exercise S1. Are there memory errors, and if so, which ones?

datareps2/s1.cc
int f(int x) {
    return x;
}

int main() {
    printf("%d\n", f(2));
}
No. The x object has automatic lifetime, and x is released when f returns. However, f doesn’t return the x object, it returns a copy of x. And that copy remains alive until it’s printed. (A temporary object, like the computed value of f(2), lives until the end of the statement that created it, which here is the printf call.)

Exercise S2. Are there memory errors, and if so, which ones?

datareps2/s2.cc
int global = 0;

int* f(int x) {
    return &global;
}

int main() {
    int* ptr = f(2);
    printf("%d\n", *ptr);
}
No. The global object global has static lifetime, so it lasts for as long as the program, so it’s safe to access that object through a pointer.

Exercise S3. Are there memory errors, and if so, which ones?

datareps2/s3.cc
int* f(int x) {
    return &x;
}

int main() {
    int* ptr = f(2);
    printf("%d\n", *ptr);
}
Yes! The local object x has automatic lifetime. By the time it is accessed in main (via *ptr), x’s lifetime has ended. A function should never return a pointer to one of its local variables. The compiler knows this and will warn you for simple cases like this one:
c++  -std=gnu++1z -W -Wall -Wshadow -g   -MD -MF .deps/s3.d -MP -O3 -o s3.o -c s3.cc
s3.cc:4:13: warning: address of stack memory associated with parameter 'x' returned [-Wreturn-stack-address]
    return &x;
            ^
1 warning generated.
This is an example of a use-after-free error.

Exercise S4. Are there memory errors, and if so, which ones?

datareps2/s4.cc
int main() {
    int* ptr;
    int x = 2;
    ptr = &x;
    printf("%d\n", *ptr);
}
No.

Exercise S5. Are there memory errors, and if so, which ones?

datareps2/s5.cc
int main() {
    int* ptr;
    {
        int x = 2;
        ptr = &x;
    }
    printf("%d\n", *ptr);
}
Yes. x is released at the end of its containing block, so the *ptr access violates the rule, just as in L3. The compiler doesn’t warn about this one, though. This is another use-after-free error.

Buffer examples

These examples turn to programs that involve buffers, which are character arrays (including strings). Programs that read input do so into one or more buffers.

It’s worth noting that many C interfaces for I/O are vulnerable to stack buffer overflows, and these errors have caused critical software vulnerabilities over the years. For instance, the gets standard library function, which reads a line from standard input but offers no way to limit the number of characters read, is so dangerous that many operating systems now prohibit its use. The CWE (“Common Weakness Enumeration”) database has some illustrative examples of buffer overflows.

Exercise B1. Are there memory errors, and if so, which ones?

datareps2/b1.cc
const char* f(int x) {
    return "example";
}

int main() {
    printf("%s\n", f(2));
}
No! C string constants such as "example" have static lifetime (even though they are represented as pointers), and it’s perfectly safe to return them from functions.

Exercise B2. Are there memory errors, and if so, which ones?

datareps2/b2.cc
const char* f(int x) {
    char buf[100] = "example";
    return buf;
}

int main() {
    printf("%s\n", f(2));
}
Yes! This isn’t returning a C string constant, but rather a pointer to an array with automatic lifetime, which violates the live object rule. Again this represents a use-after-free error.

Exercise B3. Are there memory errors, and if so, which ones?

datareps2/b3.cc
const char* f(int x) {
    char buf[100] = "example";
    return buf;
}

int main() {
    printf("%p\n", f(2));
}
No! Although, as in L7, this f function returns a pointer to an array with automatic lifetime, that pointer is not dereferenced (the %p conversion prints the pointer’s address value, not its contents) and the live object rule is obeyed. (The code is still bad but for a different reason.)

Exercise B4. Are there memory errors, and if so, which ones?

datareps2/b4.cc

const char* f(int x) {
    char* buf = new char[100];
    strcpy(buf, "example");
    return buf;
}

int main() {
    const char* s = f(2);
    printf("%s\n", s);
    delete[] s;
}

Exercise B5. Are there memory errors, and if so, which ones?

datareps2/b5.cc
const char* f(int x) {
    char* buf = new char[100];
    strcpy(buf, "example");
    return buf;
}

int main() {
    const char* s = f(2);
    delete[] s;
    printf("%s\n", s);
}
Yes. It’s illegal to dereference s after freeing it—classic use-after-free error.

Exercise B6. Are there memory errors, and if so, which ones?

datareps2/b6.cc
const char* f(int) {
    char* buf = new char[strlen("example")];
    strcpy(buf, "example");
    return buf;
}

int main() {
    const char* s = f(2);
    printf("%s\n", s);
    delete[] s;
}
Yes. The call to strcpy writes a zero byte past the end of the 7-byte array allocated for buf. This is an invalid access. We can classify it most precisely as a boundary write error (the invalid access is immediately after the object).

Hulk buffers (optional exercises)

Consider this function.

datareps2/hulk.cc

const char* hulk_players[] = {
    "RUFFALO",
    "BANA",
    "FERRIGNO",
    "MIYAUCHI"
};

void hulk_greeting(const char* name) {
    char uc_name[16];

    // make uppercase version of `name`
    for (size_t i = 0; name[i] != 0; ++i) {
        uc_name[i] = toupper(name[i]);
    }
    uc_name[strlen(name)] = 0;
    printf("HULK SAY HELLO TO %s\n", uc_name);

    // find player
    for (size_t i = 0; hulk_players[i] != nullptr; ++i) {
        if (strcmp(hulk_players[i], uc_name) == 0) {
            printf("YOU GOOD ACTOR\n");
            return;
        }
    }
    printf("YOU NO PLAY HULK! HULK SMASH\n");
}

Exercise H1. What’s an argument to hulk_greeting that would cause no memory errors?

"Ruffalo"

Exercise H2. What’s an argument to hulk_greeting that would cause a stack buffer overflow?

"Ruffalo Ruffalo!" This 16-character string would fit in uc_name, but the terminating null character would overflow the stack buffer.

Exercise H3. What’s an argument to hulk_greeting that would cause an out-of-bounds read?

"Spencer" Any string whose upper-cased version isn’t in the hulk_players array will cause the “find player” loop to read out-of-bounds information, namely the nonexistent element hulk_players[4].

Exercise H4. What’s an argument to hulk_greeting that would cause a null pointer dereference?

nullptr

Exercise H5. What’s an argument to hulk_greeting that would cause a wild read?

For instance, (const char*) 183471981272498

Exercise H6. What’s an argument to hulk_greeting that would cause a use after free error?

Any recently-freed string memory.

Linked list examples

Now let’s turn our mind to something a little tougher: linked lists. Linked lists are sometimes considered simple data structures, but manipulating them is actually quite tricky, and if you’re not careful your linked list code may encounter memory errors.

Our linked list, which is doubly-linked, uses these declarations:

datareps2/ll.hh

struct node {
    int value;
    node* prev;
    node* next;
};

extern node* head;

For each node:

The next pointer points to the next node in the list, except that it is nullptr for the last element (the tail or back element).
The prev pointer points to the previous node in the list, except that it is nullptr for the first element (the head or front element).

Here’s an example function using this structure; it returns a pointer to the list node with value v, or nullptr if there’s no such node.

node* find(int v) {
    node* trav = head;
    while (trav && trav->value != v) {
        trav = trav->next;
    }
    return trav;
}

Exercise LL1. A data invariant is a statement about an object or data structure that is always true for any correct program. A function that modifies the data structure might break the invariant temporarily, but it should always restore the invariant before returning.

One data invariant for any list program using ll.hh would be:

Either head == nullptr or head points to a live node object.

What other data invariants should hold for list programs? Discuss! Ideally, you would enumerate a complete set of invariants, so that a program that preserves all the data invariants would never encounter a memory error involving lists.

Your invariants may use this definition: A node object n is a member of the list if and only if n is accessible by traversing next pointers starting at head.

We believe this set of invariants is complete:

If head != nullptr, then head->prev == nullptr.

If n is a member of the list, and n->prev != nullptr, then n->prev is a live object and a member of the list.

If n is a member of the list, and n->prev != nullptr, then n->prev->next == n.

If n is a member of the list, and n->next != nullptr, then n->next is a live object and a member of the list.

If n is a member of the list, and n->next != nullptr, then n->next->prev == n.

Given a set of data invariants, you can write functions whose only purpose is to check them, without modifying the relevant data structures or doing anything else. Such representation checker functions can be invaluable help in tracking down bugs in your code. For instance, if you’re having trouble with a data structure, try calling the representation checker at the end of every function that modifies the data structure. This will provide early warning as soon as the data structure goes wrong!

Offline Exercise. Write a representation checker function for the list.
It’s difficult to check the invariants about whether members of the list are live objects, but luckily many sanitizers will automatically check liveness as we traverse the list. Sanitizers don’t understand the list-specific invariants about next and prev pointers, though, so we must check them ourselves.
void check_list() {
    node* prev = nullptr;
    node* trav = head;
    while (trav) {
        assert(trav->prev == prev);
        prev = trav;
        trav = trav->next;
    }
}
More on invariant checkers

Exercise LL2. Write a function void append(node* n) that adds n to the end of the list whose head is head. In addition to adding n to the list, your function should set n->next and n->prev to the proper values. You may assume that n is a live node and not already a member of the list.

(We’ve posted reference linked-list code, but you’ll get more out of this exercise by working on the code yourself.)
Here’s one correct version:
void append(node* n) {
    node* prev = nullptr;
    node* trav = head;
    while (trav) {
        prev = trav;
        trav = trav->next;
    }
    n->prev = prev;
    n->next = nullptr;
    if (prev) {
        prev->next = n;
    } else {
        head = n;
    }
}
Hilariously our first version of this code was wrong. 🙄

Exercise LL3. Now swap your append function with a neighbor or friend and try to find memory bugs! Does their append function contain memory errors? If so, which ones?

Exercise LL4. Write a function void erase(node* n) that removes n from the list. You may assume that n is a live object and a member of the list. When erase returns, n should still be a live object, but no longer a member of the list.
Here’s one correct version:
void erase(node* n) {
    if (n->next) {
        n->next->prev = n->prev;
    }
    if (n->prev) {
        n->prev->next = n->next;
    } else {
        head = n->next;
    }
}

Exercise LL5. Now swap your erase function with a neighbor or friend and try to find memory bugs!

Exercise LL6. Can this linked list function violate any linked-list invariants or cause memory bugs (either immediately, or in a later call to erase or insert or find)? If so, how?

datareps2/ll6.cc
// Insert node `n` into the list immediately before node `before`.
// If `before == nullptr`, append `n` to the list.
void insert(node* before, node* n) {
    if (!before) {
        node* prev = head;
        while (prev->next != nullptr) {
            prev = prev->next;
        }
        n->prev = prev;
        n->next = nullptr;
        prev->next = n;
    } else {
        n->prev = before->prev;
        n->next = before;
        before->prev = n;
    }
}
This code has a number of problems!

If the list is empty, then line 10 will cause a null pointer dereference.

If before == head, then this code should make n the front of the list. But lines 17–19 do not update head!

If before != nullptr && before != head, then this code should update both before->prev->next and before->prev in lines 17–19, but it doesn’t update before->prev->next.

Error #1 will cause a memory error right away. Errors #2 and #3 will not, but they are still evil! Breaking a data structure leaves it in a wobbly, dangerous state, like a bridge with a crack in a stanchion; later operations can compound this problem, worsening the “crack” until a memory error occurs (and the bridge falls down). For instance, consider this sequence of operations using the provided insert and our suggested erase. The insert operation triggers error #3; the memory error happens later.
0. initial state

                       head == N1                 N2
                               prev: nullptr      prev: N1
                               next: N2           next: nullptr

1. insert(N2, N3);

                       head == N1                 N3                 N2
                               prev: nullptr      prev: N1           prev: N3
                               next: N2 XXX       next: N2           next: nullptr

2. erase(N1);

                               N1                 N3         head == N2
                               [not on]           prev: N1           prev: nullptr
                               [list]             next: N2           next: nullptr

3. // N1 can be freed now that it’s been removed from the list
   delete N1;

                                                  N3         head == N2
                                                  prev: N1 !!!       prev: nullptr
                                                  next: N2           next: nullptr


4. erase(N3);
   // This will try to modify N3->prev->next and cause a use-after-free error!
The list invariants hold in the initial state, state 0, but they fail starting after insert in step 1. (N1->next->prev != N1 and N3->prev->next != N3.) This error gets worse with more operations until a memory error happens in step 4. A good sanitizer might detect that memory error, but notice how far step 4 is from the insert bug that caused the problem! If you implemented a representation checker and checked the list representation at the end of insert, that would detect the error basically right after it happened, helping you localize the problem and therefore debug faster.