C and C++ Patterns – CS 61 2021

Cast to void to silence warnings
Sized integers
Linked list operations
Iterator traversal and erase
Doubling for efficient growable arrays

Cast to void to silence warnings

Goal: Prevent compiler warnings about “unused variables.”

Problem: Your programs should compile without warnings, but sometimes unused variable warnings happen even when a variable is intentionally unused.

Key idea: Write the variable’s name on a line by itself. Cast the variable to void and add a comment to help the next reader.

Rationale: Our handout code asks the compiler to generate as many warnings as possible. Compiler warnings usually indicate problems, and for good hygiene, you should aim to eliminate all warnings in your own code. However, sometimes the compiler needs a little help, or it will generate too many warnings. A common example is where you’re writing a function that takes more arguments than it currently uses. For instance:

void* m61_malloc(size_t sz, const char* file, int line) {
    return malloc(sz);
}

Eventually, this code will use all of its arguments (to report error messages); but in the meantime we’ll see these warnings:

x.c: In function ‘m61_malloc’:
x.c:6:41: warning: unused parameter ‘file’ [-Wunused-parameter]
x.c:6:51: warning: unused parameter ‘line’ [-Wunused-parameter]

These warnings are annoying and the compiler’s chatter may help the user miss more relevant warnings. So we advise that you silence the warnings. Tell the compiler these variables are meant to be unused. Here’s how:

void* m61_malloc(size_t sz, const char* file, int line) {
    (void) file, (void) line; // silence warnings
    return malloc(sz);
}

The (void) casts and the comment are for human readers. Casting an expression to void explicitly throws away that expression’s value. This tells readers that the programmer intended to write a do-nothing expression. Otherwise a reader might think it was a mistake.

Sized integers

Goal: Use integer types that are suited for specific purposes.

Key idea: Standard names for important integer types are provided by #include <cstdint>. These names include:

uint8_t: Unsigned integer type containing exactly 8 bits. Equals unsigned char on every machine we care about.
int8_t: Signed integer type containing exactly 8 bits. Equals signed char on every machine we care about.
uint16_t, int16_t: Same for 16-bit integers. Usually equals unsigned short and short.
uint32_t, int32_t: Same for 32-bit integers.
uint64_t, int64_t: Same for 64-bit integers.
uintptr_t, intptr_t: Integer types with the same size as pointer types. On machines with 64-bit addresses, such as x86-64, uintptr_t is the same type as uint64_t. On machines with 32-bit addresses, such as x86-32, it is uint32_t.
size_t: Unsigned integer type used to represent sizes of objects that can fit in memory. This is usually the same type as uintptr_t. Every sizeof(x) expression returns a value of type size_t.
off_t: Unsigned integer type used to represent file sizes and positions within files. This might be bigger than size_t, since most machines can manipulate disk files that are larger than memory.
ssize_t: Signed integer type with the same width as size_t.
ptrdiff_t: Pointer subtraction expressions, such as ptr1 - ptr2, have this type. It is generally the same as intptr_t.

Linked list operations

Goal: Manage a linked list: an ordered set of items with efficient O(1) insertion and deletion operations, but not-necessarily-efficient search (it’s acceptable for searching a list of N items to take O(N) time).

Problem: Linked lists are difficult to get right!

Key idea: Here provide several variants of linked list insert and erase operations, coded in C++. Don’t understand one? Draw pictures!

Terminology: A node is an item on a list.

The head of a list is the first item in the list. If the list is empty, its head is nullptr.

The tail of a list is the last item in the list. If the list is empty, its tail is nullptr.

Moving from one item to another is called traversing the list. A local variable used for traversal is often called trav.

Next pointers are used to traverse the list from the head towards the tail. Prev or previous pointers are used to traverse the list from the tail towards the head. Next pointers are more fundamental: singly-linked lists have next pointers, but not prev pointers.

Variant 1: Doubly-linked list with head

O(1) operations supported: insert at head, remove

struct node {
    node* next;
    node* prev;
    // ... other data ...
};

struct list {
    node* head = nullptr;
};

/** Insert `n` at the beginning of `l`. */
void list_push_front(list* l, node* n) {
    n->next = l->head;
    n->prev = nullptr;
    if (l->head) {
        l->head->prev = n;
    }
    l->head = n;
}

/** Remove `n` from `l`. */
void list_erase(list* l, node* n) {
    if (n->next) {
        n->next->prev = n->prev;
    }
    if (n->prev) {
        n->prev->next = n->next;
    } else {
        l->head = n->next;
    }
}

Variant 2: Doubly-linked list with head and tail

O(1) operations supported: insert at head, insert at tail, remove

struct node {
    node* next;
    node* prev;
    // ... other data ...
};

struct list {
    node* head = nullptr;
    node* tail = nullptr;
};

void list_push_front(list* l, node* n) {
    n->next = l->head;
    n->prev = nullptr;
    if (l->head) {
        l->head->prev = n;
    } else {
        l->tail = n;
    }
    l->head = n;
}

void list_push_back(list* l, node* n) {
    n->next = nullptr;
    n->prev = l->tail;
    if (l->tail) {
        l->tail->next = n;
    } else {
        l->head = n;
    }
    l->tail = n;
}

void list_erase(list* l, node* n) {
    if (n->next) {
        n->next->prev = n->prev;
    } else {
        l->tail = n->prev;
    }
    if (n->prev) {
        n->prev->next = n->next;
    } else {
        l->head = n->next;
    }
}

Iterator traversal and erase

Goal: Traverse a container using iterators while removing some items.

Problem: Removing the element referenced by an iterator makes that iterator invalid.

Key idea: If an element is not removed during traversal, increment the iterator. However, if an element is removed, do not increment; instead, use the return value from container.erase. (C++ container erase(iterator) methods return an iterator referencing the first element after the removed element or elements.)

// Remove all elements of `map` whose values are even.
void remove_even_items(std::map<std::string, int>& map) {
    for (auto it = map.begin(); it != map.end(); ) {
        if (it->second % 2 == 0) {
            // remove item
            it = map.erase(it);
        } else {
            // do not remove item
            ++it;
        }
    }
}

Vector doubling for efficient growable arrays

Goal: Manage a growable, dynamically-allocated array. A sequence of array operations should use at most O(log M) allocations, where M is the maximum size of the array.

Problem: Naive strategies for growing an array, such as adding 1000 items at a time, have linear overhead: over time, each item in an array with N items is copied O(N) times and O(N) allocations are performed.

Key idea: Maintain both a size and a capacity. The size represents the number of items currently in the array, while the capacity represents the space available for elements. The size must always be less than or equal to the capacity. To add an element to the array, check the size against the capacity. If there is room for the new element (size < capacity), simply add the new element; otherwise (size == capacity), double the capacity first.

C++’s std::vector<T> type uses this pattern to manage its internal storage, and in C++, you should generally use std::vector rather than write your own. It’s still good to know that this doubling strategy reduces the overhead of space allocation to logarithmic.

struct vector_of_T {
    T* data = nullptr;
    size_t size = 0;
    size_t capacity = 0;
};

/** Access element 'i' of the array. */
T* vector_get(vector_of_T* v, size_t i) {
    assert(i < v->size);
    return &v->data[i];
}

/** Add a copy of `new_element` to the end of the array. */
void vector_push_back(vector_of_T* v, T new_element) {
    if (v->size == v->capacity) {
        size_t new_capacity;
        // Double the capacity, or create an initial capacity of 8
        if (v->capacity == 0) {
            new_capacity = 8;
        } else {
            new_capacity = v->capacity * 2;
        }
        // Or shorter: "new_capacity = v->capacity ? v->capacity * 2 : 8;"

        v->data = (T*) realloc(v->data, sizeof(T) * new_capacity);
        v->capacity = new_capacity;
    }

    v->data[v->size] = new_element;
    ++v->size;
}