Datarep Section 2: Data representation exercises

In today’s section, we’ll walk through a selection of our data representation exercises, which are problems taken from prior exams. Also take the opportunity to ask questions about the problem set and class material.

College students are required to attend section live. TFs will record your attendance. For more on the attendance policy and what to do if you miss section refer to the syllabus here and here.

Extension students are welcome to attend section live (in-person or zoom) or watch the recording (Canvas > DCE Class Recordings). Both methods count towards your participation grade. If you attend a live section, TFs will record your attendance. If you watch the recording, fill out this reflection form.

All exercises assume a 64-bit x86-64 architecture unless otherwise stated.

The main subjects of these exercises are:

DATAREP-1. Sizes and alignments

QUESTION DATAREP-1A. True or false: For any non-array type X, the size of X (sizeof(X)) is greater than or equal to the alignment of type X (alignof(X)).

QUESTION DATAREP-1B. True or false: For any type T, the size of struct Y { T a; char newc; } is greater than the size of T.

QUESTION DATAREP-1C. True or false: For any types T1...Tn (with n ≥ 1), the size of struct Y is greater than the size of struct X, given:

struct X {
    T1 m1;
    ...
    Tn mn;
};
struct Y {
    T1 m1;
    ...
    Tn mn;
    char newc;
};

QUESTION DATAREP-1D. True or false: For any types T1...Tn (with n ≥ 1), the size of struct Y is greater than the size of union X, given:

union X {
    T1 m1;
    ...
    Tn mn;
};
struct Y {
    T1 m1;
    ...
    Tn mn;
};

QUESTION DATAREP-1E. Assume that structure struct Y { ... } contains K char members and M int members, with KM, and nothing else. Write an expression defining the maximum sizeof(struct Y).

QUESTION DATAREP-1F. Given struct Z { T1 a; T2 b; T3 c; }, which contains no padding, what does (sizeof(T1) + sizeof(T2) + sizeof(T3)) % alignof(struct Z) equal?

QUESTION DATAREP-1G. Arrange the following types in increasing order by size. Sample answer: “1 < 2 = 4 < 3” (choose this if #1 has smaller size than #2, which has equal size to #4, which has smaller size than #3).

  1. char
  2. struct minipoint { uint8_t x; uint8_t y; uint8_t z; }
  3. int
  4. unsigned short[1]
  5. char**
  6. double[0]

DATAREP-2. Expression matching

QUESTION DATAREP-2A. Consider the following eight expressions:

  1. m
  2. sizeof(&m)
  3. -1
  4. m & -m
  5. m + ~m + 1
  6. 32 >> 2
  7. m & ~m
  8. 1

For one unique value of m, those eight expressions can be grouped into four matched pairs where the two expressions in each pair have the same value, and expressions in different pairs have different values. What are the values of the expressions for that m?

DATAREP-38. Memory layout

The following code computes a Fibonacci number.

#include <cerrno>
#include <getopt.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <unistd.h>

// copied from <cstdio> for your reference:
#define EOF (-1)

// copied from <getopt.h> for your reference:
extern int optind;

static void usage() {
    fprintf(stderr, "Usage: ./fib NUMBER\n");
    exit(1);
}

unsigned long fib(unsigned long n) {
    if (n < 2) {
        return n;
    } else {
        unsigned long tmp1 = fib(n - 1);
        unsigned long tmp2 = fib(n - 2);
        return tmp1 + tmp2;
    }
}

int main(int argc, char *argv[]) {
    // process command line options
    char ch;
    while ((ch = getopt(argc, argv, "h")) != EOF) {
        switch (ch) {
        case 'h':
        case '?':
        default:
            usage();
        }
    }

    // skip processed options
    argc -= optind;
    argv += optind;

    // require 1 argument
    if (argc != 1) {
        usage();
    }

    // parse argument
    unsigned long n = strtoul(strdup(argv[0]), nullptr, 10);
    if (n == 0 && errno == EINVAL) {
        usage();
    }

    // print fib(n)
    unsigned long f = fib(n);
    printf("fib(%lu) = %lu\n", n, f);
}

QUESTION DATAREP-38A. What are fib(0), fib(1), and fib(2)?

QUESTION DATAREP-38B. What are sizeof(fib(0)), sizeof(fib(1)), and sizeof(fib(2))?

For parts C–H, select from the list below the part of memory that best describes where you will find the object.

  1. In the heap
  2. In the stack
  3. Between the heap and the stack
  4. In a read-only data segment
  5. In a code segment
  6. In a read/write data segment

QUESTION DATAREP-38C. The string "fib(%lu) = %lu\n" (line 58).

QUESTION DATAREP-38D. optind (line 12)

QUESTION DATAREP-38E. tmp1 (line 23)

QUESTION DATAREP-38F. fib (lines 19-27)

QUESTION DATAREP-38G. strdup (line 51)

QUESTION DATAREP-38H. The string being passed to strtoul (line 51). (Read the manual page for strdup first!)

QUESTION DATAREP-38I. Does this program contain a memory leak?

QUESTION DATAREP-38J. Consider a call to fib(10) that was called from fib(11). Will fib(10)’s tmp1 variable have a larger, smaller, or equal address as fib(11)’s?

Kinds of memory error

Memory errors are common enough in C and C++ that some frequently-occurring classes of error have specific names.

  1. Invalid access: A read or write access that doesn’t access an object. Many kinds of invalid access have their own names, including:

    1. Null pointer dereference: A read or write access to a null pointer. (The null pointer never points to a live object.)
    2. Boundary read/write error: A read or write access immediately past the beginning or end of an object.
    3. Out-of-bounds access: A read or write access past the boundaries of an array.
    4. Stack buffer overflow: A write access past the end of an array with automatic lifetime (a local variable).
    5. Unaligned access: A read or write access to a pointer that’s not properly aligned for its type. (Every live object has a properly-aligned address.)
    6. Wild read/write: A read or write access to a pointer that’s nowhere near a live object.
    7. Aliasing violation: An access using a pointer that doesn’t correspond to the object type stored at that address (for instance, accessing a std::string using an int* pointer).
  2. Use-after-free: A read or write access to an object after its lifetime has ended. (This term is commonly applied when a program accesses a dynamically-allocated object after it was explicitly freed, but it is also possible to access a local object after it’s deallocated.)

  3. Invalid free: Passing a pointer to free (or delete) that was not allocated by malloc (or new).

  4. Double free: Freeing dynamically-allocated memory that has already been freed.

  5. Modify const: A write access to an object that is read-only (was originally declared const).

Invalid accesses and use-after-free errors violate the live object rule. Invalid and double frees violate the rules for dynamic memory allocation, but don’t violate the live object rules in and of themselves.

Sometimes a program that causes a memory error will crash right away; for instance, dereferencing a null pointer almost always crashes the program. This is good! We want errors to crash the program—a reliable crash is far easier to debug. Sometimes, though, the consequences of an error are more insidious. An error might cause no problem on some OSes, and cause a huge security hole in others. Sanitizers (make SAN=1) will catch many errors, but not all.

DATAREP-47. Hulk buffers

In the following questions, you’re asked to describe arguments to hulk_greeting that cause various kinds of memory error, including no memory error at all. You can find this program in cs61-sections/datareps2/hulk.cc.

const char* hulk_players[] = {
    "RUFFALO",
    "BANA",
    "FERRIGNO",
    "MIYAUCHI"
};

void hulk_greeting(const char* name) {
    char uc_name[16];

    // make uppercase version of `name`
    for (size_t i = 0; name[i] != 0; ++i) {
        uc_name[i] = toupper(name[i]);
    }
    uc_name[strlen(name)] = 0;
    printf("HULK SAY HELLO TO %s\n", uc_name);

    // find player
    for (size_t i = 0; hulk_players[i] != nullptr; ++i) {
        if (strcmp(hulk_players[i], uc_name) == 0) {
            printf("YOU GOOD ACTOR\n");
            return;
        }
    }
    printf("YOU NO PLAY HULK! HULK SMASH\n");
}

QUESTION DATAREP-47A. What’s an argument to hulk_greeting that would cause no memory errors?

QUESTION DATAREP-47B. What’s an argument to hulk_greeting that would cause a stack buffer overflow?

QUESTION DATAREP-47C. What’s an argument to hulk_greeting that would cause an out-of-bounds read?

QUESTION DATAREP-47D. What’s an argument to hulk_greeting that would cause a null pointer dereference?

QUESTION DATAREP-47E. What’s an argument to hulk_greeting that would cause a wild read?

QUESTION DATAREP-47F. What’s an argument to hulk_greeting that would cause a use after free error?

DATAREP-11. Dynamic memory allocation

QUESTION DATAREP-11A. True or false?

  1. free(nullptr) is an error.
  2. malloc(0) can never return nullptr.

QUESTION DATAREP-11B. Give values for sz and nmemb so that calloc(sz, nmemb) will always return nullptr (on a 64-bit x86-64 machine), but malloc(sz * nmemb) might or might not return null.

For OFFLINE parts C–F, consider the following 8 statements. (p and q have type char* and start out as uninitialized variables.)

  1. free(p);
  2. free(q);
  3. p = q;
  4. q = nullptr;
  5. p = (char*) malloc(12);
  6. q = (char*) malloc(8);
  7. p[8] = 0;
  8. q[4] = 0;

This tool will check your answers for parts C–F, but you should work out an answer to your satisfaction before checking it.

OFFLINE QUESTION DATAREP-11C. Put the statements in an order that would execute without error or evoking undefined behavior. Memory leaks count as errors. Use each statement exactly once. Sample answer: “abcdefgh.”

OFFLINE QUESTION DATAREP-11D. Put the statements in an order that would cause one double-free error, and no other error or undefined behavior (except possibly one memory leak). Use each statement exactly once.

OFFLINE QUESTION DATAREP-11E. Put the statements in an order that would cause one memory leak (one allocated piece of memory is not freed), and no other error or undefined behavior. Use each statement exactly once.

OFFLINE QUESTION DATAREP-11F. Put the statements in an order that would cause one boundary write error, and no other error or undefined behavior. Use each statement exactly once.