Test 1 – CS 61

Show all solutions Hide all solutions

Rules

The exam is open book, open note, limited open computer. You may access the textbook and your own notes. You may also use computers or electronic devices to access your own class materials and public class materials. Specifically:

You may access your own notes and problem set code electronically.
You may access Internet sites on which your own notes and problem set code are stored.
You may access a browser and a PDF reader.
You may access the course site.
You may access pages directly linked from the course site, including our lecture notes, exercises, and section notes, and reference material, such as cppreference.com.
You may access our lecture, problem set, and section code.
You may run a C++ compiler, including an assembler and linker.
You may access manual pages and common utilities, including calculators and a Python interpreter.

However:

You may not access class discussion forums or other question forums such as Stack Overflow.
You may not access an on-line disassembler or compiler explorer.
You may not access solutions from any previous exam, by paper or computer, except for those on the course site.
You may not broadly search the Internet for answers to questions or access other courses’ materials. Stick to our course site.
You may not interact with AI models (e.g., ask ChatGPT a question).
You absolutely may not contact other humans with questions about the exam—whether in person, via IM, or by posting questions to forums—with the exception of course staff.

Any violations of this policy are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Additionally, students are taking the exam at different times. Do not post publicly about the exam until given permission to do so by course staff. This includes general comments about exam difficulty.

Questions

If you have questions during the exam, write us email at cs61-staff@g.harvard.edu. Don’t use the Edboard. We may not notice Edboard posts, and you aren’t supposed to access the Edboard anyway.

Completing your exam

You have 90 minutes to complete the exam starting from the time you press Start. Different students may have different exams. Enter your answers directly in this form. A green checkmark appears when an answer is saved.

Students with extra time accommodations: If your extra time is not reflected next to the start button, contact course staff and wait for confirmation before pressing it.

You should not need to attach a drawing or other longer notes, but you may do so by adding them to a test1 subdirectory in your cs61-f24-psets repository. Make sure you push your changes to GitHub before time is up, and explain in this form where we should look at your notes.

There is no need to explicitly “submit” the exam. Your changes are automatically saved with timestamps. We’ll grade the latest version saved within your exam window.

Notes

Assume a Linux operating system running on the x86-64 architecture unless otherwise stated. If you get stuck, move on to the next question. If you’re confused, explain your thinking briefly for potential partial credit.

Some figures and code samples can be pinned to your browser so you can keep them visible as you scroll through the test. Push the 📌 and then drag the pinned figure where you want it.

1. Our governing body (20 points)

The Harvard Corporation has attempted to complete pset 1. Their implementation has a lot of problems, but they refuse to admit they’ve done anything wrong. Your job is to prove that they have problems by attacking their implementation.

QUESTION 1A. This code, from m61_calloc, intends to detect integer overflow. Write a call to m61_calloc that causes integer overflow at the marked line, but does not trigger the check. (We’re omitting file and line parameters.)

void* m61_calloc(size_t count, size_t sz) {
    if (count * sz > default_buffer.size) { // check for overflow
        return nullptr;
    }
    void* ptr = m61_malloc(count * sz);
    ...

QUESTION 1B. They tried again with this version. Write a call to m61_calloc that causes undefined behavior with this version.

void* m61_calloc(size_t count, size_t sz) {
    if (count > default_buffer.size / sz) { // check for overflow
        return nullptr;
    }
    void* ptr = m61_malloc(count * sz);
    ...

QUESTION 1C. This code, from m61_malloc and m61_free, intends to account for active allocations. (This is from Phase 1 of the pset, before memory reuse.) Write a sequence of calls to m61_malloc and/or m61_free that cause num_active_alloc to become negative, which should be impossible.

long num_active_alloc = 0;
long num_total_alloc = 0;

void* m61_malloc(size_t sz) {
    if (sz == 0) {
        return nullptr;
    }
    ...
    ++num_active_alloc;
    ++num_total_alloc;
    return ptr;
}

void* m61_free(void* ptr) {
    --num_active_alloc; // not reusing memory, so nothing else to do
}

QUESTION 1D. This code, from m61_malloc, attempts to account for alignment requirements.

void* m61_malloc(size_t sz) {
    if (default_buffer.pos + sz > default_buffer.size) {
        return nullptr;
    }
    if (default_buffer.pos % 16 != 0) {
        default_buffer.pos += default_buffer.pos % 16; // align allocation
    }
    void* ptr = &default_buffer.buffer[default_buffer.pos];
    default_buffer.pos += sz;
    return ptr;
}

But the // align allocation line is wrong. What should it be instead?

QUESTION 1E. Given Question 1D’s uncorrected code, write a sequence of m61_malloc calls where the last call will return a nonnull pointer outside the bounds of default_buffer. (Recall that default_buffer.size == 0x800000.)

2. An investigation (15 points)

These questions concern an investigation into data representation for standard C++ data structures, similar to the one we performed in Lecture 3. A set of objects are read out of standard input and inserted, in sorted order, into a data structure a of type CONTAINER<T>. Then the following loop is used to print the contents of the data structure:

CONTAINER<T> a;
for (auto& x : a) {
    print_object(x);
    printf("\n");
}

This prints the data representations of the contents of the container, starting from the first element and going to the last. Elements are separated by a blank line.

You will use the output of that loop to guess the likely values for CONTAINER and T. CONTAINER is always either std::vector or std::list, but T might be any basic type or well-known C++ library type.

QUESTION 2A. Give likely CONTAINER (vector or list) and T for this output.

7208'0000'0020  01 00 00 00

7208'0000'0024  02 00 00 00

7208'0000'0028  03 00 00 00

7208'0000'002c  04 00 00 00

QUESTION 2B. Give likely CONTAINER and T for this output.

7204'0000'0020  31

7204'0000'0021  32

7204'0000'0022  33

QUESTION 2C. Give likely CONTAINER and T for this output.

7208'0000'0070  31

7208'0000'0090  32

7208'0000'0050  33

7208'0000'00b0  34

QUESTION 2D. Give likely CONTAINER and T for this output.

720c'0000'0040  50 00 00 00 0c 72 00 00  0a 00 00 00 00 00 00 00
720c'0000'0050  43 53 36 31 2d 52 6f 63  6b 73 00 00 00 00 00 00

720c'0000'0010  20 00 00 00 0c 72 00 00  05 00 00 00 00 00 00 00
720c'0000'0020  48 65 6c 6c 6f 00 00 00  00 00 00 00 00 00 00 00

720c'0000'0070  90 00 00 00 0c 72 00 00  23 00 00 00 00 00 00 00
720c'0000'0080  23 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00

QUESTION 2E. Give likely CONTAINER and T for this output.

7210'0000'0000  01 00 00 00 00 00 00 00

7210'0000'0008  02 00 00 00 00 00 00 00

7210'0000'0010  03 00 00 00 00 00 00 00

7210'0000'0018  04 00 00 00 00 00 00 00

7210'0000'0020  05 00 00 00 00 00 00 00

3. Abstract machine (20 points)

We have studied both properties of the C++ abstract machine, which should be true across all implementations, and properties of a typical implementation (x86-64 Linux). For each of the following statements, say whether it is true for any implementation of the C++ abstract machine; true on the x86-64 implementations we’ve studied; or not true. In your answer, write “abstract machine”, “conventional x86-64”, or “false”.

QUESTION 3A. sizeof(char) == 1.

QUESTION 3B. For any type T, and given an array T arr[1], sizeof(arr) == sizeof(T).

QUESTION 3C. For any struct type T, sizeof(T) > 1.

QUESTION 3D. The stack grows down, so if main calls function f, then the addresses of f’s local variables are smaller than the addresses of main’s local variables.

QUESTION 3E. The address of a dynamically-allocated object (such as a new int) is larger than the address of any global variable.

QUESTION 3F. The address of a dynamically-allocated object is different from the address of any global variable.

QUESTION 3G. Every dynamically-allocated object has a unique address.

QUESTION 3H. For a simple struct object with components T1 x1; T2 x2; ...; Tn xn;, the address of the xn component is always greater than the address of the x1 component.

QUESTION 3I. For a function with local variables T1 x1; T2 x2; ...; Tn xn;, the address of the xn object is always greater than the address of the x1 object.

QUESTION 3J. The amount of padding added at the end of a struct for alignment purposes is always less than 16 bytes.

2pts. Conventional x86-64. The C++ abstract machine does have alignment, but the fact that it is always less than 16 bytes is defined by the x86-64 Conventions. The abstract machine says that the size must be a multiple of the alignment, but doesn't specify upper bounds on what that alignment is.

We can imagine a system where:
struct foo {
    T x; // T has alignment 32
    char y;
    // we will need 31 bytes of alignment here
};
Whereas the system we studied in class (Conventional X86-64) has std::max_align_t == 16, so we can conclude there is never a need to add at least 16 bytes of padding.

4. Assembly (20 points)

Here is some assembly produced by a compiler for a C++ function named f. The function involves traversing an array.

_Z1f....:
    xor    %eax,%eax
    test   %rsi,%rsi
    je     L2
L1:
    dec    %rsi
    lea    (%rdi,%rsi,4),%rdx
    xor    %ecx,%ecx
    mov    (%rdx),%ecx
    cmp    $0x5,%ecx
    jae    L1
    add    %ecx,%eax
    test   %rsi,%rsi
    jne    L1
L2:
    ret

QUESTION 4A. What is the most likely number of parameters in the C++ declaration for function f? Explain briefly.

QUESTION 4B. What are the likely type(s) of the parameter(s)?

QUESTION 4C. What is the likely type of the return value?

QUESTION 4D. Give argument values for which the function returns 0.

QUESTION 4E. Give argument values for which the source C++ function would always execute undefined behavior.

QUESTION 4F. List an instruction in the function definition that is redundant (i.e., could be removed without changing function behavior).

QUESTION 4G. Does the function traverse the array “forwards” (from index 0 up) or “backwards” (from index N down)?

QUESTION 4H. Write C++ code that could have compiled into the function.

4pts.
unsigned f(unsigned* x, size_t n) {
    unsigned sum = 0;
    while (n > 0) {
        --n;
        if (x[n] < 5) {
            sum += x[n];
        }
    }
    return sum;
}
We loop through the array backwards, find the address of each element using the index, dereference the address, and add the value to %eax (which starts at 0) if the element is less than 5, and return %eax at the end.

QUESTION 4I. Under what conditions might f’s C++ declaration have had more than the most likely number of parameters? List all that apply; explain briefly if unsure.

The function definition did not use all of the declared parameters.
The compiler determined that the function code using the surplus parameters had no observable effect.
The compiler determined that the function code using the surplus parameters was only invoked if earlier code caused undefined behavior.
Multiple function parameters were packed into single registers for the purpose of argument passing.
None of the above.

QUESTION 4J. Under what conditions might f’s C++ declaration have had fewer than the most likely number of parameters? List all that apply; explain briefly if unsure.

At least one function parameter was a reference argument.
At least one function parameter had size larger than 8 bytes.
The compiler determined that callers of the function were passing more arguments than the declaration had parameters.
None of the above.

2pts. Only 2. If 1 were true, said parameter would likely be passed in as a pointer (and still appear in the assembly). 2 is possible. If we have:
struct foo_t {
    long x;
    long y;
};
void f(foo_t bar);
Then it is possible that f will simply split the components inside the struct into two registers, despite taking in only one argument. If 3 were true, the compiler would throw an error. 4 is false because 2 is true.

5. Assembly matching (20 points)

The following eight assembly functions each take at least one argument and return a result. The functions can be paired so that the two elements of each pair always produce identical results given identical parameters. Your job is to find the matching pairs.

One of these functions uses an instruction you haven’t seen before: rorq, or rotate right. Like shrq, the bitwise shift-right instruction, rorq SRC, DST shifts the bits in DST right by SRC places; but then rorq takes the bits that were shifted off and moves them into the vacated more-significant places in DST. Here’s a C++ implementation of rorq:

unsigned long rorq(unsigned src, unsigned long dst) {
    for (unsigned i = 0; i < src; ++i) {
        unsigned long low_bit = dst & 1;
        dst = (dst >> 1) | (low_bit << 63);  // rotate by one place
    }
    return dst;
}

The functions:

```
f0:
    xorl %eax, %eax
    ret
```

f1:
    imulq $2, %rsi
    leaq (%rdi,%rsi), %rax
    movw (%rax), %ax
    movzwl %ax, %eax
    retq

f2:
    movq $0, %rax
    movw (%rdi,%rsi,2), %ax
    retq

f3:
    movq %rdi, -8(%rsp)
    testq %rsi, %rsi
    jge L4
    movl %edi, -4(%rsp)
    shrq $32, %rdi
    movl %edi, -8(%rsp)
L4:
    movq -8(%rsp), %rax
    retq

f4:
    movq %rdi, %rdx
    addq %rdi, %rdx
    addq %rdi, %rdx
    addq %rdi, %rdx
    leaq -1(%rdx), %rax
    ret

f5:
    salq $0x2, %rdi
    movq %rdi, %rax
    decq %rax
    ret

f6:
    movq %rsi, %rcx
    cmpq $0, %rcx
    sets %cl
    negl %ecx
    andl $0x20, %ecx
    rorq %cl, %rdi
    movq %rdi, %rax
    retq

f7:
L1:
    testq %rax, %rax
    je L2
    decq %rax
    jmp L1
L2:
    ret

QUESTION 5A. Name the four matching pairs, or as many of them as you can find.

QUESTION 5B. Which function(s) access the red zone?

QUESTION 5C. Which function(s) contain a loop?

QUESTION 5D. Which function(s) might access objects in a caller’s local variables, and under what circumstances?

6. The end (5 points)

QUESTION 6A. What topics are you interested in spending more time on? Any answer but no answer will receive full credit.

2024 CS 61 Test 1