CS 61

2023 Final


Rules

The exam is open book, open note, open computer. You may access the book and your own notes. You may also use computers or electronic devices to access your own class materials and public class materials. Specifically:

  • You may access a browser and a PDF reader.
  • You may access your own notes and problem set code electronically.
  • You may access Internet sites on which your own notes and problem set code are stored.
  • You may access the course site.
  • You may access pages directly linked from the course site, including our lecture notes, exercises, and section notes.
  • You may access any web pages referenced by the "Notes" section of the lecture PowerPoint slides.
  • You may access our lecture, problem set, and section code.
  • You may run a C or C++ compiler, including an assembler and linker.
  • You may access manual pages and common utilities, including calculators and a Python interpreter.

However:

  • You may not access class discussion forums, such as the Edboard, or other question forums, such as Stack Overflow.
  • You may not access an on-line disassembler or decompiler.
  • You may not access solutions from any previous exam, by paper or computer, except for those on the course site.
  • You may not broadly search the Internet for answers to questions or access other courses’ materials. Stick to our course site and web pages referenced by the lecture PowerPoint slides.
  • You absolutely may not contact other humans or AI assistants with questions about the exam—whether in person, via IM, via GitHub Copilot, or by posting questions to forums. The single exception is that you may contact course staff.

Any violations of this policy are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Students are taking the exam at different times. Do not post publicly about the exam until given permission to do so by course staff. This includes general comments about exam difficulty.

Questions

If you have questions during the exam, raise your hand if you are taking the exam in person. Otherwise, email us at cs61-staff@g.harvard.edu. Don’t use the Edboard. We may not notice Edboard posts, and you aren’t supposed to access the Edboard anyway.

Completing your exam

You have 3 hours (180 minutes) to complete the exam starting from the time you press Start. Different students may have different exams. Enter your answers directly in this form. A green checkmark appears when an answer is saved. (Click outside a text box to save the answer.)

Students with extra time accommodations: If your extra time is not reflected next to the start button, contact course staff and wait for confirmation before pressing it.

There is no need to explicitly “submit” the exam. Your changes are automatically saved with timestamps on the grading server. We’ll grade the latest version saved within your exam window. If you experience bad network connectivity while taking the exam, be sure to save a copy of your answers locally; after the exam, immediately email those answers to cs61-staff@g.harvard.edu. Course staff will also have physical copies of the exam in case you cannot access the official version hosted on the grading server.

Notes

The exam contains 14 questions in total (Q1A-B, Q2A-D, Q3A, Q4A-C, and Q5A-D). Assume a Linux operating system running on the x86-64 architecture unless otherwise stated. If you get stuck, move on to the next question. If you’re confused, explain your thinking briefly for potential partial credit.

1. Compiling, Assembling, and Linking (12 points)

QUESTION 1A. Suppose that a single program will launch four different threads. Further suppose that each thread has the same entry point f(), i.e., for each thread, the function f is passed to the constructor for std::thread. If we use a tool like objdump to look at the machine instructions in the program's executable binary, will we see four copies of the instructions in f (i.e., one per thread), or one copy of the instructions in f, or something else? Explain your answer.

6pt Just one copy. Threads execute in the same address space and thus share the same code region in that address space. So, there will be just one copy of f's code in the executable binary.

QUESTION 1B. In the program from above, suppose that we continue to use objdump to explore the executable binary. Will objdump be able to find a section in the binary which contains the program's heap? Why or why not?

6pt No. The heap is a runtime aspect of the program, and thus is not embedded in the executable. The executable contains only the program's code and static data, which the loader uses to initialize the program's runtime state; the heap itself is created and grown only at runtime, by memory allocation.

2. Kernels (28 points)

QUESTION 2A. In Lecture 9, we looked at how system calls worked, examining how a kernel must configure an x86-64 chip to properly handle syscall instructions. Suppose that, during the kernel's boot, the kernel sets %IA32_LSTAR to the boot-time %rsp instead of to reinterpret_cast<uint64_t>(syscall_entry). Why is this so catastrophic? Explain what will happen during the first time that user-level code invokes a syscall instruction.

8pt The CPU will jump to whatever address happened to be the top of the stack during kernel boot time! This location does not contain the code for the intended system call handler, but instead contains random bytes that, when interpreted as instructions, will lead to unpredictable CPU behavior. [A more detailed answer might say that, instead of the behavior being unpredictable, the behavior will definitely cause a page fault because the kernel will mark the page table entry for the kernel stack as being non-executable (NX). However, NX protections need not be mentioned to receive full credit.]

QUESTION 2B. In WeensyOS, why is the kernel stack empty when the CPU is executing user-level code?

6pt A stack is empty when there is no active computation occurring for the associated entity. When user-level code is executing in WeensyOS, the kernel has no active computation occurring, thus the kernel's stack is empty.

QUESTION 2C. Recall that, on an x86 chip, assigning to %cr3 is a privileged operation. Further recall that, on x86 chips, each entry in the interrupt descriptor table (IDT) contains the virtual address for the handler function for a particular interrupt or exception.

Suppose that a buggy WeensyOS kernel accidentally sets the virtual address in an IDT entry to be an address for a function in user memory (not kernel memory). When the relevant interrupt or exception occurs and that user-level function executes, would that function be able to set %cr3 to point to the kernel's page table (allowing the user-level function to access all memory)? Why or why not? You can assume that the user-level function does indeed contain an instruction that would try to set %cr3 to point to the kernel's page table.

8pt Yes. Assigning to %cr3 is a privileged operation, but when an exception or interrupt occurs, the CPU automatically flips the CPU's mode from "unprivileged" to "privileged" (if the mode wasn't already "privileged"), meaning that the user-mode function would be allowed to assign to %cr3.

QUESTION 2D. On a multi-core machine, why does each core need its own TLB? In your answer, describe what could go wrong if each core shared a single TLB.

6pt Each core has its own %cr3 register and thus may be using a page table that no other core is simultaneously using. So, each core may have unique virtual-to-physical mappings, meaning that each core needs its own TLB. If two cores shared a single TLB but used different page tables, the two page tables could map the same virtual address to different physical pages, say P0 and P1. A TLB entry for that virtual address inserted on behalf of one core would then be wrong if used by the other core!

3. Unicode (7 points)

QUESTION 3A. For an arbitrary string of characters in an arbitrary human language, is the UTF-32 encoding of that string guaranteed to always be larger than the UTF-8 encoding? Why?

7pt No. Some characters in UTF-8 are encoded using 4 bytes. If the string to translate only contained such characters, then the UTF-8 encoding of the string would be the same size as the UTF-32 encoding!

4. Storage (21 points)

QUESTION 4A. Consider the following C++ code:

#define NUM_ELEMENTS 32768
#define NUM_ITERATIONS 1000000

unsigned int arr[NUM_ELEMENTS];
unsigned int sum = 0;
for (unsigned int i = 0; i < NUM_ITERATIONS; ++i) {
    unsigned int index = (i * 8192) % NUM_ELEMENTS;
    arr[index] = i + sum;
    sum += i;
}

Assume that the variables sum, i, and index live in registers, whereas elements in arr live in memory. Do the memory accesses to arr exhibit temporal locality, spatial locality, both, or neither? Why?

7pt The memory references have good temporal locality but poor spatial locality: the index expression cycles through just four distinct elements (indices 0, 8192, 16384, and 24576), which are accessed over and over, but those elements sit 32 KiB apart in memory.

QUESTION 4B. Consider a one-block cache that has an implementation bug. Suppose that block b is currently cached and is clean. Further suppose that sometimes, when the application requests block b, the cache doesn't return the cached block, but instead performs a read of the slow underlying storage, evicting the cache block and replacing it with the (equivalent!) one that was just fetched from slow storage. Is this buggy cache still a coherent cache? Why?

6pt Yes, because the cache still returns what a direct read from storage would return.

QUESTION 4C. Suppose that a single-threaded process X is performing memory-mapped IO (both reads and writes) on "foo.txt", whereas a single-threaded process Y is performing memory-mapped IO (both reads and writes) on "bar.txt". In what situations (if any) would there be memory-based race conditions on buffer cache pages?

8pt No race conditions would be possible. Each file has a set of memory pages in the kernel's buffer cache, none of which are shared with the other file. So, X will not access memory pages being accessed by Y, and vice versa, so there are no races across processes. Also, each process is single-threaded, so there are no races that can be generated within the threads belonging to a single process.

5. Processes, Threads, and Synchronization (32 points)

QUESTION 5A. Consider a single-threaded process that receives a signal. Is it possible for the signal handler to run in parallel with the normal application code in the process, such that, on a two-core machine, the signal handler and the normal application code would execute simultaneously? Why?

9pt No, this isn't possible. When the kernel context-switches to the signaled process, the kernel hijacks the process's normal execution flow, preventing the normal application context from being restored, and instead setting up the context of the signal handler. The normal application context won't be restored until the signal handler has finished. So, at any given time, either the normal application code is running, exclusive-or the signal handler is running.

QUESTION 5B. Suppose that, for each process on a Linux machine, you can inspect the process's virtual memory area (VMA) data structure maintained by Linux; however, you can't inspect the process's page table. Using only knowledge in the VMAs, could you determine how many pages of physical RAM are being used by all processes? Why?

7pt No. VMAs do not contain virtual-to-physical mappings, so just by looking at VMAs, you couldn't detect aliasing relationships in which different virtual pages (in either the same process or different processes) refer to the same physical page.

QUESTION 5C. Consider a multi-threaded process that contains two ticket locks T1 and T2. Assume that the locks use the implementation from Lecture 23. Explain how ticket locks intrinsically prevent deadlock, or give an example of how deadlock could nonetheless arise with T1 and T2 (and explain why ticket locks don't prevent the deadlock).

7pt Deadlock can still arise! Thread X grabs T1 at the same time that thread Y grabs T2, and then X blocks trying to grab T2 while Y blocks trying to grab T1. Deadlock will occur because the fact that each individual lock has a fair acquisition order does not imply that there is a global order over all locks. [Note that some students may give a simpler answer---namely, that a thread can deadlock itself (without assistance from another thread) simply by trying to acquire the same lock twice. This answer receives full credit.]

QUESTION 5D. Consider the following (buggy!) implementation of a spinlock:

//This variable holds the spinlock.
std::atomic<unsigned> m = 0; //Initially unheld
void lock() {
    m++; //Atomic read/modify/write
    while (m != 1) { //Atomic read; note that the read
                     //might fetch a value greater than
                     //1 if multiple threads are trying
                     //to simultaneously grab the lock!
        m = 0;  //Atomic write
        pause();
        m++;  //Atomic read/modify/write
    }
    //The hope is that, when `lock()` returns,
    //m is 1, indicating that a single thread
    //owns the lock.
}

void unlock() {
    m--; //Atomic read/modify/write.
}

Explain why this implementation is broken, giving a concrete example of how two threads can both own the lock simultaneously. Hint: In your example, assume that, at the beginning, a thread T1 calls lock() with no other threads active, such that m is set to 1. Devise a bad interleaving of instructions from threads T2 and T3 who subsequently call lock() while T1 is still in the critical section.

9pt T1 grabs the lock, setting m to 1. T2 increments m to 2, enters the while loop, and sets m to 0. T3 then increments m to 1, and in the while test sees that m does equal 1 and exits with the lock. T1 and T3 both have the lock now! [Note that there are a variety of other ways to cause a problem; this lock implementation is very bad LOL. For example, an even simpler lack of mutual exclusion is caused if T1 grabs the lock, and then T2 tries to grab the lock. T2 will increment m to 2, enter the while loop, set m to 0, pause, and then increment m to 1 and go to the top of the loop. At this point, m will equal 1, so T2 will exit the loop, such that both T1 and T2 have the lock!]