CS 61

2022 Final


Rules

The exam is open book, open note, open computer. You may access the book and your own notes. You may also use computers or electronic devices to access your own class materials and public class materials. Specifically:

  • You may access a browser and a PDF reader.
  • You may access your own notes and problem set code electronically.
  • You may access Internet sites on which your own notes and problem set code are stored.
  • You may access the course site.
  • You may access pages directly linked from the course site, including our lecture notes, exercises, and section notes.
  • You may access our lecture, problem set, and section code.
  • You may run a C or C++ compiler, including an assembler and linker.
  • You may access manual pages and common utilities, including calculators and a Python interpreter.

However:

  • You may not access class discussion forums, such as the Edboard, or other question forums, such as Stack Overflow.
  • You may not access an on-line disassembler or decompiler.
  • You may not access solutions from any previous exam, by paper or computer, except for those on the course site.
  • You may not broadly search the Internet for answers to questions or access other courses’ materials. Stick to our course site.
  • You absolutely may not contact humans or AI assistants with questions about the exam—whether in person, via IM, GitHub Copilot, or ChatGPT, or by posting questions to forums. The single exception is that you may contact course staff.

Any violations of this policy are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Students are taking the exam at different times. Do not post publicly about the exam until given permission to do so by course staff. This includes general comments about exam difficulty.

Questions

If you have questions during the exam, email us at cs61-staff@g.harvard.edu. Don’t use the Edboard: we may not notice Edboard posts, and you aren’t supposed to access the Edboard anyway.

Completing your exam

You have 3 hours to complete the exam starting from the time you press Start. Different students may have different exams. Enter your answers directly in this form. A green checkmark appears when an answer is saved. (Click outside a text box to save the answer.)

Students with extra time accommodations: If your extra time is not reflected next to the start button, contact course staff and wait for confirmation before pressing it.

You should not need to attach a drawing or other longer notes, but you may do so by adding them to a final subdirectory in your cs61-f24-psets repository. Make sure you push your changes to GitHub before time is up, and explain in this form where we should look for your notes.

There is no need to explicitly “submit” the exam. Your changes are automatically saved with timestamps. We’ll grade the latest version saved within your exam window.

Notes

Assume a Linux operating system running on the x86-64 architecture unless otherwise stated. If you get stuck, move on to the next question. If you’re confused, explain your thinking briefly for potential partial credit.

1. Data representation

Answer the following questions about objects A–L in this C++ program. “Object K” is the nameless object pointed to by L.

#define FLAG_1  1
#define FLAG_2  2
#define FLAG_4  4

const char* A = "five";
char B = sizeof(A);
char C = 37 & 31;
char D = FLAG_1 | FLAG_4;
unsigned char E = 260;
int F = INT_MAX;
unsigned int G = UINT_MAX;

struct hstruct {
    short h1;
    int h2;
};

struct istruct {
    char i1[2];
    short i2[2];
};

int main() {
    hstruct H;
    istruct I;
    int J[3];
    // NB: “K” is the nameless object L points to
    char* L = new char[4];
    memset(L, 0xFF, 4);
}

QUESTION 1A. Which of the objects A–L have the exact same data representation in memory?

3pt C and D, and G and K. Below is the data representation for all of the objects, generated with hexdump.

A     4000004020  9e 20 00 00 40 00 00 00                         
B     400000401b  08                                             
C     400000401a  05                                                
D     4000004019  05                                                
E     4000004018  04                                                
F     4000004014  ff ff ff 7f                                      
G     4000004010  ff ff ff ff                                       
H     400180558c  00 00 00 00 8c 58 80 01                          
I     4001805596  00 00 64 00 00 00                                
J     400180559c  00 00 00 00 00 10 00 00  00 00 00 00           
L     4001805580  a0 52 00 00 40 00 00 00                               
K     ..........  ff ff ff ff

QUESTION 1B. Which of the objects A–L have a size of 8 bytes?

2pt A, H, and L.

QUESTION 1C. Which of the objects A–L have an alignment of 4?

2pt F, G, H, and J.

QUESTION 1D. Which of the objects A–L have dynamic lifetime?

1pt K

QUESTION 1E. Which of the objects A–L are stored in the data segment?

2pt B-G. (A is const so it is stored in the code segment.)

QUESTION 1F. Which of the following expressions, if placed in main, would produce undefined behavior?

  1. J[0] += 1;
  2. L[3] += 1;
  3. L[4] += 1;
  4. L += 4;
  5. F += 1;
  6. G += 1;
  7. None of the above

3pt 1, 3, and 5. (1) produces undefined behavior because J is an uninitialized local array, so J[0] holds an indeterminate value; reading that value is undefined. (3) accesses an out-of-bounds index (the new char[4] allocation holds indices 0–3). Signed integer overflow (5) is undefined behavior, but unsigned integer overflow (6) is not.

QUESTION 1G. Which of the objects A–L will be freed when the main function returns?

1pt All of them! When main returns, the process exits, and all its memory is reclaimed.

QUESTION 1H. If the program is compiled with leak sanitization, which of the objects A–L will generate sanitizer reports when the program is run?

1pt K, which is not explicitly freed.

2. Kernel memory translation

In the first parts of this question, you will design and implement a WeensyOS system call, sys_vtop, that returns the physical address corresponding to a virtual address accessible to the calling process. Here’s the description of sys_vtop from u-lib.hh:

// sys_vtop(va)
//    Return the physical address mapped to `va` in this process’s address
//    space. Returns `(uintptr_t) -1` if `va` is not mapped in this process’s
//    address space, or if `va` is not accessible to the process.

uintptr_t sys_vtop(uintptr_t va);

QUESTION 2A. Write out a definition for the system call wrapper function sys_vtop. (This is the user-level function defined in u-lib.hh that wraps the syscall instruction; see u-lib.hh for other examples.) Document any assumptions you make.

1pt

uintptr_t sys_vtop(uintptr_t va) {
    return make_syscall(SYSCALL_VTOP, va);
}

I’m assuming the system call number is SYSCALL_VTOP.

QUESTION 2B. In which source files and/or functions will the code for the implementation of the sys_vtop system call be placed?

2pt In kernel.cc, as a case in syscall.

QUESTION 2C. When the kernel implementation for this system call starts running, what expression will hold the value of the va argument provided by the process?

1pt current->regs.reg_rdi

QUESTION 2D. Write the kernel implementation for this system call. You may want to refer to the vmiter interface. Document any assumptions you make. (Hint: Remember that the console is accessible to the process!)

2pt This code could go in syscall:

case SYSCALL_VTOP: {
    uintptr_t va = current->regs.reg_rdi;
    if (va >= MEMSIZE_VIRTUAL) {
        return -1;
    }
    vmiter it{current->pagetable, va};
    return it.user() ? it.pa() : -1;
}

QUESTION 2E. Give an example virtual address va where, in any process in any WeensyOS, sys_vtop(va) == va.

1pt The console virtual/physical address, CONSOLE_ADDR.

QUESTION 2F. In a correct WeensyOS, will sys_vtop ever return 0? Explain briefly.

1pt No. The 0 physical page is reserved and marked inaccessible in every page table.

QUESTION 2G. Any correct implementation of sys_vtop will obey certain laws enforced by x86-64’s virtual memory hardware, as well as laws imposed by the WeensyOS design. These laws could be expressed as assertions and used to (partially) validate sys_vtop, as in this check_vtop function:

uintptr_t check_vtop(uintptr_t va) {
    uintptr_t pa = sys_vtop(va);
    if (pa == (uintptr_t) -1) {
        return pa;
    }
    // Example assertion: Most x86-64 processors only support up to
    // 48-bit-long physical addresses.
    assert(pa < 0x0001'0000'0000'0000UL); // hardware
    // YOUR ASSERTIONS HERE
    return pa;
}

Write more assertions that will always hold for a correct sys_vtop. Mark each assertion as hardware or Weensy, depending on if the law is true because of hardware or WeensyOS constraints. There are many valid assertions; for full credit, you’ll need at least one hardware assertion and two independent Weensy assertions. (Note that an assertion like “assert(A && B)” counts as two independent assertions.)

3pt

assert(pa >= 0x1000); // Weensy
assert(pa < MEMSIZE_PHYSICAL); // Weensy
assert((pa & 0xFFF) == (va & 0xFFF)); // hardware

QUESTION 2H. This WeensyOS kernel function tests whether a physical address pa is part of the memory used to represent a particular page table.

bool is_page_table_memory(uintptr_t pa, x86_64_pagetable* pgtbl) {
    for (ptiter it{pgtbl}; it.va() < MEMSIZE_VIRTUAL; it.next()) {
        if (pa >= it.pa() && pa < it.pa() + PAGESIZE) {
            return true;
        }
    }
    uintptr_t pgtbl_pa = (uintptr_t) pgtbl;
    return pa >= pgtbl_pa && pa < pgtbl_pa + PAGESIZE;
}

Complete the following WeensyOS kernel function, using is_page_table_memory as a subroutine.

// is_any_page_table_memory(uintptr_t pa)
//    Return true iff `pa` is part of the memory used to represent *any*
//    active page table in WeensyOS, including kernel and process page tables.
bool is_any_page_table_memory(uintptr_t pa) {

2pt

for (int i = 0; i != NPROC; ++i) {
    if (ptable[i].state != P_EMPTY && is_page_table_memory(pa, ptable[i].pagetable)) {
        return true;
    }
}
return is_page_table_memory(pa, kernel_pagetable);

QUESTION 2I. If a process calls sys_vtop(va), and the kernel computes the return value pa, then immediately before returning pa, the assertion assert(!is_any_page_table_memory(pa)) always holds. Why is this? Explain briefly, referring to important properties of kernels.

2pt If pa == -1, then pa is not a valid physical address and can be part of no page table. If pa != -1, then pa must not be used to represent a page table: pa is accessible to the process, and processes are never allowed to access memory used to represent page tables. This enforces process and kernel isolation.

QUESTION 2J. When a process calls sys_vtop(va) and the kernel returns pa, there are some situations when, after some events occur, the expression is_any_page_table_memory(pa) could return true. Describe such a situation (what events would have to occur between the kernel returning pa and the call to is_any_page_table_memory(pa)).

1pt If the process that called sys_vtop(va) exited, its process-accessible memory could be reused, including for page tables.

3. Isolation potpourri

QUESTION 3A. Only the kernel can disable timer interrupts. Why is this important for process isolation? Explain briefly.

2pt If a process could disable timer interrupts, it could run forever, preventing other processes from accessing machine resources.

QUESTION 3B. The x86-64 lcr3 instruction, which modifies the %cr3 register, can only be executed by the kernel. Why is this important for process isolation? Explain briefly.

2pt lcr3 modifies the current page table. If a process could change the currently executing page table, it could access and modify arbitrary memory, violating process isolation.

QUESTION 3C. All general-purpose x86-64 instructions that access memory use virtual addresses, rather than physical addresses. Why is this important for process isolation? Explain briefly.

2pt The kernel sets up each process’s page table to ensure that process can only access permitted memory. If a process could simply use physical addresses to access memory, it could bypass this protection.

QUESTION 3D. Any pointer passed to the kernel as part of a system call must be validated, usually using vmiter, before the kernel accesses the corresponding memory. Why is this important for process isolation? Explain briefly.

2pt The kernel can’t control what values a process passes to a system call. If the process passed an illegal value, such as a kernel pointer, or a pointer into reserved memory, the kernel might be tricked into performing an action that violated process isolation (the “confused deputy” problem).

4. Pipeline caching

This problem concerns a simple pipeline program1 | program2. Assume that all important file I/O in the pipeline takes place over the pipe itself, meaning that any other file I/O (like error printouts) has no impact on performance. (Perhaps program1 generates its output computationally—e.g., by printing a list of numbers from 0 to 1 billion—and program2 counts the number of bytes in its standard input.)

QUESTION 4A. Which programs in this pipeline might benefit from applying the prefetching strategy to its file I/O, program1, program2, both, or neither?

2pt program2 could benefit from prefetching. The prefetching strategy is useful for reads. program2 could benefit if the program reads the standard input a byte at a time; stdio would read 4,096 bytes, thus reducing the number of system calls.

QUESTION 4B. Which programs might benefit from applying the batching strategy to its file I/O, program1, program2, both, or neither?

2pt Both could use batching to reduce system calls.

QUESTION 4C. Which programs might benefit from applying the write coalescing strategy to its file I/O, program1, program2, both, or neither? Explain briefly.

2pt program2 will not benefit because it doesn’t write at all. Arguably program1 could benefit because batching will turn many short writes into one longer write. However, if you use the strict definition from class, in write coalescing redundant writes are eliminated by writing only the final version. In a pipeline there are no redundant writes because seeks aren’t allowed.

QUESTION 4D. If program1 and program2 both use standard I/O, then three file I/O caches might be relevant to the performance of the program1 | program2 pipeline: program1’s stdio cache, the kernel’s pipe buffer, and program2’s stdio cache.

Describe an access pattern where all of these caches are helpful—in other words, where removing any of these caches would likely slow down the pipeline.

Example access patterns might include “sequential byte I/O”, meaning program1 writes one byte at a time to stdout and program2 reads one byte at a time from stdin; and “sequential 512-byte block writes and sequential byte reads”, meaning program1 writes 512 bytes at a time to stdout and program2 reads one byte at a time from stdin. An access pattern might also specify that a program pauses for some amount of time and/or flushes its write cache between its I/O accesses.

2pt Sequential byte reads and sequential byte writes.

QUESTION 4E. Describe an access pattern where program1’s stdio cache is helpful, but program2’s stdio cache is irrelevant to performance.

2pt Sequential byte writes and sequential 4096-byte reads.

QUESTION 4F. Describe an access pattern where the kernel’s pipe buffer is irrelevant to performance, or explain why such an access pattern is unlikely to exist.

1pt In a trivial access pattern where program1 writes no data and exits, the pipe buffer is irrelevant!

Also, in an access pattern where program1 writes a byte at a time, flushes, and then pauses, the size of the pipe buffer is irrelevant.

QUESTION 4G. Reverse sequential I/O is not possible when reading or writing a pipe. Why not?

1pt Pipes are streaming files, and do not support the lseek system call.

5. Cache and system call potpourri

QUESTION 5A. Match each cache or cache-like object with the most relevant size. You will use each size at most once.

  1. Buffer cache
  2. Processor cache line
  3. Processor cache
  4. Stdio cache
  A. 64
  B. 1024
  C. 4096
  D. 131,072
  E. 8,589,934,592

3pt 1–E, 2–A, 3–D, 4–C

QUESTION 5B. The mmap system call returns MAP_FAILED to indicate an error. Although mmap returns a pointer type, MAP_FAILED is not equal to nullptr. What does it equal and why? Select the best answer.

  1. (void*) 1, because 1 is not a page-aligned address, and mmap always returns a page-aligned address.
  2. (void*) 0xFFFF'FFFF, because system calls that fail return -1.
  3. (void*) ~0UL, because system calls that fail return -1.
  4. (void*) 0x8000'0000'0000'0000UL, because system calls that fail return negative numbers.

2pt 3.

QUESTION 5C. Unix system calls have two return values: the actual return value, and, in case of error, the error code to be stored in errno. We can investigate the calling convention for these return values using an example program and gdb. Here is the start of such an example program:

#include <cassert>
#include <cstdio>
#include <cerrno>
#include <unistd.h>
#include <fcntl.h>

int main() {
    int f = open("/dev/null", O_RDONLY);
    fprintf(stderr, "returns %d, errno %d\n", f, errno);

    /* YOUR CODE HERE */  // syscall always fails (has error return value)
    fprintf(stderr, "returns %d, errno %d\n", f, errno);
}

Write a system call that always fails when executed at /* YOUR CODE HERE */ in this example. (You may assume the open always succeeds.)

2pt Example: write(f, ".", 1), because f is open read-only.

QUESTION 5D. Using gdb to examine the assembly inside the open system call wrapper, we see this:

=> 0x7ffff7df26f7 <open64+39>:  mov    $0xffffffffffffff9c,%rdi
   0x7ffff7df26fe <open64+46>:  syscall 
   0x7ffff7df2700 <open64+48>:  cmp    $0xfffffffffffff000,%rax
   0x7ffff7df2706 <open64+54>:  ja     0x7ffff7df272d <open64+93>
   0x7ffff7df2708 <open64+56>:  retq 
...
   0x7ffff7df272d <open64+93>:  neg    %eax
   0x7ffff7df272f <open64+95>:  mov    %eax,0x20b9ab(%rip)        # 0x7ffff7ffe0e0 <rtld_errno>
   0x7ffff7df2735 <open64+101>: mov    $0xffffffff,%eax
   0x7ffff7df273a <open64+106>: retq   

Based on this assembly, how does the user-level system call wrapper differentiate successful returns from error returns, and how does it extract the error code? Refer to specific instructions (such as +48 for instruction <open64+48>).

3pt Unsigned return values greater than 0xffff'ffff'ffff'f000 indicate errors (the cmp at +48 and the ja at +54). For such values, the wrapper negates the value to obtain the error code (the neg at +93), stores it in errno (+95), and returns -1 (+101).

6. Pipe experiments

Recall that a pipe is represented by a pair of file descriptors, the read end and the write end, where data written to the write end can be read from the read end. The two ends are initially two file descriptors in a single process, but fork and dup2 can cause multiple processes to have access to the ends.

QUESTION 6A. When does reading from the read end of a pipe block?

1pt When the pipe contains no unread data and at least one write end is still open. (If all write ends are closed, read returns end-of-file instead of blocking.)

QUESTION 6B. When does reading from the read end of a pipe return end-of-file (0)?

1pt When all write ends of the pipe have been closed.

QUESTION 6C. When does writing to the write end of a pipe block?

1pt When the kernel’s pipe buffer is full.

QUESTION 6D. Elizabeth Bishop has written a test program to investigate writing to pipes. Here’s her program. It writes bytes to a pipe one at a time; after each byte, it overwrites the file info.txt with the number of bytes written so far.

// program `bishop22`
#include <cstdio>
#include <cassert>
#include <unistd.h>

void writesize(size_t n) {
    FILE* f = fopen("info.txt", "w");
    fprintf(f, "%zu\n", n);
    fclose(f);
}

int main() {
    size_t n = 0;
    writesize(n);
    while (true) {
        ssize_t nw = write(STDOUT_FILENO, ".", 1);
        assert(nw == 1);
        ++n;
        writesize(n);
    }
}

She tests this program by running ./bishop22 | sleep 2000. After about 20 seconds, she hits Control-C, killing both programs in the pipeline. What will she see in info.txt, and/or what is the meaning of that value?

1pt "65536\n": the size of the pipe buffer.

QUESTION 6E. The command line ./bishop22 | false exits immediately, and the info.txt file contains "0\n". However, the assertion on line 17 (the assert(nw == 1) after the write) never triggers! What debugging technique from class would you suggest for investigating this behavior?

1pt strace (or gdb)!

QUESTION 6F. Briefly explain this behavior (why in ./bishop22 | false, ./bishop22 writes "0\n" and then terminates).

2pt When a process writes to a pipe whose read end is closed, the kernel delivers that process a SIGPIPE signal, which by default kills that process. So ./bishop22 writes "0\n" and then terminates by SIGPIPE.

QUESTION 6G. Now she runs ./bishop22a | false, where bishop22a uses stdio functions instead of write:

// program `bishop22a`
#include <cstdio>
#include <cassert>
#include <unistd.h>

void writesize(size_t n) {
    FILE* f = fopen("info.txt", "w");
    fprintf(f, "%zu\n", n);
    fclose(f);
}

int main() {
    size_t n = 0;
    writesize(n);
    while (true) {
        int ch = fputc('.', stdout);   // ← DIFFERENCE
        assert(ch == '.');
        ++n;
        writesize(n);
    }
}

After running this program, info.txt contains "4096\n", not "0\n"! What’s causing this difference? Explain briefly.

2pt The stdio cache, a single-slot 4096-byte cache, buffers 4096 characters before attempting to write anything to stdout. Thus, the first chance bishop22a has to detect that the pipe is closed comes after 4096 fputc calls.

QUESTION 6H. Bishop notices that if she kills a command line like ./bishop22 | wc using Control-C, the info.txt file often contains a number of characters, but it is also often empty (zero-length). She considers this behavior a bug and decides to write a poem about this bug. What’s a good title for the poem?

  1. The Lost Wakeup
  2. The Undefined Behavior
  3. The Kernel Crash
  4. The Race Condition
  5. The Fish
  6. The Assertion Failure
  7. The Man-Moth

1pt 4. This is a race condition because it is a behavior difference dependent on scheduling.

QUESTION 6I. Pressing Control-C causes the kernel to deliver the SIGINT signal, which by default terminates a process, to each process in the command line. Give a location in ./bishop22 where delivering SIGINT would leave info.txt empty, referring to specific line numbers.

1pt Between lines 7 and 9, i.e., inside writesize after fopen("info.txt", "w") has truncated the file but before the buffered fprintf output reaches it. We did not accept answers suggesting that SIGINT is delivered before the first writesize call, since there is no guarantee that info.txt exists and is nonempty before we run the program.

QUESTION 6J. The rename(oldpath, newpath) system call has an interesting atomicity property (from man 2 rename):

“If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing.”

Change the writesize function to use rename, and thereby avoid the empty-info.txt bug. (Side note: Editors like vi and emacs use this rename technique all the time!)

2pt

void writesize(size_t n) {
    FILE* f = fopen("info.txt~", "w");
    fprintf(f, "%zu\n", n);
    fclose(f);
    rename("info.txt~", "info.txt");
}

7. Synchronization

TicketBlaster has 2 million tickets to sell for a Taylor Swift concert tour. Their web site implements each customer connection using a thread. TicketBlaster plans to let 3.5 billion customers visit the website, so not every customer that wants a ticket will get one.

Here’s their bad code.

#include <thread>
...

size_t available_tickets = 2'000'000;
size_t ticket_price = ...;

const char* try_purchase(size_t ntickets) {
    if (available_tickets > ntickets) {
        available_tickets -= ntickets;
        charge_customer_helper(ntickets * ticket_price);
        return "Success";
    } else {
        return "You lose";
    }
}

void customer_thread(int fd) {
    // read number of tickets desired by customer
    size_t ntickets = receive_ntickets_from_customer_helper(fd);
    const char* message = try_purchase(ntickets);
    send_message_to_customer_helper(fd, message);
    close_customer_helper(fd);
}

int main() {
    while (true) {
        int fd = accept_new_customer_helper();
        std::thread th(customer_thread, fd);
        th.join();
    }
}

(You may assume that the *_helper functions perform their indicated functions correctly, but feel free to document any assumptions you make.)

QUESTION 7A. Write an assert statement (that is, an invariant) involving available_tickets that should hold at all times.

2pt assert(available_tickets <= 2'000'000). (The available_tickets >= 0 half of the invariant holds automatically, since available_tickets is an unsigned size_t.)

QUESTION 7B. TicketBlaster calls you in because their server is processing customers much slower than they anticipated. What is the problem?

1pt This code processes one customer at a time because of th.join().

QUESTION 7C. TicketBlaster “solves” this problem with a one-line change to main. The altered server is much faster, but has tons of undefined behavior. What debugging tool from class would you suggest TicketBlaster use to track down this undefined behavior?

1pt Sanitizers.

QUESTION 7D. What is the source of the undefined behavior?

1pt Data races on available_tickets.

QUESTION 7E. Describe how to fix the undefined behavior. For full credit, give the new code and/or variable declarations required.

1pt Add a std::mutex to protect available_tickets, and acquire that mutex using std::scoped_lock/unique_lock in try_purchase.

QUESTION 7F. (Aside; 1pt) Even after the undefined behavior is fixed, TicketBlaster’s server cannot sell all 2,000,000 tickets. Why not?

1pt It can sell 1,999,999; the > ntickets should be >= ntickets.

QUESTION 7G. TicketBlaster wants to improve the functionality of its server. Rather than immediately buying tickets or failing, a customer thread should:

  1. Wait for ntickets to become available. (If the show is sold out, it’s OK to wait forever.)
  2. Reserve ntickets at the current ticket_price.
  3. Wait up to 10 minutes for the customer to confirm their purchase.
  4. If confirmed, purchase the ntickets at the reserved ticket_price. (This step should never fail, because the tickets were reserved.)
  5. If not confirmed, cancel the reservation, making the ntickets available for other customers.

Here’s the code they’ve got so far:

void new_customer_thread(int fd) {
    size_t ntickets = receive_ntickets_from_customer_helper(fd);
    // Step 1: Wait for `ntickets` to become available
    /* ???????????? */
    // Step 2: Reserve at the current price
    available_tickets -= ntickets;
    size_t reserved_price = ticket_price;
    // Step 3: Wait for confirmation (up to 10 minutes)
    bool confirmed = wait_for_confirmation_helper(fd);
    if (confirmed) {
        // Step 4: Purchase at reserved price
        charge_customer_helper(ntickets * reserved_price);
        send_message_to_customer_helper(fd, "Success");
    } else {
        // Step 5: Cancel
        available_tickets += ntickets;
        send_message_to_customer_helper(fd, "Cancelled");
    }
}

Fix this code to implement proper synchronization. For full credit, your code should block in Step 1 (rather than poll), but it should only block when fewer than ntickets are available.

4pt

// Step 1 (assuming a std::mutex m and std::condition_variable cv)
std::unique_lock<std::mutex> lock(m);
while (available_tickets < ntickets) {
    cv.wait(lock);
}
...
// Step 3
lock.unlock();
bool confirmed = ...
    // Step 5: Cancel
    lock.lock();
    available_tickets += ntickets;
    cv.notify_all();
    lock.unlock();

8. Zombiepile

QUESTION 8A. Which of the following errors are non-recoverable? List all that apply.

  1. EINTR
  2. EINVAL
  3. EIO
  4. EAGAIN
  5. ECHILD

1pt EINVAL, EIO, ECHILD

QUESTION 8B. When a child process terminates, it becomes a “zombie” until the parent collects its termination status using waitpid. Consider the following code:

int status;
pid_t r = waitpid(p, &status, 0);
assert(r == p);  // child process `p` terminated

int status2;
pid_t r2 = waitpid(p, &status2, 0);
???????

Assuming that the first assertion passes, which of the following outcomes are likely and which outcomes are possible for the second waitpid call?

  1. Immediately returns r2 == p && status2 == status: The kernel delivers the exited process’s status again.
  2. r2 == 0: No child process with ID p has exited.
  3. r2 == p && WIFSTOPPED(status2): The zombie process has now stopped.
  4. r2 == -1 && errno == ECHILD: No process named p is a child any more.
  5. Eventually returns r2 == p, but possibly with a different status: After the first waitpid, another thread in the parent process created a new child process that coincidentally reused ID p.
  6. None of the above.

2pt Likely: 4. Possible (but extraordinarily unlikely): 5. Impossible: 1, 2, 3.

QUESTION 8C. The following two code snippets intend to wait until either process p terminates, or a non-recoverable error occurs. Explain at a high level any differences between them, in terms of performance or functionality. Which version would you prefer? (Just a couple sentences, please.)

1.    pid_t r = 0;
      int status;
      while (r == 0 || (r == -1 && is_recoverable_error(errno))) {
          r = waitpid(p, &status, 0);
      }


2.    pid_t r = 0;
      int status;
      while (r == 0 || (r == -1 && is_recoverable_error(errno))) {
          r = waitpid(p, &status, WNOHANG);
      }

2pt #1 blocks and #2 polls! I’d generally prefer #1 because it wastes less CPU time and energy.

QUESTION 8D. Complete the following function, which waits until any of the processes in the pids array terminates or a non-recoverable error occurs.

You may assume that every element of pids is positive, and you may use the is_recoverable_error helper function from part B. You must not collect the termination status of any child process other than those in pids. (This is because a different function, or even a different thread, might be interested in other child processes’ termination statuses.) Use a polling, rather than blocking, approach.

// waitpile(pids, npids, status)
//    Waits until one of the processes with ids in `pids[0...npids-1]`
//    terminates or a non-recoverable error occurs. Returns the terminated
//    process’s ID (and stores its termination status in *status), or
//    returns -1 on error.

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    ...

2pt

while (true) {
    for (size_t i = 0; i != npids; ++i) {
        pid_t r = waitpid(pids[i], status, WNOHANG);
        if (r != 0 && (r != -1 || !is_recoverable_error(errno))) {
            return r;
        }
    }
}

QUESTION 8E. The waitpile function cannot be implemented in a blocking fashion unless you introduce an auxiliary data structure. Why not? List all that apply.

  1. Only system calls can block, and waitpile is not a system call.
  2. waitpid can block, but only for one specific process at a time.
  3. waitpile must not collect the termination status for child processes not listed in pids, but waitpid(-1) can collect any child’s status.
  4. None of the above.

1pt 3.

QUESTION 8F. Bernadette Mayer needs to implement a blocking version of waitpile, so she designs an auxiliary data structure called a zombiepile. Any time her process waits for one or more children, it will use the zombiepile, which looks like this:

struct zombiepile {
    std::map<pid_t, int> status_map;

    // Adds `p` to the zombiepile with termination status `status`.
    void add(pid_t p, int status) {
        assert(!this->status_map.contains(p));
        this->status_map.insert({p, status});
    }

    // If `p` is contained in the zombiepile, set `*status` to `p`’s
    // termination status, remove `p` from the zombiepile, and return `p`.
    // Otherwise return 0.
    pid_t check(pid_t p, int* status) {
        auto it = this->status_map.find(p);
        if (it == this->status_map.end()) {
            return 0;
        }
        *status = it->second;
        this->status_map.erase(it);
        return p;
    }
};

extern zombiepile zp;   // single global zombiepile for the whole process

Write a blocking version of waitpile that uses zp. Your code will call zp.add and zp.check as well as waitpid. As before, waitpile should return as soon as any of the children in pids terminates. You may assume that waitpid never returns an error (it never returns -1), and you may assume that waitpile is called from a single-threaded process.

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    ...

3pt Assuming waitpid never returns an error:

while (true) {
    for (size_t i = 0; i != npids; ++i) {
        if (zp.check(pids[i], status)) {
            return pids[i];
        }
    }
    int xstatus;
    pid_t r = waitpid(-1, &xstatus, 0);
    if (r > 0) {
        zp.add(r, xstatus);
    }
}

Assuming it might return an error:

while (true) {
    for (size_t i = 0; i != npids; ++i) {
        if (zp.check(pids[i], status)) {
            return pids[i];
        }
    }
    for (size_t i = 0; i != npids; ++i) {
        pid_t r = waitpid(pids[i], status, WNOHANG);
        if (r == pids[i] || (r == -1 && !is_recoverable_error(errno))) {
            return r;
        }
    }
    int xstatus;
    pid_t r = waitpid(-1, &xstatus, 0);
    if (r > 0) {
        zp.add(r, xstatus);
    }
}

QUESTION 8G. In a multithreaded process, any thread can create a new child process, and any thread can wait for any child process. That means that zombiepile needs synchronization! Give an example of a problem that could occur if zombiepile lacked proper synchronization.

1pt Data race (specifically, on zombiepile::status_map)

QUESTION 8H. Add proper synchronization to zombiepile and waitpile. As before, you may assume that waitpid never returns an error.

(Hint: There are several ways to solve this problem, including a polling solution, a blocking solution involving one or more auxiliary threads and/or signal handlers, and a blocking solution without any auxiliary thread. A polling solution will get partial credit. Make sure you avoid lost wakeups—this can be tricky if you don’t use an auxiliary thread.)

3pt Polling solution:

struct zombiepile {
    std::mutex m;
    ...

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    std::unique_lock guard(zp.m);
    while (true) {
        for (size_t i = 0; i != npids; ++i) {
            if (zp.check(pids[i], status)) {
                return pids[i];
            }
        }
        guard.unlock();
        int xstatus;
        pid_t r = waitpid(-1, &xstatus, WNOHANG);
        guard.lock();
        if (r > 0) {
            zp.add(r, xstatus);
        }
    }
}

Blocking solution with auxiliary thread—a detached thread that just calls waitpid and adds the termination statuses to the zombiepile:

struct zombiepile {
    std::mutex m;
    std::condition_variable_any cv;
    ...


static void auxthread() {
    // Let’s assume `auxthread` is started when the first child process is
    // created. It might exit when there are no more child processes and be
    // restarted later—or we can just rely on the assumption that `waitpid`
    // never fails.
    while (true) {
        int xstatus;
        pid_t p = waitpid(-1, &xstatus, 0);
        if (p > 0) {
            std::unique_lock guard(zp.m);
            zp.add(p, xstatus);
            zp.cv.notify_all();
        }
    }
}

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    std::unique_lock guard(zp.m);
    while (true) {
        for (size_t i = 0; i != npids; ++i) {
            if (zp.check(pids[i], status)) {
                return pids[i];
            }
        }
        zp.cv.wait(guard);
    }
}

Blocking solution without auxiliary thread—note the care required to ensure that at most one thread calls the blocking waitpid at a time (because if more than one thread called waitpid, one of them might experience a lost wakeup):

struct zombiepile {
    std::mutex m;
    std::condition_variable_any cv;
    bool has_waiter = false;
...

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    std::unique_lock guard(zp.m);
    while (true) {
        for (size_t i = 0; i != npids; ++i) {
            if (zp.check(pids[i], status)) {
                return pids[i];
            }
        }
        for (size_t i = 0; i != npids; ++i) { // only req'd for error checking
            pid_t r = waitpid(pids[i], status, WNOHANG);
            if (r == pids[i] || (r == -1 && !is_recoverable_error(errno))) {
                return r;
            }
        }
        if (!zp.has_waiter) {
            zp.has_waiter = true;
            guard.unlock();
            int xstatus;
            pid_t r = waitpid(-1, &xstatus, 0);
            guard.lock();
            if (r > 0) {
                zp.add(r, xstatus);
                zp.cv.notify_all();
            }
            zp.has_waiter = false;
        } else {
            zp.cv.wait(guard);
        }
    }
}

9. The end

QUESTION 9A. What improvements would you suggest to the class? Any answer but no answer will receive full credit.

5pts for A-C

QUESTION 9B. Which topics do you wish we had spent more time on? Any answer but no answer will receive full credit.

QUESTION 9C (0 points). Assuming you weren’t at the final review session, did you know that we have a TF named Jeremy?

Thank you for taking the course!