Final – CS 61

Show all solutions Hide all solutions

Rules

The exam is open book, open note, open computer. You may access the book and your own notes. You may also use computers or electronic devices to access your own class materials and public class materials. Specifically:

You may access a browser and a PDF reader.
You may access your own notes and problem set code electronically.
You may access Internet sites on which your own notes and problem set code are stored.
You may access the course site.
You may access pages directly linked from the course site, including our lecture notes, exercises, and section notes.
You may access our lecture, problem set, and section code.
You may run a C or C++ compiler, including an assembler and linker.
You may access manual pages and common utilities, including calculators and a Python interpreter.

However:

You may not access class discussion forums, such as the Edboard, or other question forums, such as Stack Overflow.
You may not access an on-line disassembler or decompiler.
You may not access solutions from any previous exam, by paper or computer, except for those on the course site.
You may not broadly search the Internet for answers to questions or access other courses’ materials. Stick to our course site.
You absolutely may not contact humans or AI assistants with questions about the exam—whether in person, via IM, GitHub Copilot, or ChatGPT, or by posting questions to forums. The single exception is that you may contact course staff.

Any violations of this policy are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.

Students are taking the exam at different times. Do not post publicly about the exam until given permission to do so by course staff. This includes general comments about exam difficulty.

Questions

If you have questions during the exam, write us email at cs61-staff@g.harvard.edu. Don’t use the Edboard. We may not notice Edboard posts, and you aren’t supposed to access the Edboard anyway.

Completing your exam

You have 3 hours to complete the exam starting from the time you press Start. Different students may have different exams. Enter your answers directly in this form. A green checkmark appears when an answer is saved. (Click outside a text box to save the answer.)

Students with extra time accommodations: If your extra time is not reflected next to the start button, contact course staff and wait for confirmation before pressing it.

You should not need to attach a drawing or other longer notes, but you may do so by adding them to a final subdirectory in your cs61-f22-psets repository. Make sure you push your changes to GitHub before time is up, and explain in this form where we should look at your notes.

There is no need to explicitly “submit” the exam. Your changes are automatically saved with timestamps. We’ll grade the latest version saved within your exam window.

Notes

Assume a Linux operating system running on the x86-64 architecture unless otherwise stated. If you get stuck, move on to the next question. If you’re confused, explain your thinking briefly for potential partial credit.

1. Data representation (15 points)

Answer the following questions about objects A–L in this C++ program. “Object K” is the nameless object pointed to by L.

#define FLAG_1  1
#define FLAG_2  2
#define FLAG_4  4

const char* A = "five";
char B = sizeof(A);
char C = 37 & 31;
char D = FLAG_1 | FLAG_4;
unsigned char E = 260;
int F = INT_MAX;
unsigned int G = UINT_MAX;

struct hstruct {
    short h1;
    int h2;
};

struct istruct {
    char i1[2];
    short i2[2];
};

int main() {
    hstruct H;
    istruct I;
    int J[3];
    // NB: “K” is the nameless object L points to
    char* L = new char[4];
    memset(L, 0xFF, 4);
}

QUESTION 1A. Which of the objects A–L have the exact same data representation in memory?

3pt C and D, and G and K. Below is the data representation for all of the objects, generated with hexdump.

A     4000004020  9e 20 00 00 40 00 00 00                         
B     400000401b  08                                             
C     400000401a  05                                                
D     4000004019  05                                                
E     4000004018  04                                                
F     4000004014  ff ff ff 7f                                      
G     4000004010  ff ff ff ff                                       
H     400180558c  00 00 00 00 8c 58 80 01                          
I     4001805596  00 00 64 00 00 00                                
J     400180559c  00 00 00 00 00 10 00 00  00 00 00 00           
L     4001805580  a0 52 00 00 40 00 00 00                               
K     ..........  ff ff ff ff

QUESTION 1B. Which of the objects A–L have a size of 8 bytes?

QUESTION 1C. Which of the objects A–L have an alignment of 4?

QUESTION 1D. Which of the objects A–L have dynamic lifetime?

QUESTION 1E. Which of the objects A–L are stored in the data segment?

QUESTION 1F. Which of the following expressions, if placed in main, would produce undefined behavior?

J[0] += 1;
L[3] += 1;
L[4] += 1;
L += 4;
F += 1;
G += 1;
None of the above

QUESTION 1G. Which of the objects A–L will be freed when the main function returns?

QUESTION 1H. If the program is compiled with leak sanitization, which of the objects A–L will generate sanitizer reports when the program is run?

2. Kernel memory translation (16 points)

In the first parts of this question, you will design and implement a WeensyOS system call, sys_vtop, that returns the physical address corresponding to a virtual address accessible to the calling process. Here’s the description of sys_vtop from u-lib.hh:

// sys_vtop(va)
//    Return the physical address mapped to `va` in this process’s address
//    space. Returns `(uintptr_t) -1` if `va` is not mapped in this process’s
//    address space, or if `va` is not accessible to the process.

uintptr_t sys_vtop(uintptr_t va);

QUESTION 2A. Write out a definition for the system call wrapper function sys_vtop. (This is the user-level function defined in u-lib.hh that wraps the syscall instruction; see u-lib.hh for other examples.) Document any assumptions you make.

QUESTION 2B. In which source files and/or functions will the code for the implementation of the sys_vtop system call be placed?

QUESTION 2C. When the kernel implementation for this system call starts running, what expression will hold the value of the va argument provided by the process?

QUESTION 2D. Write the kernel implementation for this system call. You may want to refer to the vmiter interface. Document any assumptions you make. (Hint: Remember that the console is accessible to the process!)

2pt This code could go in syscall:

case SYSCALL_VTOP: {
    uintptr_t va = current->regs.reg_rdi;
    if (va >= MEMSIZE_VIRTUAL) {
        return -1;
    }
    vmiter it{current->pagetable, va};
    return it.user() ? it.pa() : -1;
}

QUESTION 2E. Give an example virtual address va where, in any process in any WeensyOS, sys_vtop(va) == va.

QUESTION 2F. In a correct WeensyOS, will sys_vtop ever return 0? Explain briefly.

QUESTION 2G. Any correct implementation of sys_vtop will obey certain laws enforced by x86-64’s virtual memory hardware, as well as laws imposed by the WeensyOS design. These laws could be expressed as assertions and used to (partially) validate sys_vtop, as in this check_vtop function:

uintptr_t check_vtop(uintptr_t va) {
    uintptr_t pa = sys_vtop(va);
    if (pa == (uintptr_t) -1) {
        return pa;
    }
    // Example assertion: Most x86-64 processors only support up to
    // 48-bit-long physical addresses.
    assert(pa < 0x0001'0000'0000'0000UL); // hardware
    // YOUR ASSERTIONS HERE
    return pa;
}

Write more assertions that will always hold for a correct sys_vtop. Mark each assertion as hardware or Weensy, depending on if the law is true because of hardware or WeensyOS constraints. There are many valid assertions; for full credit, you’ll need at least one hardware assertion and two independent Weensy assertions. (Note that an assertion like “assert(A && B)” counts as two independent assertions.)

QUESTION 2H. This WeensyOS kernel function tests whether a physical address pa is part of the memory used to represent a particular page table.

bool is_page_table_memory(uintptr_t pa, x86_64_pagetable* pgtbl) {
    for (ptiter it{pgtbl}; it.va() < MEMSIZE_VIRTUAL; it.next()) {
        if (pa >= it.pa() && pa < it.pa() + PAGESIZE) {
            return true;
        }
    }
    uintptr_t pgtbl_pa = (uintptr_t) pgtbl;
    return pa >= pgtbl_pa && pa < pgtbl_pa + PAGESIZE;
}

Complete the following WeensyOS kernel function, using is_page_table_memory as a subroutine.

// is_any_page_table_memory(uintptr_t pa)
//    Return true iff `pa` is part of the memory used to represent *any*
//    active page table in WeensyOS, including kernel and process page tables.
bool is_any_page_table_memory(uintptr_t pa) {

2pt

for (int i = 0; i != NPROC; ++i) {
    if (ptable[i].state != P_EMPTY && is_page_table_memory(pa, ptable[i].pagetable)) {
        return true;
    }
}
return is_page_table_memory(pa, kernel_pagetable);

QUESTION 2I. If a process calls sys_vtop(va), and the kernel computes the return value pa, then immediately before returning pa, the assertion assert(!is_any_page_table_memory(pa)) always holds. Why is this? Explain briefly, referring to important properties of kernels.

QUESTION 2J. When a process calls sys_vtop(va) and the kernel returns pa, there are some situations when, after some events occur, the expression is_any_page_table_memory(pa) could return true. Describe such a situation (what events would have to occur between the kernel returning pa and the call to is_any_page_table_memory(pa)).

3. Isolation potpourri (8 points)

QUESTION 3A. Only the kernel can disable timer interrupts. Why is this important for process isolation? Explain briefly.

QUESTION 3B. The x86-64 lcr3 instruction, which modifies the %cr3 register, can only be executed by the kernel. Why is this important for process isolation? Explain briefly.

QUESTION 3C. All general-purpose x86-64 instructions that access memory use virtual addresses, rather than physical addresses. Why is this important for process isolation? Explain briefly.

QUESTION 3D. Any pointer passed to the kernel as part of a system call must be validated, usually using vmiter, before the kernel accesses the corresponding memory. Why is this important for process isolation? Explain briefly.

4. Pipeline caching (12 points)

This problem concerns a simple pipeline program1 | program2. Assume that all important file I/O in the pipeline takes place over the pipe itself, meaning that any other file I/O (like error printouts) has no impact on performance. (Perhaps program1 generates its output computationally—e.g., by printing a list of numbers from 0 to 1 billion—and program2 counts the number of bytes in its standard input.)

QUESTION 4A. Which programs in this pipeline might benefit from applying the prefetching strategy to its file I/O, program1, program2, both, or neither?

QUESTION 4B. Which programs might benefit from applying the batching strategy to its file I/O, program1, program2, both, or neither?

QUESTION 4C. Which programs might benefit from applying the write coalescing strategy to its file I/O, program1, program2, both, or neither? Explain briefly.

QUESTION 4D. If program1 and program2 both use standard I/O, then three file I/O caches might be relevant to the performance of the program1 | program2 pipeline: program1’s stdio cache, the kernel’s pipe buffer, and program2’s stdio cache.

Describe an access pattern where all of these caches are helpful—in other words, where removing any of these caches would likely slow down the pipeline.

Example access patterns might include “sequential byte I/O”, meaning program1 writes one byte at a time to stdout and program2 reads one byte at a time from stdin; and “sequential 512-byte block writes and sequential byte reads”, meaning program1 writes 512 bytes at a time to stdout and program2 reads one byte at a time from stdin. An access pattern might also specify that a program pauses for some amount of time and/or flushes its write cache between its I/O accesses.

QUESTION 4E. Describe an access pattern where program1’s stdio cache is helpful, but program2’s stdio cache is irrelevant to performance.

QUESTION 4F. Describe an access pattern where the kernel’s pipe buffer is irrelevant to performance, or explain why such an access pattern is unlikely to exist.

QUESTION 4G. Reverse sequential I/O is not possible when reading or writing a pipe. Why not?

5. Cache and system call potpourri (10 points)

QUESTION 5A. Match each cache or cache-like object with the most relevant size. You will use each size at most once.

Buffer cache

Processor cache line

Processor cache

Stdio cache

64

1024

4096

131,072

8,589,934,592

QUESTION 5B. The mmap system call returns MAP_FAILED to indicate an error. Although mmap returns a pointer type, MAP_FAILED is not equal to nullptr. What does it equal and why? Select the best answer.

(void*) 1, because 1 is not a page-aligned address, and mmap always returns a page-aligned address.
(void*) 0xFFFF'FFFF, because system calls that fail return -1.
(void*) ~0UL, because system calls that fail return -1.
(void*) 0x8000'0000'0000'0000UL, because system calls that fail return negative numbers.

QUESTION 5C. Unix system calls have two return values: the actual return value, and, in case of error, the error code to be stored in errno. We can investigate the calling convention for these return values using an example program and gdb. Here is the start of such an example program:

#include <cassert>
#include <cstdio>
#include <cerrno>
#include <unistd.h>
#include <fcntl.h>

int main() {
    int f = open("/dev/null", O_RDONLY);
    fprintf(stderr, "returns %d, errno %d\n", f, errno);

    /* YOUR CODE HERE */  // syscall always fails (has error return value)
    fprintf(stderr, "returns %d, errno %d\n", f, errno);
}

Write a system call that always fails when executed at /* YOUR CODE HERE */ in this example. (You may assume the open always succeeds.)

QUESTION 5D. Using gdb to examine the assembly inside the open system call wrapper, we see this:

=> 0x7ffff7df26f7 <open64+39>:  mov    $0xffffffffffffff9c,%rdi
   0x7ffff7df26fe <open64+46>:  syscall 
   0x7ffff7df2700 <open64+48>:  cmp    $0xfffffffffffff000,%rax
   0x7ffff7df2706 <open64+54>:  ja     0x7ffff7df272d <open64+93>
   0x7ffff7df2708 <open64+56>:  retq 
...
   0x7ffff7df272d <open64+93>:  neg    %eax
   0x7ffff7df272f <open64+95>:  mov    %eax,0x20b9ab(%rip)        # 0x7ffff7ffe0e0 <rtld_errno>
   0x7ffff7df2735 <open64+101>: mov    $0xffffffff,%eax
   0x7ffff7df273a <open64+106>: retq

Based on this assembly, how does the user-level system call wrapper differentiate successful returns from error returns, and how does it extract the error code? Refer to specific instructions (such as +48 for instruction <open64+48>).

6. Pipe experiments (13 points)

Recall that a pipe is represented by a pair of file descriptors, the read end and the write end, where data written to the write end can be read from the read end. The two ends are initially two file descriptors in a single process, but fork and dup2 can cause multiple processes to have access to the ends.

QUESTION 6A. When does reading from the read end of a pipe block?

QUESTION 6B. When does reading from the read end of a pipe return end-of-file (0)?

QUESTION 6C. When does writing to the write end of a pipe block?

QUESTION 6D. Elizabeth Bishop has written a test program to investigate writing to pipes. Here’s her program. It writes bytes to a pipe one at a time; after each byte, it overwrites the file info.txt with the number of bytes written so far.

// program `bishop22`
#include <cstdio>
#include <cassert>
#include <unistd.h>

void writesize(size_t n) {
    FILE* f = fopen("info.txt", "w");
    fprintf(f, "%zu\n", n);
    fclose(f);
}

int main() {
    size_t n = 0;
    writesize(n);
    while (true) {
        ssize_t nw = write(STDOUT_FILENO, ".", 1);
        assert(nw == 1);
        ++n;
        writesize(n);
    }
}

She tests this program by running ./bishop22 | sleep 2000. After about 20 seconds, she hits Control-C, killing both programs in the pipeline. What will she see in info.txt, and/or what is the meaning of that value?

QUESTION 6E. The command line ./bishop22 | false exits immediately, and the info.txt file contains "0\n". However, the assertion on line 17 never triggers! What debugging technique from class would you suggest for investigating this behavior?

QUESTION 6F. Briefly explain this behavior (why in ./bishop22 | false, ./bishop22 writes "0\n" and then terminates).

QUESTION 6G. Now she runs ./bishop22a | false, where bishop22a uses stdio functions instead of write:

// program `bishop22a`
#include <cstdio>
#include <cassert>
#include <unistd.h>

void writesize(size_t n) {
    FILE* f = fopen("info.txt", "w");
    fprintf(f, "%zu\n", n);
    fclose(f);
}

int main() {
    size_t n = 0;
    writesize(n);
    while (true) {
        int ch = fputc('.', stdout);   // ← DIFFERENCE
        assert(ch == '.');
        ++n;
        writesize(n);
    }
}

After running this program, info.txt contains "4096\n", not "0\n"! What’s causing this difference? Explain briefly.

QUESTION 6H. Bishop notices that if she kills a command line like ./bishop22 | wc using Control-C, the info.txt file often contains a number of characters, but it is also often empty (zero-length). She considers this behavior a bug and decides to write a poem about this bug. What’s a good title for the poem?

The Lost Wakeup
The Undefined Behavior
The Kernel Crash
The Race Condition
The Fish
The Assertion Failure
The Man-Moth

QUESTION 6I. Pressing Control-C causes the kernel to deliver the SIGINT signal, which by default terminates a process, to each process in the command line. Give a location in ./bishop22 where delivering SIGINT would leave info.txt empty, referring to specific line numbers.

QUESTION 6J. The rename(oldpath, newpath) system call has interesting atomicity property (from man 2 rename):

“If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing.”

Change the writesize function to use rename, and thereby avoid the empty-info.txt bug. (Side note: Editors like vi and emacs use this rename technique all the time!)

7. Synchronization (11 points)

TicketBlaster has 2 million tickets to sell for a Taylor Swift concert tour. Their web site implements each customer connection using a thread. TicketBlaster plans to let 3.5 billion customers visit the website, so not every customer that wants a ticket will get one.

Here’s their bad code.

#include <thread>
...

size_t available_tickets = 2'000'000;
size_t ticket_price = ...;

const char* try_purchase(size_t ntickets) {
    if (available_tickets > ntickets) {
        available_tickets -= ntickets;
        charge_customer_helper(ntickets * ticket_price);
        return "Success";
    } else {
        return "You lose";
    }
}

void customer_thread(int fd) {
    // read number of tickets desired by customer
    size_t ntickets = receive_ntickets_from_customer_helper(fd);
    const char* message = try_purchase(ntickets);
    send_message_to_customer_helper(fd, message);
    close_customer_helper(fd);
}

int main() {
    while (true) {
        int fd = accept_new_customer_helper();
        std::thread th(customer_thread, fd);
        th.join();
    }
}

(You may assume that the *_helper functions perform their indicated functions correctly, but feel free to document any assumptions you make.)

QUESTION 7A. Write an assert statement (that is, an invariant) involving available_tickets that should hold at all times.

QUESTION 7B. TicketBlaster calls you in because their server is processing customers much slower than they anticipated. What is the problem?

QUESTION 7C. TicketBlaster “solves” this problem with a one-line change to main. The altered server is much faster, but has tons of undefined behavior. What debugging tool from class would you suggest TicketBlaster use to track down this undefined behavior?

QUESTION 7D. What is the source of the undefined behavior?

QUESTION 7E. Describe how to fix the undefined behavior. For full credit, give the new code and/or variable declarations required.

QUESTION 7F. (Aside; 1pt) Even after the undefined behavior is fixed, TicketBlaster’s server cannot sell all 2,000,000 tickets. Why not?

QUESTION 7G. TicketBlaster wants to improve the functionality of its server. Rather than immediately buying tickets or failing, a customer thread should:

Wait for ntickets to become available. (If the show is sold out, it’s OK to wait forever.)
Reserve ntickets at the current ticket_price.
Wait up to 10 minutes for the customer to confirm their purchase.
If confirmed, purchase the ntickets at the reserved ticket_price. (This step should never fail, because the tickets were reserved.)
If not confirmed, cancel the reservation, making the ntickets available for other customers.

Here’s the code they’ve got so far:

void new_customer_thread(int fd) {
    size_t ntickets = receive_ntickets_from_customer_helper(fd);
    // Step 1: Wait for `ntickets` to become available
    /* ???????????? */
    // Step 2: Reserve at the current price
    available_tickets -= ntickets;
    size_t reserved_price = ticket_price;
    // Step 3: Wait for confirmation (up to 10 minutes)
    bool confirmed = wait_for_confirmation_helper(fd);
    if (confirmed) {
        // Step 4: Purchase at reserved price
        charge_customer_helper(ntickets * reserved_price);
        send_message_to_customer_helper("Success");
    } else {
        // Step 5: Cancel
        available_tickets += ntickets;
        send_message_to_customer_helper("Cancelled");
    }
}

Fix this code to implement proper synchronization. For full credit, your code should block in Step 1 (rather than poll), but it should only block when fewer than ntickets are available.

4pt

// Step 1
m.lock();
while (available_tickets < ntickets) {
    cv.wait(m);
}
...
// Step 3
m.unlock();
bool confirmed = ...
    // Step 5: Cancel
    m.lock();
    available_tickets += ntickets;
    cv.notify_all(m);
    m.unlock();

8. Zombiepile (15 points)

QUESTION 8A. Which of the following errors are non-recoverable? List all that apply.

EINTR
EINVAL
EIO
EAGAIN
ECHILD

QUESTION 8B. When a child process terminates, it becomes a “zombie” until the parent collects its termination status using waitpid. Consider the following code:

int status;
pid_t r = waitpid(p, &status, 0);
assert(r == p);  // child process `p` terminated

int status2;
pid_t r2 = waitpid(p, &status2, 0);
???????

Assuming that the first assertion passes, which of the following outcomes are likely and which outcomes are possible for the second waitpid call?

Immediately returns r2 == p && status2 == status: The kernel delivers the exited process’s status again.
r2 == 0: No child process with ID p has exited.
r2 == p && WIFSTOPPED(status2): The zombie process has now stopped.
r2 == -1 && errno == ECHILD: No process named p is a child any more.
Eventually returns r2 == p, but possibly with a different status: After the first waitpid, another thread in the parent process created a new child process that coincidentally reused ID p.
None of the above.

QUESTION 8C. The following two code snippets intend to wait until either process p terminates, or a non-recoverable error occurs. Explain at a high level any differences between them, in terms of performance or functionality. Which version would you prefer? (Just a couple sentences, please.)

1.    pid_t r = 0;
      int status;
      while (r == 0 || (r == -1 && is_recoverable_error(errno))) {
          r = waitpid(p, &status, 0);
      }


2.    pid_t r = 0;
      int status;
      while (r == 0 || (r == -1 && is_recoverable_error(errno))) {
          r = waitpid(p, &status, WNOHANG);
      }

QUESTION 8D. Complete the following function, which waits until any of the processes in the pids array terminates or a non-recoverable error occurs.

You may assume that every element of pids is positive, and you may use the is_recoverable_error helper function from part B. You must not collect the termination status of any child process other than those in pids. (This is because a different function, or even a different thread, might be interested in other child processes’ termination statuses.) Use a polling, rather than blocking, approach.

// waitpile(pids, npids, status)
//    Waits until one of the processes with ids in `pids[0...npids-1]`
//    terminates or a non-recoverable error occurs. Returns the terminated
//    process’s ID (and stores its termination status in *status), or
//    returns -1 on error.

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    ...

2pt

while (true) {
    for (size_t i = 0; i != npids; ++i) {
        pid_t r = waitpid(pids[i], status, WNOHANG);
        if (r != 0 && (r != -1 || !is_recoverable_error(errno))) {
            return r;
        }
    }
}

QUESTION 8E. The waitpile function cannot be implemented in a blocking fashion unless you introduce an auxiliary data structure. Why not? List all that apply.

Only system calls can block, and waitpile is not a system call.
waitpid can block, but only for one specific process at a time.
waitpile must not collect the termination status for child processes not listed in pids, but waitpid(-1) can collect any child’s status.
None of the above.

QUESTION 8F. Bernadette Mayer needs to implement a blocking version of waitpile, so she designs an auxiliary data structure called a zombiepile. Any time her process waits for one or more children, it will use the zombiepile, which looks like this:

struct zombiepile {
    std::map<pid_t, int> status_map;

    // Adds `p` to the zombiepile with termination status `status`.
    void add(pid_t p, int status) {
        assert(!this->status_map.contains(p));
        this->status_map.insert({p, status});
    }

    // If `p` is contained in the zombiepile, set `*status` to `p`’s
    // termination status, remove `p` from the zombiepile, and return `p`.
    // Otherwise return 0.
    pid_t check(pid_t p, int* status) {
        auto it = this->status_map.find(p);
        if (it == this->status_map.end()) {
            return 0;
        }
        *status = it->second;
        this->status_map.erase(it);
        return p;
    }
};

extern zombiepile zp;   // single global zombiepile for the whole process

Write a blocking version of waitpile that uses zp. Your code will call zp.add and zp.check as well as waitpid. As before, waitpile should return as soon as any of the children in pids terminates. You may assume that waitpid never returns an error (it never returns –1), and you may assume that waitpile is called from a single-threaded process.

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    ...

3pt Assuming waitpid never returns an error:

while (true) {
    for (size_t i = 0; i != npids; ++i) {
        if (zp.check(pids[i], status)) {
            return pids[i];
        }
    }
    int xstatus;
    pid_t r = waitpid(-1, &xstatus, 0);
    if (r > 0) {
        zp.add(r, xstatus);
    }
}

Assuming it might return an error:

while (true) {
    for (size_t i = 0; i != npids; ++i) {
        if (zp.check(pids[i], status)) {
            return pids[i];
        }
    }
    for (size_t i = 0; i != npids; ++i) {
        pid_t r = waitpid(pids[i], status, WNOHANG);
        if (r == pids[i] || (r == -1 && !is_recoverable_error(errno))) {
            return r;
        }
    }
    int xstatus;
    pid_t r = waitpid(-1, &xstatus, 0);
    if (r > 0) {
        zp.add(r, xstatus);
    }
}

QUESTION 8G. In a multithreaded process, any thread can create a new child process, and any thread can wait for any child process. That means that zombiepile needs synchronization! Give an example of a problem that could occur if zombiepile lacked proper synchronization.

QUESTION 8H. Add proper synchronization to zombiepile and waitpile. As before, you may assume that waitpid never returns an error.

(Hint: There are several ways to solve this problem, including a polling solution, a blocking solution involving one or more auxiliary threads and/or signal handlers, and a blocking solution without any auxiliary thread. A polling solution will get partial credit. Make sure you avoid lost wakeups—this can be tricky if you don’t use an auxiliary thread.)

3pt Polling solution:

struct zombiepile {
    std::mutex m;
    ...

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    std::unique_lock guard(zp.m);
    while (true) {
        for (size_t i = 0; i != npids; ++i) {
            if (zp.check(pids[i], status)) {
                return pids[i];
            }
        }
        guard.unlock();
        int xstatus;
        pid_t r = waitpid(-1, &xstatus, WNOHANG);
        guard.lock();
        if (r > 0) {
            zp.add(r, xstatus);
        }
    }
}

Blocking solution with auxiliary thread—a detached thread that just calls waitpid and adds the termination statuses to the zombiepile:

struct zombiepile {
    std::mutex m;
    std::condition_variable_any cv;
    ...


static void auxthread() {
    // Let’s assume the `auxthread` is created on the first child process.
    // It might exit when there are no more child processes & then get
    // restarted—or we can just use the assumption that `waitpid` never fails
    while (true) {
        int xstatus;
        pid_t p = waitpid(-1, &xstatus, 0);
        if (p > 0) {
            std::unique_lock guard(zp.m);
            zp.add(p, xstatus);
            zp.cv.notify_all();
        }
    }
}

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    std::unique_lock guard(zp.m);
    while (true) {
        for (size_t i = 0; i != npids; ++i) {
            if (zp.check(pids[i], status)) {
                return pids[i];
            }
        }
        zp.cv.wait(guard);
    }
}

Blocking solution without auxiliary thread—note the care required to ensure that exactly one thread calls waitpid at a time (because if more than one thread called waitpid, one of them might experience a lost wakeup):

struct zombiepile {
    std::mutex m;
    std::condition_variable_any cv;
    bool has_waiter;
...

pid_t waitpile(const pid_t* pids, size_t npids, int* status) {
    std::unique_lock guard(zp.m);
    while (true) {
        for (size_t i = 0; i != npids; ++i) {
            if (zp.check(pids[i], status)) {
                return pids[i];
            }
        }
        for (size_t i = 0; i != npids; ++i) { // only req'd for error checking
            pid_t r = waitpid(pids[i], status, WNOHANG);
            if (r == pids[i] || (r == -1 && !is_recoverable_error(errno))) {
                return r;
            }
        }
        if (!zp.has_waiter) {
            zp.has_waiter = true;
            guard.unlock();
            int xstatus;
            pid_t r = waitpid(-1, &xstatus, 0);
            guard.lock();
            if (r > 0) {
                zp.add(r, xstatus);
                zp.cv.notify_all();
            }
            zp.has_waiter = false;
        } else {
            zp.cv.wait(guard);
        }
    }
}

9. The end (5 points)

QUESTION 9A. What improvements would you suggest to the class? Any answer but no answer will receive full credit.

QUESTION 9B. Which topics do you wish we had spent more time on? Any answer but no answer will receive full credit.

QUESTION 9C (0 points). Assuming you weren’t at the final review session, did you know that we have a TF named Jeremy?

2022 Final