CS 61 2016

SH-2. Process management

Here’s the skeleton of a shell function implementing a simple two-command pipeline, such as “cmd1 | cmd2”.

void simple_pipe(const char* cmd1, char* const* argv1, const char* cmd2, char* const* argv2) {
    int pipefd[2], r, status;

[A]

    pid_t child1 = fork();
    if (child1 == 0) {

[B]

        execvp(cmd1, argv1);
    }
    assert(child1 > 0);

[C]

    pid_t child2 = fork();
    if (child2 == 0) {

[D]

        execvp(cmd2, argv2);
    }
    assert(child2 > 0);

[E]

And here is a grab bag of system calls.

[1] close(pipefd[0]);
[2] close(pipefd[1]);
[3] dup2(pipefd[0], STDIN_FILENO);
[4] dup2(pipefd[0], STDOUT_FILENO);
[5] dup2(pipefd[1], STDIN_FILENO);
[6] dup2(pipefd[1], STDOUT_FILENO);
[7] pipe(pipefd);
[8] r = waitpid(child1, &status, 0);
[9] r = waitpid(child2, &status, 0);

Your task is to assign system call IDs, such as “1”, to slots, such as “A”, to achieve several behaviors, including a correct pipeline and several incorrect pipelines. For each question:

You may use each system call ID once, more than once, or not at all.
You may use zero or more system call IDs per slot. Write them in the order they should appear in the code.
You may assume that no signals are delivered to the shell process (so no system call ever returns an EINTR error).
The function should wait for both commands in the pipeline to complete before returning.
It may help to detach the last “Reference material” page of the exam.

QUESTION SH-2A. Implement a correct foreground pipeline.

`A`	`B (child1)`	`C`	`D (child2)`	`E`
7	6, 1, 2 or 6, 2, 1 or 1, 6, 2		3, 1, 2 or 3, 2, 1 or 2, 3, 1	1, 2, 8, 9 or 2, 1, 9, 8, etc.
OR
7	6, 1, 2 or 6, 2, 1 or 1, 6, 2	2	3, 1	1, 8, 9 or 1, 9, 8

QUESTION SH-2B. Implement a pipeline so that, given arguments corresponding to “echo foo | wc -c”, the wc process reads “foo” from its standard input but does not exit thereafter. For partial credit describe in words how this might happen.

Anything that doesn’t close the pipe’s write end will do it. Below we leave both ends of the pipe open in the shell. We could also enter just “3” in slot D.

`A`	`B (child1)`	`C`	`D (child2)`	`E`
7	6, 1, 2		3, 1, 2	8, 9

QUESTION SH-2C. Implement a pipeline so that, given arguments corresponding to “echo foo | wc -c”, “foo” is printed to the shell’s standard output and the wc process prints “0”. (In a correctly implemented pipeline, “wc” would print 4, which is the number of characters in “foo\n”.) For partial credit describe in words how this might happen.

Anything that doesn’t redirect the left-hand side’s standard output will do it. It is important that the read end of the pipe be properly redirected, or wc would block reading from the shell’s standard input.

`A`	`B (child1)`	`C`	`D (child2)`	`E`
7	1, 2 (anything without 6)		3, 1, 2	1, 2, 8, 9

QUESTION SH-2D. Implement a pipeline that appears to work correctly on “echo foo | wc -c”, but always blocks forever if the left-hand command outputs more than 65536 characters. For partial credit describe in words how this might happen.

This happens when we execute the two sides of the pipe in series: first the left-hand side, then the right-hand side. Since the pipe contains 64KiB of buffering, this will often appear to work.

`A`	`B (child1)`	`C`	`D (child2)`	`E`
7	6, 1, 2	8	3, 1, 2	1, 2, 9

QUESTION SH-2E. Implement a pipeline so that, given arguments corresponding to “echo foo | wc -c”, both echo and wc report a “Bad file descriptor” error. (This error, which corresponds to EBADF, is returned when a file descriptor is not valid or does not support the requested operation.) For partial credit describe in words how this might happen.

Given these system calls, the only way to make this happen is to redirect the wrong ends of the pipe into stdin/stdout.

`A`	`B (child1)`	`C`	`D (child2)`	`E`
7	4, 1, 2		5, 1, 2	1, 2, 8, 9

NET-1. Networking

QUESTION NET-1A. Which of the following system calls should a programmer expect to sometimes block (i.e., to return after significant delay)? Circle all that apply.


1. `socket`		5. `connect`
2. `read`		6. `write`
3. `accept`		7. `usleep`
4. `listen`		8. None of these

#2 read, #3 accept, #5 connect, (#6 write), #7 usleep. (write can definitely block if the reading end hasn’t caught up, but we didn’t emphasize this in class, so no points off.)

QUESTION NET-1B. Below are seven message sequence diagrams demonstrating the operation of a client–server RPC protocol. A request such as “get(X)” means “fetch the value of the object named X”; the response contains that value. Match each network property or programming strategy below with the diagram with which it best corresponds. You will use every diagram once.


1. Loss	4. Duplication	7. Exponential backoff
2. Delay	5. Batching
3. Reordering	6. Prefetching

#1—B, #2—C, #3—F, #4—D, #5—G, #6—A, #7—E (A—#6, B—#1, C—#2, D—#4, E—#7, F—#3, G—#5)

While G could also represent prefetching, A definitely does not represent batching at the RPC level—each RPC contains one request—so under the rule that each diagram is used once, we must say G is batching and A is prefetching.

Finalfig2012_1.gif	Finalfig2012_2.gif	Finalfig2012_3.gif	Finalfig2012_4.gif
A	B	C	D

Finalfig2012_5.gif	Finalfig2012_6.gif	Finalfig2012_7.gif
E	F	G

NET-2. Making Network Servers Robust

QUESTION NET-2A. You've built a network server, list the resources that you might run out of if someone launched a DoS attack on you.

At least: file descriptors, memory (stack), processes. There’re a lot of correct answers, though! You can run out of virtual memory or even physical memory.

QUESTION NET-2B. Sam suggests that you just create a separate thread to handle each incoming connection. Why isn't this necessarily going to work?

Each thread requires a stack and it's easy to run out of space for those stacks. Threads also occupy other resources—in Linux, each thread even has a PID!

QUESTION NET-2C. A server sets up a socket to listen on a connection. When a client wants to establish a connection, how does the server manage the multiple clients? In your answer indicate what system call or calls are used and what they do.

You call accept, which creates a new fd that is particular to the connection with a particular client. That is, the server uses a different fd for each client.

QUESTION NET-2D. Which of the following system calls might block?

accept
bind
connect
listen
setsockopt
select
socket

accept, connect, select

SYNCH-3. Pipes and synchronization

In the following questions, you will implement a mutex using a pipe, and a limited type of pipe using a mutex.

The definitions of the pthread mutex and condition variable operations are as follows.

int pthread_mutex_init(pthread_mutex_t* m, const pthread_mutexattr_t* attr) Create a new mutex with attributes defined by attr. (For this question, attr is ignored.)

int pthread_mutex_lock(pthread_mutex_t* m) Locks m. If the mutex is already locked, the calling thread will block until the mutex becomes available.

int pthread_mutex_unlock(pthread_mutex_t* m) Unlocks m. Calling pthread_mutex_unlock with a mutex that the calling thread does not hold will result in undefined behavior.

int pthread_cond_init(pthread_cond_t* c, const pthread_condattr_t* attr) Create a new condition variable with attributes defined by attr. (For this question, attr is ignored.)

int pthread_cond_signal(pthread_cond_t* c) Unblocks one thread waiting for c.

int pthread_cond_wait(pthread_cond_t* c, pthread_mutex_t* m) Atomically unlocks m and blocks the calling thread on the condition c. When the condition is signaled, the thread locks m and returns. Calling pthread_cond_wait with an unlocked mutex will result in undefined behavior.

The operations return 0 on success. Although errors are possible (for instance, ENOMEM if there’s not enough memory to allocate a new mutex) you may assume that they don’t occur.

QUESTION SYNCH-3A. In this question, you are to implement mutex functionality using a pipe. Fill in the definitions of pipe_mutex_init, pipe_mutex_lock, and pipe_mutex_unlock. You should be able to implement the same functionality as the pthread versions (assuming no other code accesses the pipe).

typedef struct pipe_mutex {
    int fd[2];
} pipe_mutex;

int pipe_mutex_init(pipe_mutex* m) {     
    if (pipe(&m->fd) < 0)
        return -1;

    write(m->fd[1], "X", 1);

    return 0;
}

int pipe_mutex_lock(pipe_mutex* m) {

    char buf;
    read(m->fd[0], &buf, 1);

A while loop would be in some ways even better, but you were allowed to assume no rando error returns.

int pipe_mutex_unlock(pipe_mutex* m) {

    write(m->fd[1], "X", 1);

In the next questions, you will help implement pipe functionality using an in-memory buffer and a mutex. This “mutex pipe” will only work between threads of the same process (in contrast to a regular pipe, which also works between processes). An initial implementation of mutex pipes is as follows; you will note that it contains no mutexes.

    typedef struct mutex_pipe {

1. char buf[BUFSIZ]; 2. size_t head; 3. size_t sz;

    } mutex_pipe;

    int mutex_pipe_init(mutex_pipe* p) {

6. p->head = p->sz = 0; 7. memset(&p->buf[0], 0, sizeof(p->buf)); 8. return 0;

`     // Read up to `sz` bytes from the mutex_pipe into `buf` and return the number of bytes `
    // read. If no bytes are available, wait until at least one byte can be read.
    ssize_t mutex_pipe_read(mutex_pipe* p, char* buf, size_t sz) {

10. size_t n = 0; 11. while (n < sz && (p->sz != 0 || n == 0)) { 12. size_t ncopy = p->sz; 13. if (ncopy > sizeof(p->buf) - p->head) 14. ncopy = sizeof(p->buf) - p->head; 15. if (ncopy > sz - n) 16. ncopy = sz - n; 17. memcpy(&buf[n], &p->buf[p->head], ncopy); 18. n += ncopy; 19. p->head += ncopy; 20. p->head = p->head % sizeof(p->buf); 21. p->sz -= ncopy; 22. } 23. return n;

`     // Write up to `sz` bytes from `buf` into the mutex_pipe and return the number of bytes `
    // written. If no space is available, wait until at least one byte can be written.
    ssize_t mutex_pipe_write(mutex_pipe* p, const char* buf, size_t sz) {

30. size_t n = 0; 31. while (n < sz && (p->sz != sizeof(p->buf) || n == 0)) { 32. size_t tail = p->head + p->sz; 33. tail = tail % sizeof(p->buf); 34. size_t ncopy = sizeof(p->buf) - p->sz; 35. if (ncopy > sizeof(p->buf) - tail) 36. ncopy = sizeof(p->buf) - tail; 37. if (ncopy > sz - n) 38. ncopy = sz - n; 39. memcpy(&p->buf[tail], &buf[n], ncopy); 40. n += ncopy; 41. p->sz += ncopy; 42. } 43. return n;

The last page of this exam has a copy of that code that you can remove and keep.

NOT A QUESTION. It would be wise to work through an example. For example, assume BUFSIZ == 4, and figure out how the following calls would behave.

mutex_pipe_write(p, "Hi", 2);
mutex_pipe_read(p, buf, 4);
mutex_pipe_write(p, "Test", 4);
mutex_pipe_read(p, buf, 3);

First let’s reason about this code in the absence of threads.

QUESTION SYNCH-3B. Which of the following changes could, if made in isolation, result in undefined behavior when a mutex pipe was used? Circle all that apply.

Eliminating line 6
Eliminating line 7
Eliminating lines 13–14
Eliminating lines 15–16
Eliminating line 18
Eliminating line 19

6, 13–14, and 15–16.

6: Accesses to uninitialized variables cause undefined behavior. 13–14: Could cause accesses off the end of p->buf. 15–16: Could cause accesses off the end of buf.

7: If this is the only change, no problem; the existing code never accesses bytes that were not written by mutex_pipe_write. 18: This causes mutex_pipe_read to spin forever (since n is not increased), but that’s not undefined. 19: This causes mutex_pipe_read to read the same data over and over again (since p->head never advances), but that’s not undefined.

QUESTION SYNCH-3C. Which of the following changes could, if made in isolation, cause a mutex_pipe_read to return incorrect data (that is, the byte sequence produced by read will not equal the byte sequence passed to write)? Circle all that apply.

Eliminating line 33
Eliminating lines 35–36
Eliminating lines 37–38
Eliminating line 39
Eliminating line 40
Eliminating line 41

35–36, 37–38, and 39.

35–36: Copies some of the written data past the end of the buffer. Not only does this cause undefined behavior, the data’s lost to the reader. 37–38: Copies some unwritten data (data past the end of the write buffer) into the pipe. 39: Take this away and nothing gets written to the buffer! But the size still grows so the reader thinks there’s data there. 33: This is a tricky one. Here’s what happens: No mutex_pipe_write can write data that “wraps around” the buffer. Assume BUFSIZ == 4, p->head == 3, p->sz == 1, and mutex_pipe_write(p, "X", 1) is called. Basically tail is set to 4, and lines 35–36 will set tail = 0! So then mutex_pipe_write will spin forever; but in the meantime mutex_pipe_read will not read anything incorrect. 40: n never advances, so the same portion of the data (the first portion) is written over and over again into the pipe until it fills up. This leaves the pipe in a bad state—containing data that was never written in that order—but then mutex_pipe_write spins forever, so the reader can never read it. 41: Doesn’t advance sz: the buffer always appears empty, so the reader never observes incorrect data.

It should be considered OK to select #40.

QUESTION SYNCH-3D. Which of the following changes could, if made in isolation, cause a call to mutex_pipe_write to never return (when a correct implementation would return)? Circle all that apply.

Eliminating line 33
Eliminating lines 35–36
Eliminating lines 37–38
Eliminating line 39
Eliminating line 40
Eliminating line 41

33, 35–36, 37–38, and 40.

33 and 40: covered above. 35–36 and 37–38: undefined behavior can cause anything to happen! Maybe no points off for this, though.

QUESTION SYNCH-3E. Write an invariant for p->sz. An invariant is a statement about the value of p->sz that is always true. Write your invariant in the form of an assertion; for full credit give the most specific true invariant you can. (“p->sz is an integer” is unspecific, but true; “p->sz == 4” is specific, but false.)

assert(  p->sz >= 0 && p->sz <= BUFSIZ  );

But in fact p->sz is a size_t, so p->sz >= 0 is a tautology and assert(p->sz <= BUFSIZ) works too.

QUESTION SYNCH-3F. Write an invariant for p->head. For full credit give the most specific true invariant you can.

2 points.

assert(  p->head >= 0 && p->head < BUFSIZ  );

Again, assert(p->head < BUFSIZ) is equivalent.

In the remaining questions, you will add synchronization objects and operations to make your mutex pipe work in a multithreaded program. Here is your starting point:

    typedef struct mutex_pipe {

1. char buf[BUFSIZ]; 2. size_t head; 3. size_t sz; 4. pthread_mutex_t m;

    } mutex_pipe;

    int mutex_pipe_init(mutex_pipe* p) {

5. pthread_mutex_init(&p->m, NULL); 6. p->head = p->sz = 0; 7. memset(&p->buf[0], 0, sizeof(p->buf)); 8. return 0;

    (the rest of the code as in the prior questions)

QUESTION SYNCH-3G. Add calls to “lock” (pthread_mutex_lock) and “unlock” (pthread_mutex_unlock) that protect the mutex pipe from race condition bugs. Write one or more snippets of C code and give line numbers after which the snippets should appear. For full credit, your solution must not deadlock—if one thread is reading from a pipe and another thread is writing to the pipe, then both threads must eventually make progress.

Add pthread_mutex_lock(&p->m); after lines: 10, 30
Add pthread_mutex_unlock(&p->m); after lines: 22, 42
Other changes (if any):

After #17 & #39:

if (n == 0) {
    pthread_mutex_unlock(&p->m);   // or just "unlock"
    sched_yield();   // optional
    pthread_mutex_lock(&p->m);     // or just "lock"
}

Some people might put a sched_yield() in: nice!

QUESTION SYNCH-3H. Your solution to the last question has poor utilization. For instance, a thread that calls mutex_pipe_read on an empty mutex pipe will spin forever, rather than block. Introduce one or more condition variables so that mutex_pipe_read will block until data is available. Write one or more snippets of C code and give line numbers after which the snippets should appear.

Add to struct mutex_pipe: pthread_cond_t c;
Add to mutex_pipe_init after line 7: pthread_cond_init(&c, NULL);
Other changes:

After #17, instead of the code above:

if (n == 0)
    pthread_cond_wait(&p->c, &p->m);

After #39:

if (n != 0)
    pthread_cond_signal(&p->c);

(This can go anywhere after n is calculated.)

KERN-4. Teensy OS VM System

The folks at Teensy Computers, Inc, need your help with their VM system. The hardware team that developed the VM system abruptly left and the folks remaining aren't quite sure how VM works. I volunteered you to help them.

The Teensy machine has a 16-bit virtual address space with 4 KB pages. The Teensy hardware specifies a single-level page table. Each entry in the page table is 16-bits. Eight of those bits are reserved for the physical page number and 8 of the bits are reserved for flag values. Sadly, the hardware designers did not document what the bits do!

QUESTION KERN-4A. How many pages are in the Teensy virtual address space?

16 (2⁴)

QUESTION KERN-4B. How many bits comprise a physical address?

20 (8 bits of physical page number + 12 bits of page offset)

QUESTION KERN-4C. Is the physical address space larger or smaller than the virtual address space?

LARGER (20 bits versus 16 bits)

QUESTION KERN-4D. Write, in hex, a PAGE_OFFSET_MASK (the value that when anded with an address returns the offset of the address on a page).

0xFFF

QUESTION KERN-4E. Write a C expression that takes a virtual address, in the variable vaddr, and returns the virtual page number.

(vaddr \>\> 12)` OR `((vaddr) & 0xF000 \>\> 12)

You are now going to work with the Teensy engineers to figure out what those other bits in the page table entries mean! Fortunately, they have some engineering notes from the hardware team—they need your help in making sense of them. Each letter below has the contents of a note, state what you can conclude from that note about the lower 8 bits of the page table entries.

QUESTION KERN-4F. "Robin, I ran 8 tests using a kernel that did nothing other than loop infinitely -- for each test I set a different bit in all the PTEs of the page table. All of them ended up in the exception handler except for the one where I set bit 4. Any idea what this means?"

bit 4 is the "present or valid bit"

QUESTION KERN-4G. "Lynn, I'm writing a memory test that iterates over all of memory making sure that I can read back the same pattern I write into memory. If I don't set bit 7 of the page table entries to 1, I get permission faults. Do you know what might be happening?"

bit 7 is the "write" bit

QUESTION KERN-4H. "Pat, I almost have user level processes running! It seems that the user processes take permission faults unless I have both bit 4 and bit 3 set. Do you know why?"

bit 3 is the "user/unprivileged bit"

MISC-1. Debugging

In the following short-answer questions, you have access to five debugging tools: top, strace, gdb, valgrind, and man. You can’t change program source code or use other tools. Answer the questions briefly (a couple sentences at most).

QUESTION MISC-1A. You are given a program that appears to “get stuck” when run. How would you distinguish whether the program blocked forever (e.g., made a system call that never returned) or entered an infinite loop?

You can use top: does it report the process is using 100% of the CPU?

You can use strace: is the last thing in the strace an incomplete system call?

QUESTION MISC-1B. You are given a program that uses a lot of memory. How would you tell whether the program leaks memory?

Use valgrind and check if it reports any memory leaks.

QUESTION MISC-1C. You are given a program that produces weird answers. How would you check if it invoked undefined behavior?

Use valgrind and check if it reports undefined behavior. GDB is also acceptable here.

QUESTION MISC-1D. You are given a program that blocks forever. How would you tell where the program blocked (which function called the blocking system call)?

Run it under gdb. When it blocks, hit Ctrl-C and then enter backtrace/bt to get a backtrace.

QUESTION MISC-1E. You are given a program that takes a long time to produce a result. How would you tell whether the program was using system calls unintelligently?

Run it under strace and look for stupidity, such as many system calls that report errors, many system calls that are redundant, lots of reads that return short counts, etc.

QUESTION MISC-1F. You are given a program that exits with a system call error, but doesn’t explain what happened in detail. How would you find what error condition occurred and understand the conditions that could cause that error?

Run it under strace to find the error condition: look for a system call that returned the error. Then use man on that system call and read about the error (the errno description).

MISC-2. More Miscellany

QUESTION MISC-2A. True or false in conventional Unix systems?

File descriptors are often used to communicate among processes on the same machine. True
File descriptors are often used to communicate among processes on different machines. True
File descriptors are often used to communicate with persistent storage. True
File descriptors are often used to access primary memory. False
File descriptors are often used to create child processes. False

QUESTION MISC-2B. Match the process isolation feature on the left with the hardware feature that helps enforce it on the right. Use each hardware feature once (make the best match you can).

Protected control transfer (processes can transfer control to the kernel only at defined entry points)
Memory protection (one process cannot modify another process’s memory)
Interrupt protection (only the kernel can disable interrupts)
CPU protection (the kernel always regains control of the CPU eventually)

Traps
Privileged mode (dangerous instructions fault unless the CPU is in privileged mode)
Timer interrupts
Page tables

1—a, 2—d, 3—b, 4—c

The remaining questions refer to the following lines of code.

1. close(fd); 2. connect(fd, sockaddr, socklen); 3. listen(fd); 4. mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0); 5. read(fd, buf, 4096); 6. write(fd, buf, 4096);

QUESTION MISC-2C. If a program executes the following line without error, which lines could be executed next without error? List all numbers that apply.

fd = open("/home/cs61user/cs61-psets/pset6/pong61.c", O_RDWR);

1, 4, 5, 6

QUESTION MISC-2D. If a program executes the following line without error, which lines could be executed next without error? List all numbers that apply.

fd = socket(AF_INET, SOCK_STREAM, 0);

1, 2, 3

QUESTION MISC-2E. If a program executes the following lines without error, which lines could be executed next without error? List all numbers that apply.

pipe(pipefd); fd = pipefd[0];

1, 5