SH-2. Process management
Here’s the skeleton of a shell function implementing a simple
two-command pipeline, such as “cmd1 | cmd2
”.
void simple_pipe(const char* cmd1, char* const* argv1, const char* cmd2, char* const* argv2) {
int pipefd[2], r, status;
[A]
pid_t child1 = fork();
if (child1 == 0) {
[B]
execvp(cmd1, argv1);
}
assert(child1 > 0);
[C]
pid_t child2 = fork();
if (child2 == 0) {
[D]
execvp(cmd2, argv2);
}
assert(child2 > 0);
[E]
}
And here is a grab bag of system calls.
[1] close(pipefd[0]);
[2] close(pipefd[1]);
[3] dup2(pipefd[0], STDIN_FILENO);
[4] dup2(pipefd[0], STDOUT_FILENO);
[5] dup2(pipefd[1], STDIN_FILENO);
[6] dup2(pipefd[1], STDOUT_FILENO);
[7] pipe(pipefd);
[8] r = waitpid(child1, &status, 0);
[9] r = waitpid(child2, &status, 0);
Your task is to assign system call IDs, such as “1
”, to slots, such as
“A
”, to achieve several behaviors, including a correct pipeline and
several incorrect pipelines. For each question:
- You may use each system call ID once, more than once, or not at all.
- You may use zero or more system call IDs per slot. Write them in the order they should appear in the code.
- You may assume that no signals are delivered to the shell process (so
no system call ever returns an
EINTR
error). - The function should wait for both commands in the pipeline to complete before returning.
- It may help to detach the last “Reference material” page of the exam.
QUESTION SH-2A. Implement a correct foreground pipeline.
|
|
|
|
|
---|---|---|---|---|
7 |
6, 1,
2 |
3, 1,
2 |
1, 2, 8, 9 |
|
OR |
||||
7 |
6, 1,
2 |
2 |
3, 1 |
1, 8, 9 |
QUESTION SH-2B. Implement a pipeline so that, given arguments
corresponding to “echo foo | wc -c
”, the wc
process reads “foo
”
from its standard input but does not exit thereafter. For partial
credit describe in words how this might happen.
Anything that doesn’t close the pipe’s write end will do it. Below we leave both ends of the pipe open in the shell. We could also enter just “3” in slot D.
A |
B (child1) |
C |
D (child2) |
E |
---|---|---|---|---|
7 | 6, 1, 2 | 3, 1, 2 | 8, 9 |
QUESTION SH-2C. Implement a pipeline so that, given arguments
corresponding to “echo foo | wc -c
”, “foo
” is printed to the
shell’s standard output and the wc
process prints “0
”. (In a
correctly implemented pipeline, “wc
” would print 4
, which is the
number of characters in “foo\n
”.) For partial credit describe in
words how this might happen.
Anything that doesn’t redirect the left-hand side’s standard output will
do it. It is important that the read end of the pipe be properly
redirected, or wc
would block reading from the shell’s standard
input.
|
|
|
|
|
---|---|---|---|---|
7 |
1, 2 |
3, 1, 2 |
1, 2, 8, 9 |
QUESTION SH-2D. Implement a pipeline that appears to work correctly
on “echo foo | wc -c
”, but always blocks forever if the left-hand
command outputs more than 65536 characters. For partial credit
describe in words how this might happen.
This happens when we execute the two sides of the pipe in series: first the left-hand side, then the right-hand side. Since the pipe contains 64KiB of buffering, this will often appear to work.
A |
B (child1) |
C |
D (child2) |
E |
---|---|---|---|---|
7 | 6, 1, 2 | 8 | 3, 1, 2 | 1, 2, 9 |
QUESTION SH-2E. Implement a pipeline so that, given arguments
corresponding to “echo foo | wc -c
”, both echo
and wc
report a
“Bad file descriptor” error. (This error, which corresponds to EBADF
,
is returned when a file descriptor is not valid or does not support the
requested operation.) For partial credit describe in words how this
might happen.
Given these system calls, the only way to make this happen is to redirect the wrong ends of the pipe into stdin/stdout.
A |
B (child1) |
C |
D (child2) |
E |
---|---|---|---|---|
7 | 4, 1, 2 | 5, 1, 2 | 1, 2, 8, 9 |
NET-1. Networking
QUESTION NET-1A. Which of the following system calls should a programmer expect to sometimes block (i.e., to return after significant delay)? Circle all that apply.
1. socket |
5. connect |
|
2. read |
6. write |
|
3. accept |
7. usleep |
|
4. listen |
8. None of these |
#2 read
, #3 accept
, #5 connect
, (#6 write
), #7 usleep
.
(write
can definitely block if the reading end hasn’t caught up, but
we didn’t emphasize this in class, so no points off.)
QUESTION NET-1B. Below are seven message sequence diagrams demonstrating the operation of a client–server RPC protocol. A request such as “get(X)” means “fetch the value of the object named X”; the response contains that value. Match each network property or programming strategy below with the diagram with which it best corresponds. You will use every diagram once.
1. Loss | 4. Duplication | 7. Exponential backoff | ||
2. Delay | 5. Batching | |||
3. Reordering | 6. Prefetching |
#1—B, #2—C, #3—F, #4—D, #5—G, #6—A, #7—E (A—#6, B—#1, C—#2, D—#4, E—#7, F—#3, G—#5)
While G could also represent prefetching, A definitely does not represent batching at the RPC level—each RPC contains one request—so under the rule that each diagram is used once, we must say G is batching and A is prefetching.
A |
B |
C |
D |
E |
F |
G |
NET-2. Making Network Servers Robust
QUESTION NET-2A. You've built a network server, list the resources that you might run out of if someone launched a DoS attack on you.
At least: file descriptors, memory (stack), processes. There’re a lot of correct answers, though! You can run out of virtual memory or even physical memory.
QUESTION NET-2B. Sam suggests that you just create a separate thread to handle each incoming connection. Why isn't this necessarily going to work?
Each thread requires a stack and it's easy to run out of space for those stacks. Threads also occupy other resources—in Linux, each thread even has a PID!
QUESTION NET-2C. A server sets up a socket to listen on a connection. When a client wants to establish a connection, how does the server manage the multiple clients? In your answer indicate what system call or calls are used and what they do.
You call accept, which creates a new fd that is particular to the connection with a particular client. That is, the server uses a different fd for each client.
QUESTION NET-2D. Which of the following system calls might block?
- accept
- bind
- connect
- listen
- setsockopt
- select
- socket
accept, connect, select
SYNCH-3. Pipes and synchronization
In the following questions, you will implement a mutex using a pipe, and a limited type of pipe using a mutex.
The definitions of the pthread mutex and condition variable operations are as follows.
int pthread_mutex_init(pthread_mutex_t* m, const pthread_mutexattr_t* attr)
Create a new mutex with attributes defined by attr
. (For this
question, attr
is ignored.)
int pthread_mutex_lock(pthread_mutex_t* m)
Locks m
. If the mutex is already locked, the calling thread will block
until the mutex becomes available.
int pthread_mutex_unlock(pthread_mutex_t* m)
Unlocks m
. Calling pthread_mutex_unlock
with a mutex that the
calling thread does not hold will result in undefined behavior.
int pthread_cond_init(pthread_cond_t* c, const pthread_condattr_t* attr)
Create a new condition variable with attributes defined by attr
. (For
this question, attr
is ignored.)
int pthread_cond_signal(pthread_cond_t* c)
Unblocks one thread waiting for c
.
int pthread_cond_wait(pthread_cond_t* c, pthread_mutex_t* m)
Atomically unlocks m
and blocks the calling thread on the condition
c
. When the condition is signaled, the thread locks m
and returns.
Calling pthread_cond_wait
with an unlocked mutex will result in
undefined behavior.
The operations return 0 on success. Although errors are possible (for
instance, ENOMEM
if there’s not enough memory to allocate a new mutex)
you may assume that they don’t occur.
QUESTION SYNCH-3A. In this question, you are to implement mutex
functionality using a pipe. Fill in the definitions of
pipe_mutex_init
, pipe_mutex_lock
, and pipe_mutex_unlock
. You
should be able to implement the same functionality as the pthread
versions (assuming no other code accesses the pipe).
typedef struct pipe_mutex {
int fd[2];
} pipe_mutex;
int pipe_mutex_init(pipe_mutex* m) {
if (pipe(&m->fd) < 0)
return -1;
write(m->fd[1], "X", 1);
return 0;
}
int pipe_mutex_lock(pipe_mutex* m) {
char buf;
read(m->fd[0], &buf, 1);
A while loop would be in some ways even better, but you were allowed to assume no rando error returns.
}
int pipe_mutex_unlock(pipe_mutex* m) {
write(m->fd[1], "X", 1);
}
In the next questions, you will help implement pipe functionality using an in-memory buffer and a mutex. This “mutex pipe” will only work between threads of the same process (in contrast to a regular pipe, which also works between processes). An initial implementation of mutex pipes is as follows; you will note that it contains no mutexes.
typedef struct mutex_pipe {
1.
char buf[BUFSIZ];
2.
size_t head;
3.
size_t sz;
} mutex_pipe;
int mutex_pipe_init(mutex_pipe* p) {
6.
p->head = p->sz = 0;
7.
memset(&p->buf[0], 0, sizeof(p->buf));
8.
return 0;
}
` // Read up to `sz` bytes from the mutex_pipe into `buf` and return the number of bytes `
// read. If no bytes are available, wait until at least one byte can be read.
ssize_t mutex_pipe_read(mutex_pipe* p, char* buf, size_t sz) {
10.
size_t n = 0;
11.
while (n < sz && (p->sz != 0 || n == 0)) {
12.
size_t ncopy = p->sz;
13.
if (ncopy > sizeof(p->buf) - p->head)
14.
ncopy = sizeof(p->buf) - p->head;
15.
if (ncopy > sz - n)
16.
ncopy = sz - n;
17.
memcpy(&buf[n], &p->buf[p->head], ncopy);
18.
n += ncopy;
19.
p->head += ncopy;
20.
p->head = p->head % sizeof(p->buf);
21.
p->sz -= ncopy;
22.
}
23.
return n;
}
` // Write up to `sz` bytes from `buf` into the mutex_pipe and return the number of bytes `
// written. If no space is available, wait until at least one byte can be written.
ssize_t mutex_pipe_write(mutex_pipe* p, const char* buf, size_t sz) {
30.
size_t n = 0;
31.
while (n < sz && (p->sz != sizeof(p->buf) || n == 0)) {
32.
size_t tail = p->head + p->sz;
33.
tail = tail % sizeof(p->buf);
34.
size_t ncopy = sizeof(p->buf) - p->sz;
35.
if (ncopy > sizeof(p->buf) - tail)
36.
ncopy = sizeof(p->buf) - tail;
37.
if (ncopy > sz - n)
38.
ncopy = sz - n;
39.
memcpy(&p->buf[tail], &buf[n], ncopy);
40.
n += ncopy;
41.
p->sz += ncopy;
42.
}
43.
return n;
}
The last page of this exam has a copy of that code that you can remove and keep.
NOT A QUESTION.
It would be wise to work through an example. For example, assume
BUFSIZ == 4
, and figure out how the following calls would behave.
mutex_pipe_write(p, "Hi", 2);
mutex_pipe_read(p, buf, 4);
mutex_pipe_write(p, "Test", 4);
mutex_pipe_read(p, buf, 3);
First let’s reason about this code in the absence of threads.
QUESTION SYNCH-3B. Which of the following changes could, if made in isolation, result in undefined behavior when a mutex pipe was used? Circle all that apply.
- Eliminating line 6
- Eliminating line 7
- Eliminating lines 13–14
- Eliminating lines 15–16
- Eliminating line 18
- Eliminating line 19
6, 13–14, and 15–16.
6: Accesses to uninitialized variables cause undefined behavior.
13–14: Could cause accesses off the end of p->buf
.
15–16: Could cause accesses off the end of buf
.
7: If this is the only change, no problem; the existing code never
accesses bytes that were not written by mutex_pipe_write
.
18: This causes mutex_pipe_read
to spin forever (since n
is not
increased), but that’s not undefined.
19: This causes mutex_pipe_read
to read the same data over and over
again (since p->head
never advances), but that’s not undefined.
QUESTION SYNCH-3C. Which of the following changes could, if made in
isolation, cause a mutex_pipe_read
to return incorrect data (that is,
the byte sequence produced by read
will not equal the byte sequence
passed to write
)? Circle all that apply.
- Eliminating line 33
- Eliminating lines 35–36
- Eliminating lines 37–38
- Eliminating line 39
- Eliminating line 40
- Eliminating line 41
35–36, 37–38, and 39.
35–36: Copies some of the written data past the end of the buffer. Not
only does this cause undefined behavior, the data’s lost to the
reader.
37–38: Copies some unwritten data (data past the end of the write
buffer) into the pipe.
39: Take this away and nothing gets written to the buffer! But the size
still grows so the reader thinks there’s data there.
33: This is a tricky one. Here’s what happens: No mutex_pipe_write
can
write data that “wraps around” the buffer. Assume BUFSIZ == 4
,
p->head == 3
, p->sz == 1
, and mutex_pipe_write(p, "X", 1)
is
called. Basically tail
is set to 4
, and lines 35–36 will set
tail = 0
! So then mutex_pipe_write
will spin forever; but in the
meantime mutex_pipe_read
will not read anything incorrect.
40: n
never advances, so the same portion of the data (the first
portion) is written over and over again into the pipe until it fills up.
This leaves the pipe in a bad state—containing data that was never
written in that order—but then mutex_pipe_write
spins forever, so
the reader can never read it.
41: Doesn’t advance sz
: the buffer always appears empty, so the reader
never observes incorrect data.
It should be considered OK to select #40.
QUESTION SYNCH-3D. Which of the following changes could, if made in
isolation, cause a call to mutex_pipe_write
to never return (when a
correct implementation would return)? Circle all that apply.
- Eliminating line 33
- Eliminating lines 35–36
- Eliminating lines 37–38
- Eliminating line 39
- Eliminating line 40
- Eliminating line 41
33, 35–36, 37–38, and 40.
33 and 40: covered above. 35–36 and 37–38: undefined behavior can cause anything to happen! Maybe no points off for this, though.
QUESTION SYNCH-3E. Write an invariant for p->sz
. An invariant is a
statement about the value of p->sz
that is always true. Write your
invariant in the form of an assertion; for full credit give the most
specific true invariant you can. (“p->sz
is an integer” is unspecific,
but true; “p->sz == 4
” is specific, but false.)
assert( p->sz >= 0 && p->sz <= BUFSIZ );
But in fact p->sz
is a size_t
, so p->sz >= 0
is a tautology and
assert(p->sz <= BUFSIZ)
works too.
QUESTION SYNCH-3F. Write an invariant for p->head
. For full credit
give the most specific true invariant you can.
2 points.
assert( p->head >= 0 && p->head < BUFSIZ );
Again, assert(p->head < BUFSIZ)
is equivalent.
In the remaining questions, you will add synchronization objects and operations to make your mutex pipe work in a multithreaded program. Here is your starting point:
typedef struct mutex_pipe {
1.
char buf[BUFSIZ];
2.
size_t head;
3.
size_t sz;
4.
pthread_mutex_t m;
} mutex_pipe;
int mutex_pipe_init(mutex_pipe* p) {
5.
pthread_mutex_init(&p->m, NULL);
6.
p->head = p->sz = 0;
7.
memset(&p->buf[0], 0, sizeof(p->buf));
8.
return 0;
}
(the rest of the code as in the prior questions)
QUESTION SYNCH-3G. Add calls to “lock
” (pthread_mutex_lock
) and
“unlock
” (pthread_mutex_unlock
) that protect the mutex pipe from
race condition bugs. Write one or more snippets of C code and give line
numbers after which the snippets should appear. For full credit, your
solution must not deadlock—if one thread is reading from a pipe and
another thread is writing to the pipe, then both threads must eventually
make progress.
- Add
pthread_mutex_lock(&p->m);
after lines: 10, 30 - Add
pthread_mutex_unlock(&p->m);
after lines: 22, 42 - Other changes (if any):
After #17 & #39:
if (n == 0) {
pthread_mutex_unlock(&p->m); // or just "unlock"
sched_yield(); // optional
pthread_mutex_lock(&p->m); // or just "lock"
}
Some people might put a sched_yield()
in: nice!
QUESTION SYNCH-3H. Your solution to the last question has poor
utilization. For instance, a thread that calls mutex_pipe_read
on an
empty mutex pipe will spin forever, rather than block. Introduce one or
more condition variables so that mutex_pipe_read
will block until data
is available. Write one or more snippets of C code and give line numbers
after which the snippets should appear.
- Add to
struct mutex_pipe
:pthread_cond_t c;
- Add to
mutex_pipe_init
after line 7:pthread_cond_init(&c, NULL);
- Other changes:
After #17, instead of the code above:
if (n == 0)
pthread_cond_wait(&p->c, &p->m);
After #39:
if (n != 0)
pthread_cond_signal(&p->c);
(This can go anywhere after n
is calculated.)
KERN-4. Teensy OS VM System
The folks at Teensy Computers, Inc, need your help with their VM system. The hardware team that developed the VM system abruptly left and the folks remaining aren't quite sure how VM works. I volunteered you to help them.
The Teensy machine has a 16-bit virtual address space with 4 KB pages. The Teensy hardware specifies a single-level page table. Each entry in the page table is 16-bits. Eight of those bits are reserved for the physical page number and 8 of the bits are reserved for flag values. Sadly, the hardware designers did not document what the bits do!
QUESTION KERN-4A. How many pages are in the Teensy virtual address space?
16 (24)
QUESTION KERN-4B. How many bits comprise a physical address?
20 (8 bits of physical page number + 12 bits of page offset)
QUESTION KERN-4C. Is the physical address space larger or smaller than the virtual address space?
LARGER (20 bits versus 16 bits)
QUESTION KERN-4D. Write, in hex, a PAGE_OFFSET_MASK
(the value
that when anded with an address returns the offset of the address on a
page).
0xFFF
QUESTION KERN-4E. Write a C expression that takes a virtual address,
in the variable vaddr
, and returns the virtual page number.
(vaddr \>\> 12)` OR `((vaddr) & 0xF000 \>\> 12)
You are now going to work with the Teensy engineers to figure out what those other bits in the page table entries mean! Fortunately, they have some engineering notes from the hardware team—they need your help in making sense of them. Each letter below has the contents of a note, state what you can conclude from that note about the lower 8 bits of the page table entries.
QUESTION KERN-4F. "Robin, I ran 8 tests using a kernel that did nothing other than loop infinitely -- for each test I set a different bit in all the PTEs of the page table. All of them ended up in the exception handler except for the one where I set bit 4. Any idea what this means?"
bit 4 is the "present or valid bit"
QUESTION KERN-4G. "Lynn, I'm writing a memory test that iterates over all of memory making sure that I can read back the same pattern I write into memory. If I don't set bit 7 of the page table entries to 1, I get permission faults. Do you know what might be happening?"
bit 7 is the "write" bit
QUESTION KERN-4H. "Pat, I almost have user level processes running! It seems that the user processes take permission faults unless I have both bit 4 and bit 3 set. Do you know why?"
bit 3 is the "user/unprivileged bit"
MISC-1. Debugging
In the following short-answer questions, you have access to five
debugging tools: top
, strace
, gdb
, valgrind
, and man
. You
can’t change program source code or use other tools. Answer the
questions briefly (a couple sentences at most).
QUESTION MISC-1A. You are given a program that appears to “get stuck” when run. How would you distinguish whether the program blocked forever (e.g., made a system call that never returned) or entered an infinite loop?
You can use top
: does it report the process is using 100% of the CPU?
You can use strace
: is the last thing in the strace an incomplete
system call?
QUESTION MISC-1B. You are given a program that uses a lot of memory. How would you tell whether the program leaks memory?
Use valgrind
and check if it reports any memory leaks.
QUESTION MISC-1C. You are given a program that produces weird answers. How would you check if it invoked undefined behavior?
Use valgrind
and check if it reports undefined behavior. GDB is also
acceptable here.
QUESTION MISC-1D. You are given a program that blocks forever. How would you tell where the program blocked (which function called the blocking system call)?
Run it under gdb
. When it blocks, hit Ctrl-C and then enter
backtrace
/bt
to get a backtrace.
QUESTION MISC-1E. You are given a program that takes a long time to produce a result. How would you tell whether the program was using system calls unintelligently?
Run it under strace
and look for stupidity, such as many system calls
that report errors, many system calls that are redundant, lots of
read
s that return short counts, etc.
QUESTION MISC-1F. You are given a program that exits with a system call error, but doesn’t explain what happened in detail. How would you find what error condition occurred and understand the conditions that could cause that error?
Run it under strace
to find the error condition: look for a system
call that returned the error. Then use man
on that system call and
read about the error (the errno
description).
MISC-2. More Miscellany
QUESTION MISC-2A. True or false in conventional Unix systems?
- File descriptors are often used to communicate among processes on the same machine. True
- File descriptors are often used to communicate among processes on different machines. True
- File descriptors are often used to communicate with persistent storage. True
- File descriptors are often used to access primary memory. False
- File descriptors are often used to create child processes. False
QUESTION MISC-2B. Match the process isolation feature on the left with the hardware feature that helps enforce it on the right. Use each hardware feature once (make the best match you can).
|
|
1—a, 2—d, 3—b, 4—c
The remaining questions refer to the following lines of code.
1.
close(fd);
2.
connect(fd, sockaddr, socklen);
3.
listen(fd);
4.
mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
5.
read(fd, buf, 4096);
6.
write(fd, buf, 4096);
QUESTION MISC-2C. If a program executes the following line without error, which lines could be executed next without error? List all numbers that apply.
fd = open("/home/cs61user/cs61-psets/pset6/pong61.c", O_RDWR);
1, 4, 5, 6
QUESTION MISC-2D. If a program executes the following line without error, which lines could be executed next without error? List all numbers that apply.
fd = socket(AF_INET, SOCK_STREAM, 0);
1, 2, 3
QUESTION MISC-2E. If a program executes the following lines without error, which lines could be executed next without error? List all numbers that apply.
pipe(pipefd); fd = pipefd[0];
1, 5