Process control exercises

Exercises not as directly relevant to this year’s class are marked with ⚠️.

See also question KERN-9.

PROC-1. Rendezvous and pipes

This question builds versions of the existing system calls based on new abstractions. Here are three system calls that define a new abstraction called a rendezvous.

int newrendezvous()
Returns a rendezvous ID that hasn’t been used yet.

int rendezvous(int rid, int data)
Blocks the calling process P1 until some other process P2 calls rendezvous() with the same rid (rendezvous ID). Then, both of the system calls return, but P1’s system call returns P2’s data and vice versa. Thus, the two processes swap their data. Rendezvous acts pairwise; if three processes call rendezvous, then two of them will swap values and the third will block, waiting for a fourth.

void freezerendezvous(int rid, int freezedata)
Freezes the rendezvous rid. All future calls to rendezvous(rid, data) will immediately return freezedata.

Here’s an example. The two columns represent two processes. Assume they are the only processes using rendezvous ID 0.

int result = rendezvous(0, 5); printf("About to rendezvous\n");
int result = rendezvous(0, 600);
/* The processes swap data; both become runnable */
printf("Process A got %d\n", result); printf("Process B got %d\n", result);

This code will print

About to rendezvous
Process B got 5
Process A got 600

(the last 2 lines might appear in either order).

QUESTION PROC-1A. How might you implement pipes in terms of rendezvous? Try to figure out analogues for the pipe(), close(), read(), and write() system calls (perhaps with different signatures), but only worry about reading and writing 1 character at a time.

QUESTION PROC-1B. Can a rendezvous-pipe support all pipe features?

PROC-2. Process management

Here’s the skeleton of a shell function implementing a simple two-command pipeline, such as “cmd1 | cmd2”.

void simple_pipe(const char* cmd1, char* const* argv1, const char* cmd2, char* const* argv2) {
    int pipefd[2], r, status;

    [A]

    pid_t child1 = fork();
    if (child1 == 0) {
        [B]
        execvp(cmd1, argv1);
    }
    assert(child1 > 0);

    [C]

    pid_t child2 = fork();
    if (child2 == 0) {
        [D]
        execvp(cmd2, argv2);
    }
    assert(child2 > 0);

    [E]
}

And here is a grab bag of system calls.

[1] close(pipefd[0]);
[2] close(pipefd[1]);
[3] dup2(pipefd[0], STDIN_FILENO);
[4] dup2(pipefd[0], STDOUT_FILENO);
[5] dup2(pipefd[1], STDIN_FILENO);
[6] dup2(pipefd[1], STDOUT_FILENO);
[7] pipe(pipefd);
[8] r = waitpid(child1, &status, 0);
[9] r = waitpid(child2, &status, 0);

Your task is to assign system call IDs, such as “1”, to slots, such as “A”, to achieve several behaviors, including a correct pipeline and several incorrect pipelines. For each question:

QUESTION PROC-2A. Implement a correct foreground pipeline.

[A] [B] (child1) [C] [D] (child2) [E]
 

QUESTION PROC-2B. Implement a pipeline so that, given arguments corresponding to “echo foo | wc -c”, the wc process reads “foo” from its standard input but does not exit thereafter. For partial credit describe in words how this might happen.

[A] [B] (child1) [C] [D] (child2) [E]
 

QUESTION PROC-2C. Implement a pipeline so that, given arguments corresponding to “echo foo | wc -c”, “foo” is printed to the shell’s standard output and the wc process prints “0”. (In a correctly implemented pipeline, “wc” would print 4, which is the number of characters in “foo\n”.) For partial credit describe in words how this might happen.

[A] [B] (child1) [C] [D] (child2) [E]
 

QUESTION PROC-2D. Implement a pipeline that appears to work correctly on “echo foo | wc -c”, but always blocks forever if the left-hand command outputs more than 65536 characters. For partial credit describe in words how this might happen.

[A] [B] (child1) [C] [D] (child2) [E]
 

QUESTION PROC-2E. Implement a pipeline so that, given arguments corresponding to “echo foo | wc -c”, both echo and wc report a “Bad file descriptor” error. (This error, which corresponds to EBADF, is returned when a file descriptor is not valid or does not support the requested operation.) For partial credit describe in words how this might happen.

[A] [B] (child1) [C] [D] (child2) [E]
 

PROC-3. Processes

Consider the two programs shown below.

// Program 1
#include <cstdio>
#include <unistd.h>
int main() {
    printf("PID %d running prog1\n", getpid());
}
// Program 2
#include <cstdio>
#include <unistd.h>
int main() {
    char* argv[2];
    argv[0] = (char*) "prog1";
    argv[1] = nullptr;
    printf("PID %d running prog2\n", getpid());
    int r = execv("./prog1", argv);
    printf("PID %d exiting from prog2\n", getpid());
}

QUESTION PROC-3A. How many different PIDs will print out if you run Program 2?

QUESTION PROC-3B. How many lines of output will you see?

Now, let's assume that we change Program 2 to the following:

// Program 2B
#include <cstdio>
#include <unistd.h>
int main() {
    char* argv[2];
    argv[0] = (char*) "prog1";
    argv[1] = nullptr;

    printf("PID %d running prog2\n", getpid());
    pid_t p = fork();
    if (p == 0) {
        int r = execv("./prog1", argv);
    } else {
        printf("PID %d exiting from prog2\n", getpid());
    }
}

QUESTION PROC-3C. How many different PIDs will print out if you run Program 2B?

QUESTION PROC-3D. How many lines of output will you see?

Finally, consider this version of Program 2.

// Program 2C
#include <cstdio>
#include <unistd.h>
int main() {
    char* argv[2];
    argv[0] = (char*) "prog1";
    argv[1] = nullptr;

    printf("PID %d running prog2\n", getpid());
    pid_t p = fork();
    pid_t q = fork();
    if (p == 0 || q == 0) {
        int r = execv("./prog1", argv);
    } else {
        printf("PID %d exiting from prog2\n", getpid());
    }
}

QUESTION PROC-3E. How many different PIDs will print out if you run Program 2C?

QUESTION PROC-3F. How many lines of output will you see?

PROC-4. Be a CS61 TF!

You are a CS61 teaching fellow. A student working on A4 is having difficulty getting pipes working. S/he comes to you for assistance. The function below is intended to traverse a linked list of commands, fork/exec the indicated processes, and hook up the pipes between commands correctly. The student has commented it reasonably, but is quite confused about how to finish writing the code. Can you help? Figure out what code to add at points A, B, and C.

#include "sh61.hh"

struct command {
    command *next; // Next in sequence of commands
    int argc;      // number of arguments
    int ispipe;    // pipe symbol follows this command
    char** argv;   // arguments, terminated by NULL
    pid_t pid;     // pid running this command
};

void do_pipes(command* c) {
    pid_t newpid;
    int havepipe = 0;   // We had a pipe on the previous command
    int lastpipe[2] = {-1, -1};
    int curpipe[2];

    do {
        if (c->ispipe) {
            int r = pipe(curpipe);
            assert(r == 0);
        }

        newpid = fork();
        assert(newpid >= 0);
        if (newpid == 0) {
            if (havepipe) {
                // There was a pipe on the last command; It's stored
                // in lastpipe; I need to hook it up to this process???
                // **** PART A ****
            }
            if (c->ispipe) {
                // The current command is a pipe -- how do I hook it up???
                // **** PART B ****
            }

            execvp(c->argv[0], c->argv);

            fprintf(stderr, "Exec failed\n");
            _exit(1);
        }

        // I bet there is some cleanup I have to do here!?
        // **** PART C ****

        // Set up for the next command
        havepipe = c->ispipe;
        if (c->ispipe) {
            lastpipe[0] = curpipe[0];
            lastpipe[1] = curpipe[1];
        }
        c->pid = newpid;
        c = c->next;
    } while (newpid != -1 && havepipe);
}

QUESTION PROC-4A. What should go in the Part A space above, if anything?

QUESTION PROC-4B. What should go in the Part B space above, if anything?

QUESTION PROC-4C. What should go in the Part C space above, if anything?

PROC-5. Spork

Patty Posix has an idea for a new system call, spork. Her system call combines fork, file descriptor manipulations, and execvp. It’s pretty cool:

struct spork_file_action_t {
    int type; // equals SPORK_OPEN, SPORK_CLOSE, or SPORK_DUP2
    int fd;
    int old_fd;            // SPORK_DUP2 only
    const char* filename;  // SPORK_OPEN only
    int flags;             // SPORK_OPEN only
    mode_t mode;           // SPORK_OPEN only
};

pid_t spork(const char* file, const spork_file_action_t* file_actions, int n_file_actions, char* argv[]);

Here’s how spork works.

  1. First, spork forks a child process.
  2. The child process loops over the file_actions array (there are n_file_actions elements) and performs each file action in turn. A file action fa means different things depending on its type. Specifically:
    • fa->type == SPORK_OPEN: The child process opens the file named fa->filename with flags fa->flags and optional mode fa->mode, as if by open(fa->filename, fa->flags, fa->mode). The opened file descriptor is given number fa->fd. (Note that this requires multiple steps, since the file must be first opened and then moved to fa->fd.)
    • fa->type == SPORK_CLOSE: The child process closes file descriptor fa->fd.
    • fa->type == SPORK_DUP2: The child process makes fa->fd a duplicate of fa->old_fd.
  3. Finally, the child process execs the given file with argument list argv.
  4. If all these steps succeed, then spork returns the child process ID. If any of the steps fails, then either spork returns –1 and creates no child, or the child process exits with status 127. In particular, if a file action fails, then the child process exits with status 127 (and does not call exec).

This function uses spork to print the number of words in a file to standard output.

void print_word_count(const char* file) {
    spork_file_action_t file_actions[1];
    file_actions[0].type = SPORK_OPEN;
    file_actions[0].fd = STDIN_FILENO;
    file_actions[0].filename = file;
    file_actions[0].flags = O_RDONLY;
    const char* argv[2] = {"wc", nullptr};
    pid_t p = spork("wc", file_actions, 1, argv);
    assert(p >= 0);
    waitpid(p, NULL, 0);
}

QUESTION PROC-5A. Use spork to implement the following function.

// Create a pipeline like `argv1 | argv2`.
// The pipeline consists of two child processes, one running the command with argument
// list `argv1` and one running the command with argument list `argv2`. The standard
// output of `argv1` is piped to the standard input of `argv2`.
// Return the PID of the `argv2` process or -1 on failure.
pid_t make_pipeline(char* argv1[], char* argv2[]);

QUESTION PROC-5B. Now, implement spork in terms of system calls you already know. For full credit, make sure you catch all errors. Be careful of SPORK_OPEN.

QUESTION PROC-5C. Can fork be implemented in terms of spork? Why or why not?

QUESTION PROC-5D. At least one of the file action types is redundant, meaning a spork caller could simulate its behavior using the other action types and possibly some additional system calls. Say which action types are redundant, and briefly describe how they could be simulated.

PROC-6. File descriptor facts

Here are twelve file descriptor-oriented system calls.

accept bind close connect dup2 listen
open pipe read select socket write

⚠️ The accept, bind, connect, listen, and socket system calls are covered in the synchronization unit.

QUESTION PROC-6A. Which of these system calls may cause the number of open file descriptors to increase? List all that apply.

QUESTION PROC-6B. Which of these system calls may close a file descriptor? List all that apply. Note that some system calls might close a file descriptor even though the total number of open file descriptors remains the same.

QUESTION PROC-6C. Which of these system calls can block? List all that apply.

QUESTION PROC-6D. Which system calls can open at least one file descriptor where that file descriptor is suitable for both reading and writing? List all that apply.

QUESTION PROC-6E. ⚠️ Which system calls must a network server make in order to receive a connection on a well-known port? List all that apply in order, first to last. Avoid unnecessary calls.

QUESTION PROC-6F. ⚠️ Which system calls must a network client make in order to (1) connect to a server, (2) send a message, (3) receive a reply, and (4) close the connection? List all that apply in order, first to last. Avoid unnecessary calls.

PROC-7. Duplication

Mark Zuckerberg hates duplicates (because Winklevii). He especially hates the dup2 system call, because he can’t remember its order of arguments.

⚠️ Some parts of this question rely on material from the synchronization unit.

QUESTION PROC-7A. What is the order of arguments for dup2? Is it (A) dup2(oldfd, newfd) or (B) dup2(newfd, oldfd)? (Here, oldfd is the pre-existing file descriptor.)

Mark wants to make dup2 obsolete by changing other system calls. He wants to:

These system calls can fail, meaning they return –1 and set errno to an error code.

QUESTION PROC-7B. Assuming a single-threaded process, show how to implement open’s functionality in terms of these new system calls. Don’t worry about mode. Unix’s open cannot set errno to EBADF; neither should yours.

int open(const char* path, int oflag) {

QUESTION PROC-7C. ⚠️ Your open implementation likely has a race condition if used in a multithreaded process: a bug can occur if two threads call open at about the same time. Explain this race condition briefly (two or three sentences max).

QUESTION PROC-7D. ⚠️ Solve this race condition using synchronization objects. You may introduce global variables, which we’ll assume are initialized. Again, be careful to handle error conditions properly.

int open(const char* path, int oflag) {

QUESTION PROC-7E. Can these system calls (without dup or dup2) implement a shell pipeline? Why or why not? Be brief.

Sheryl Sandberg is sympathetic to Mark’s psychological issues, but suggests instead that he replace dup and dup2 with:

QUESTION PROC-7F. Complete the following function, using at least pipe, fork, execvp, fdswap, and close. Do not use dup, dup2, or pipeonto, and don’t worry too much about error conditions. Make sure to implement pipe hygiene. Hint: The programs in cs61-lectures/synch2 might be useful references.

// simplepipeline_fdswap(cmd1, cmd2)
//    Fork and execute the pipeline `cmd1 | cmd2`. Return the `pid_t` corresponding to `cmd2`
//    (or return -1 with an appropriate error code if the pipeline could not be created).

pid_t simplepipeline_fdswap(const char* cmd1, const char* cmd2) {
    char* const cmd1_argv[] = { (char*) cmd1, nullptr };
    char* const cmd2_argv[] = { (char*) cmd2, nullptr };

PROC-8. Tripe

This question concerns a new pipe-like feature called a tripe. A tripe has one write end and two read ends. Any data written on the write end is duplicated onto both read ends, which observe the same stream of characters.

QUESTION PROC-8A. A tripe can be emulated using regular pipes and a helper process, where the helper process runs the following code:

     void run_tripe(int fd1, int fd2, int fd3) {
/*1*/    while (true) {
/*2*/        char ch;
/*3*/        ssize_t n = read(fd1, &ch, 1);
/*4*/        assert(n == 1);
/*5*/        n = write(fd2, &ch, 1);
/*6*/        assert(n == 1);
/*7*/        n = write(fd3, &ch, 1);
/*8*/        assert(n == 1);
/*9*/    }
     }

How many regular pipes are required?

QUESTION PROC-8B. If run_tripe encounters end of file on fd1, it will assert-fail. It should instead close the write ends and exit. Change the code so it does this, using line numbers to indicate where your code goes. You may assume that read and write never return an error.

QUESTION PROC-8C. “Pipe hygiene” refers to closing unneeded file descriptors. What could go wrong with the tripe if its helper process had bad pipe hygiene (open file descriptors referring to unneeded pipe ends)? List all that apply.

  1. Undefined behavior.
  2. read from either tripe read end would never return end-of-file.
  3. write to the write end would never block.
  4. write to the write end would never detect the closing of a read end.
  5. None of the above.

PROC-9. Process control explanations

QUESTION PROC-9A. Match each system call with the shell feature that requires that system call for implementation. You will use each system call once.

  1. chdir
  2. fork
  3. open
  4. pipe
  5. waitpid without WNOHANG
  1. ;
  2. >
  3. |
  4. cd
  5. Most command lines use this

QUESTION PROC-9B. What is the best explanation for the function of zombie processes in Unix? Select one.

  1. Zombie processes automatically kill other processes that leak too much memory, keeping the OS as a whole more stable.
  2. Zombie processes ensure that process IDs are never reused.
  3. Zombie processes allow parent processes to reliably collect the statuses of their exited children.
  4. Zombie processes cannot be killed by most signals.
  5. None of the above.

QUESTION PROC-9C. What is the best explanation for why signal handlers in Unix should be kept simple? Select one.

  1. Signal handlers run with interrupts disabled, so a complex signal handler could violate process isolation.
  2. Signal handlers can be invoked at any moment, including in the middle of another function, so a complex signal handler is likely to violate assumptions on which the program and/or C++ libraries depend.
  3. The kernel ensures that signal handlers have no effect on primary memory.
  4. Signal handlers are called by the kernel rather than being called explicitly by other parts of the program.
  5. None of the above.

QUESTION PROC-9D. What is the best explanation for why the pipe system call precedes the fork system call when setting up a pipeline in a shell? Select one.

  1. The fork system call creates a new, isolated process whose file descriptor table cannot be influenced by other processes.
  2. In the shell grammar, a command is part of a pipeline, so the pipe is encountered first, before the command is forked.
  3. Pipe hygiene requires that all extraneous ends of a pipe are closed in all processes, including the parent shell.
  4. The read end of a pipe is created after the write end.
  5. None of the above.

QUESTION PROC-9E. What is the best explanation for the difference between foreground conditional chains and background conditional chains? Select one.

  1. In a foreground conditional chain, the shell waits for each pipeline in sequence, while in a background conditional chain, all commands in the pipelines run in parallel.
  2. Foreground conditional chains never create zombie processes.
  3. Background conditional chains do not write to the shell’s standard output.
  4. Foreground conditional chains do not create subshells.
  5. None of the above.

PROC-10. Piping hot

QUESTION PROC-10A. Fill in the blanks so that each command line prints 61, and only 61, to the shell’s standard output. The resulting command lines should be valid according to the shell grammar. At most one of your fragments may contain the string 61.

  1. _________
  2. _________ || echo 61
  3. false && _________ && echo 61
  4. echo ab | tr _________
  5. yes _________ echo 61

In the remaining questions, we provide strace output for attempts at a shell running a two-process pipeline, echo foo | wc -c. For each question, you are to characterize the shell. This is Shell X1.

58797 pipe([3, 4])                      = 0
58797 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f892b19ea10) = 58798
58797 close(4)                          = 0
58798 close(3)                          = 0
58797 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f892b19ea10) = 58799
58798 dup2(4, 1)                        = 1
58798 close(4)                          = 0
58798 execve("/bin/echo", ["/bin/echo", "foo"], 0x7ffdea26fe98 /* 57 vars */ <... detached ...>
58797 close(3)                          = 0
58797 wait4(58798,  <... unfinished ...>
58799 dup2(3, 0)                        = 0
58799 close(3)                          = 0
58799 execve("/usr/bin/wc", ["/usr/bin/wc", "-c"], 0x7ffdea26fe98 /* 57 vars */ <... detached ...>
58797 <... wait4 resumed> NULL, 0, NULL) = 58798
58797 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=58798, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
58797 wait4(58799, NULL, 0, NULL)       = 58799
58797 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=58799, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
58797 exit_group(0)                     = ?
58797 +++ exited with 0 +++

QUESTION PROC-10B. What is the process ID of the parent shell?

QUESTION PROC-10C. Does Shell X1 wait for the right-hand process, the left-hand process, or both?

QUESTION PROC-10D. Does Shell X1 appear to implement two-process pipelines correctly?

QUESTION PROC-10E. This is Shell X2. It is incorrect.

58969 pipe([3, 4])                      = 0
58969 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6d566b0a10) = 58970
58970 close(3)                          = 0
58970 dup2(4, 1)                        = 1
58970 close(4)                          = 0
58970 execve("/bin/echo", ["/bin/echo", "foo"], 0x7ffcc0a30220 /* 57 vars */ <... detached ...>
58969 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6d566b0a10) = 58971
58971 dup2(3, 0)                        = 0
58971 close(3)                          = 0
58971 execve("/usr/bin/wc", ["/usr/bin/wc", "-c"], 0x7ffcc0a30220 /* 57 vars */ <... detached ...>
58969 close(3)                          = 0
58969 close(4)                          = 0
58969 wait4(58970,  <... unfinished ...>
58969 <... wait4 resumed> NULL, 0, NULL) = 58970
58969 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=58970, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
58969 wait4(58971, NULL, 0, NULL)       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
58969 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
58969 +++ killed by SIGINT +++

Give an strace line, including process ID, that, if added in the right place, would make Shell X2 appear to implement two-process pipelines correctly.

QUESTION PROC-10F. This is Shell X3. It is incorrect, though it appears to run correctly on this specific pipeline.

59026 pipe([3, 4])                      = 0
59026 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0dfa757a10) = 59027
59026 close(4)                          = 0
59027 close(3)                          = 0
59026 wait4(59027,  <... unfinished ...>
59027 dup2(4, 1)                        = 1
59027 close(4)                          = 0
59027 execve("/bin/echo", ["/bin/echo", "foo"], 0x7ffe0672b1b0 /* 57 vars */ <... detached ...>
59026 <... wait4 resumed> NULL, 0, NULL) = 59027
59026 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=59027, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
59026 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f0dfa757a10) = 59028
59026 close(3)                          = 0
59026 wait4(59028,  <... unfinished ...>
59028 dup2(3, 0)                        = 0
59028 close(3)                          = 0
59028 execve("/usr/bin/wc", ["/usr/bin/wc", "-c"], 0x7ffe0672b1b0 /* 57 vars */ <... detached ...>
59026 <... wait4 resumed> NULL, 0, NULL) = 59028
59026 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=59028, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
59026 exit_group(0)                     = ?
59026 +++ exited with 0 +++

Give a two-process Unix pipeline that Shell X3 will not appear to run correctly, and/or briefly describe problem with Shell X3.