Kernel Section 3: Pipes

College students: Attendance at section is required and recorded for all college students. All college students should fill out this attendance form when attending section. For more on the attendance policy and what to do if you miss section refer to the syllabus here and here.

Extension School students: Extension School students are required to self-report weekly section participation using this participation form. There are two ways to participate in section: (1) attend section live (for most students, this is one of the zoom sections, but local students are welcome to attend in-person sections) OR (2) watch a section recording (Canvas > Zoom > Published Recordings). Both participation methods are equally valid towards your participation grade.

This section bridges the kernel unit with our next unit, on data storage, by investigating the implementation of one of the most important and powerful abstractions provided by modern operating system kernels: the pipe.

Update your section code repository before beginning.

Part A: Inter-process communication

Processes in WeensyOS, like processes in any real operating system, are primarily isolated—one process can’t modify another’s memory, for example. But processes totally isolated from one another would be useless, and abstractions that allow processes to communicate are an important aspect of OS design.

Real OS inter-process communication mechanisms include:

We’ll focus on a particularly important communication channel, the byte stream.

Byte stream communication works like this.

EXERCISE A1. Take 5 minutes or so to design a system call interface for byte stream communication in WeensyOS. Think about questions like:

  1. (Very important for prototyping!) How simple can you make the interface?
  2. How shall a new byte stream be created?
  3. Which processes should have access to the byte stream?
  4. System calls are inherently expensive, since they involve protected control transfers. Can you minimize the number of system calls it takes to transfer a message from a writer process to a reader process?
  5. How should bytes be stored when they are in transit (between when the writer sends them and when the reader receives them)?

Part B: Byte streams in WeensyOS

The WeensyOS we handed out in cs61-sections/kernels1 supports byte streams already! Run make run-pipe to see it in action.

The writer process is defined in p-pipewriter.cc, the reader process in p-pipereader.cc. The writer repeatedly (1) picks a random message to send (each message is terminated by a newline '\n'), (2) writes that message to the system pipe, (3) prints a success note to the screen, and then (4) waits for 1–3 seconds. The reader repeatedly (1) reads a message up to a newline and then (2) prints a success note to the screen.

These processes use system calls sys_pipewrite and sys_piperead, which access a single system-wide byte stream called the system pipe. Their specifications are:

// sys_pipewrite(buf, sz)
//    Copy up to `sz` bytes of data from `buf` into the system pipe.
//    Returns number of bytes written or a negative error code (if, for
//    instance, the system pipe’s data transfer buffer is full).
ssize_t sys_pipewrite(const void* buf, size_t sz);

// sys_piperead(buf, sz)
//    Read up to `sz` bytes of data from the system pipe into `buf`.
//    Bytes read are removed from the system pipe.
//    Returns number of bytes read or a negative error code (if, for
//    instance, the system pipe’s data transfer buffer is empty).
ssize_t sys_piperead(void* buf, size_t sz);

EXERCISE B1. How does this interface differ from the interface you designed? Is it simpler or more complicated? How does the interface stack up against the questions in Exercise A1?

The kernel implementations of these system calls live in kernel.cc; here they are.

char pipebuf[1];
size_t pipebuf_len = 0;

ssize_t syscall_pipewrite(const char* buf, size_t sz) {
    // See `sys_pipewrite` in `u-lib.cc` for specification.
    if (sz == 0) {
        // nothing to write
        return 0;
    } else if (pipebuf_len == 1) {
        // kernel buffer full, process should try again
        return E_AGAIN;
    } else {
        // write one byte
        pipebuf[0] = buf[0];
        pipebuf_len = 1;
        return 1;
    }
}

ssize_t syscall_piperead(char* buf, size_t sz) {
    // See `sys_piperead` in `u-lib.cc` for specification.
    if (sz == 0) {
        // no room to read
        return 0;
    } else if (pipebuf_len == 0) {
        // kernel buffer empty, process should try again
        return E_AGAIN;
    } else {
        // read one byte
        buf[0] = pipebuf[0];
        pipebuf_len = 0;
        return 1;
    }
}

EXERCISE B2. Where are bytes stored when they are in transit (i.e., after the writer sends them, but before the reader receives them)?

EXERCISE B3. What’s the maximum number of in-transit bytes that can stored simultaneously?

EXERCISE B4. What’s the minimum number of sys_pipewrite system calls required to write an N-byte message?

EXERCISE B4. What’s the minimum number of sys_piperead system calls required to read an N-byte message?

EXERCISE B5. When will sys_pipewrite return zero?

EXERCISE B6. Run make run-pipe and check out the number of system calls actually required to write and read N-byte messages. What do you observe?

Part C: Scheduling

In the rest of section, you’ll improve the kernel’s implementation of sys_pipewrite and sys_piperead to reduce the number of system calls per message as much as possible. There are three main techniques you’ll use: changing scheduling (the order processes run), increasing the transfer buffer size, and implementing blocking.

EXERCISE C1. Why does the pipe writer take so many system calls to write a message?

Hint: A good first step is to observe what’s happening in more detail. You may want to use log_printf to record all system calls made in order. For example, uncomment this line in syscall() in kernel.cc:

// log_printf("p%d: %s\n", current->pid, syscall_name(regs->reg_rax));

log_printf slows down WeensyOS a lot, so the system call counts will decrease, but the effects will still be visible.

EXERCISE C2. Change the sys_pipewrite system call implementation to reduce this problem.

EXERCISE C3. Change the sys_piperead system call implementation to reduce this problem.

Part D: Buffer size

Even with these changes, it takes a minimum of one system call per byte to transfer a message. This is because the WeensyOS transfer buffer, which stores bytes in transit, is only one byte big.

EXERCISE D1. Change the size of the pipe buffer to fit more than one byte, and change the sys_pipewrite and sys_piperead handlers to use this bigger pipe buffer. Can you get the pipe writer down to 1 or 2 system calls per message?

Hint: Don’t be overly concerned with the performance of the system call implementation—feel free to copy data within the buffer as convenient, or to copy one byte at a time in an in-kernel loop.

EXERCISE D2. Why is the pipe reader still making so many system calls on the second and subsequent messages?

Part E: Blocking

In the final part of the section, change the WeensyOS kernel so that the relevant system calls block instead of poll.

Blocking and polling are different techniques that processes can use while waiting for an event. In a polling implementation, a process remains runnable and constantly checks for whether an event has occurred. (It’s like a kid in a car ride: “Are we there yet? Are we there yet? Are we there yet? Are we there yet?”) Polling is often relatively easy to implement and can detect quickly when an event occurs, but while waiting for the event, polling wastes a lot of CPU time on active checks. In a blocking implementation, a process becomes non-runnable until the event occurs, when it is explicitly woken up. A blocking implementation is generally more complex than a polling implementation, but more efficient.

WeensyOS represents process runnability with the proc::state member. A runnable process has ptable[PID].state == P_RUNNABLE. To block a process, set its state to P_BLOCKED.

EXERCISE E1. Update the pipe reader and/or pipe writer to avoid the behavior you analyzed in Exercise D2.

Optional advanced work

This exercise touches on several big ideas in operating systems design, with respect to both performance (scheduling, blocking vs. polling) and convenience (the pipe abstraction). Here’s how to go further.