Pipes and Redirection

Before the midterm, we started getting some practice with fork and exec; most of you (who came to class) were able to get processes running, but didn't get a chance to play with pipes. Today we'll experiment with both pipes and redirection of standard input and standard output.

Learning Objectives

Be able to complete parts 1-4 of assignment 4
Redirect standard input and standard output
Construct simple pipelines
Implement proper pipe and file descriptor hygiene
Develop/identify test programs that will help you debug your shell

Getting Started

You can find today's code in the l16 directory of the cs61-exercises repository.

Cat Wrapper

You should be familiar with the cat command, which displays a file. If not, you can always check out the man page.

We're going to start by writing some simple wrappers to give you practice redirecting standard in and standard out. Take a look at the file cat_wrapper_out.c. This is a skeleton program whose goal is to execute the cat command, redirecting its output to the file cat.out. (We didn't claim this was a particularly useful program, but it will get us started.)

1. Complete the program by adding two system calls: one that makes the file descriptor returned by open the target of standard output, and one that exhibits proper file descriptor hygiene.

2. Once that's running, copy cat_wrapper_out.c to cat_wrapper_in.c and modify the program to redirect standard input instead of standard output (you can pick whatever you want to be the file from which you take your input -- we used the same file we produced with cat_wrapper_out). You can make your Makefile build cat_wrapper_in by adding it to the PROGRAMS line.

You might find it instructive to run one of your wrapper programs under strace. (You will probably find it easier to do this with cat_wrapper_out. Why?)

3. Important Question: Consider cat_wrapper_out. Suppose you forked a child right before the block of code in which you call execvp, and then ran cat in the child process. Would you expect the output to be displayed on the screen or placed in the file? Why? Try it and see if you were right. What would you expect if you moved the fork before the open?

Take Away

Redirection is an incredibly powerful concept that lets us make effective use of the command line, and it's relatively trivial to implement given the right APIs.

Optional: Test your understanding of argument processing by copying cat_wrapper_out.c to wrapper_out.c and writing a wrapper that takes any command and its command line (i.e., an argument vector) and executes it, redirecting its standard output to a specific file.

Fork meets Assignment 3

Read the program in fork2.c.

1a. When you run this, how many processes will be created? Run the program to see if you were right.

b. Now, run the program a couple of times from the shell, and then run the program using the shell to redirect standard output to a file. What happened? Can you explain why?

HINT1

HINT2

c. What do you think the following will produce: ./fork2 | cat? Try it and see.

d. Read HINT2 above and predict what will happen if you remove the \n characters from the printf statements.

2a. Sometimes system builders do not get everything right... Take a look at the program forkmix.c. Predict how many total characters will be output when you run this program (don't forget the newline character).

b. Now, run this program using a pipe to count the characters:

% ./forkmix | wc

Was the output what you expected?

c. Now, let's redirect output to a file and then count the characters in the file:

% ./forkmix > OUT && wc OUT

You would think that these should produce the same numbers. Do they?

Here is why they do not!

3. Finally, take a look at forkmix2.c. How many characters do you expect to find in the output file? Why? Run the program with:

% ./forkmix2 OUTPUT-FILE

Were you correct?

Take Away

Now that you understand caching from assignment 3, you can understand complex interactions between parent and child processes.

When programming with multiple processes, don't lose track of process state that ends up shared between parents and children.

Make sure you understand why forkmix and forkmix2 behave differently.

More Pipes

There are two primary motivations for making sure that we properly close pipe file descriptors: (1) if we do not, programs might behave incorrectly (e.g., a reader may never exit), and (2) pipes consume kernel resources. How might we figure out how much memory a pipe consumes?

Idea: What if we continually write to a pipe, but the reader never reads from it?

Let's try it! Read, build and run pipesizer.c. How much space does the kernel allocate for a pipe buffer?

The video presented the idea of "pipe hygiene," demonstrating how things can go wrong if you don't close things properly. Let's step through a couple of examples of pipelines where you will diagnose the condition that causes programs to exit.

Each of the following creates a pipeline; explain what actions cause the programs to exit. Run strace to verify your answer. (Feel free to consult the man page for any programs you do not recognize.)

1. yes "I love Cs61!" | head -10

2. echo foo | wc -l

3. cat < my-file | tail -4

4. cat /dev/null | wc

5. One last question:

The fork/exec video showed how to use waitpid to wait for a process to exit. Explain how you might use a pipe to wait for a child to exit. What are the advantages and disadvantages of each approach (i.e., waitpid versus the pipe)?

Wrapping Up

Please fill out this survey.