Introduction
Today's section:
- Gives an overview of I/O streams
- Provides examples of common shell utilities and shell syntax
- Explores the sh61 syntax and some structures for parsing it
Pull the latest version of the sections repository by entering the following in the terminal:
$ git pull
The cs61-sections/s10 directory contains a few files we'll use throughout the section notes and some sample structures we might use to parse sh61 command lines.
Process I/O
Basics
The three I/O streams that are given by default to every process are stdin, stdout, and stderr.
- stdin is the default input stream for the program. When your C program expects input from the console or wants to read something without opening a file, it usually reads from stdin.
- stdout is the default output stream for the program. When your C program prints something to the console, it is writing to stdout.
- stderr is the default output stream for errors. Why is this useful if we already have a stdout? Sometimes we want to redirect a program's stdout to another file or program. If errors were printed to stdout, they would be redirected along with the normal output, and we might not notice them until it's too late.
For example, if we run $ echo foo, the following diagram describes how the echo program interacts with stdin, stdout, and stderr:

Note that in this example, foo is a command-line argument, not read in from stdin.
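To see why a separate stderr matters, here is a minimal shell sketch. The function name emit is made up for illustration, and the >&2 and 2> operators it relies on are covered under Common Shell Syntax later in these notes.

```shell
# emit is a hypothetical helper that writes one line to each output stream.
emit() {
    echo "result"         # written to stdout (file descriptor 1)
    echo "warning" >&2    # written to stderr (file descriptor 2)
}

# Command substitution captures stdout only. stderr is sent to /dev/null
# here; without that redirection it would still reach the terminal.
captured=$(emit 2>/dev/null)
echo "captured: $captured"    # prints "captured: result"
```

Because the warning never enters stdout, whatever consumes the captured output sees only the result line.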
Redirections
Redirections let us redirect a program’s input and output from and to files of our choosing. For example, let's run $ echo foo > temp.txt. If you open up temp.txt, you'll find that it has a single line containing foo. The right angle bracket told our shell to redirect echo's stdout to temp.txt, replacing any previous contents of the file:

If instead we wanted to append to a file, we could use a double right angle bracket: $ echo bar >> temp.txt. If you run that command, you'll find that temp.txt now has a second line, saying bar.
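The difference between > and >> can be checked with a short sketch like the following. It assumes the widely available mktemp utility and uses a scratch directory (the variable names are illustrative) so that no existing temp.txt is touched.

```shell
dir=$(mktemp -d)                # scratch directory for this demonstration

echo foo > "$dir/temp.txt"      # create (or truncate) the file, then write foo
echo bar >> "$dir/temp.txt"     # append a second line
after_append=$(cat "$dir/temp.txt")
echo "$after_append"            # prints foo, then bar

echo baz > "$dir/temp.txt"      # > truncates again: foo and bar are gone
after_overwrite=$(cat "$dir/temp.txt")
echo "$after_overwrite"         # prints baz

rm -r "$dir"                    # clean up the scratch directory
```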
We can also redirect the contents of a file to a program's stdin. Let's run $ wc -w < temp.txt. wc is a common Unix program used for counting words; the -w command-line argument tells it to count the number of words. When given no file arguments, wc reads from stdin, which is normally the console, but here we have redirected stdin to a file:

Pipes
But what if we wanted to send the output of one program directly to the input of another, without sending the output to a file first? We can do this using an operating system abstraction called pipes. Pipes are a way of linking an I/O stream from one program directly to the I/O stream of another program, in real time. Each pipe has a read end and a write end; characters written to the write end can be immediately read from the read end. Each program interacts with its end of a pipe just like any other file descriptor; it can call read, write, close, etc. When a program calls read on the read end of an empty pipe, the read call blocks until something is written to the write end of the pipe, at which point the read call returns. (If every write end has been closed, read returns 0 instead, signaling end of file.)
Let's run $ echo "foo bar baz" | wc -w. You should find 3 printed out at the terminal. What's going on here? We have used the | character to tell the shell to create a pipe between the echo and wc programs, and to redirect echo's stdout to the write end of the pipe, and wc's stdin to the read end of the pipe:

In particular, the shell has used a system call named dup2 to actually set the file descriptor associated with stdout in echo to the write end of the pipe, and to set the file descriptor associated with stdin in wc to the read end of the pipe. echo and wc have no idea that they aren't writing to and reading from the console!
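Pipelines are not limited to two programs. As a minimal sketch, the lines below chain three processes with two pipes; tr and wc are standard utilities, and the variable names are illustrative. The final tr -d ' ' only strips the padding that some wc implementations add around the number.

```shell
# Two pipes connect three processes: printf -> tr -> wc.
# tr uppercases whatever arrives on its stdin and writes it to its stdout.
upper=$(printf 'foo bar\nbaz\n' | tr a-z A-Z)
echo "$upper"    # prints FOO BAR, then BAZ

# wc -w counts the words it reads from the second pipe.
count=$(printf 'foo bar\nbaz\n' | tr a-z A-Z | wc -w | tr -d ' ')
echo "$count"    # prints 3
```

Each program in the middle of the pipeline has both its stdin and its stdout connected to pipes, yet it runs exactly as it would at the console.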
More about the Shell
Useful Shell Utilities
Here are some common shell utilities that you may find useful in your daily life and for testing your shell:
Shell Program | Description |
---|---|
cat | Write standard input to standard output. |
wc | Count lines, words, and characters in standard input, write result to standard output. |
head -n N | Print first N lines of standard input. |
tail -n N | Print last N lines of standard input. |
echo ARG1 ARG2... | Print arguments. |
printf FORMAT ARG... | Print arguments with printf-style formatting. |
true | Always succeed (exit with status 0). |
false | Always fail (exit with status 1). |
sort | Sort lines in input. |
uniq | Drop duplicate lines in input (or print only duplicate lines). |
tr | Change characters; e.g., tr a-z A-Z makes all letters uppercase. |
ps | List processes. |
curl URL | Download URL and write result to standard output. |
sleep N | Pause for N seconds, then exit with status 0. |
cut | Cut selected portions of each line of a file. |
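Several of these utilities are most useful in combination. The following sketch feeds three made-up "name:value" lines through head, tail, and cut; the -d and -f options to cut (field delimiter and field number) are standard but are not described in the table above.

```shell
# head keeps the first two lines (a:1 and b:2), tail keeps the last of
# those (b:2), and cut splits that line on ":" and keeps field 2.
value=$(printf 'a:1\nb:2\nc:3\n' | head -n 2 | tail -n 1 | cut -d: -f2)
echo "$value"    # prints 2
```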
Common Shell Syntax
Features of normal shells and sh61:
- command1 ; command2. Sequencing. Run command1, and when it finishes, run command2.
- command1 & command2. Backgrounding. Start command1, but don't wait for it to finish. Run command2 right away.
- command1 && command2. On success. Run command1. If it finishes by exiting with status 0, run command2.
- command1 || command2. On failure. Run command1. If it terminates abnormally (e.g., with a segfault), or exits with a status ≠ 0, then run command2.
- command1 | command2. Pipe. Run command1 and command2 in parallel. command1’s standard output is hooked up to command2’s standard input. Thus, command2 reads what command1 wrote. The exit status of the pipeline is the exit status of command2.
- command > file. stdout redirection. Run command with its standard output writing to file. The file is truncated before the command is run.
- command < file. stdin redirection. Run command with its standard input reading from file.
- command 2> file. stderr redirection. Run command with its standard error writing to file.
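A few of these operators can be tried out with a sketch like the one below. It assumes a shell running with default options (in particular, without bash's pipefail option), and it uses the standard wait builtin, which is not otherwise covered in these notes.

```shell
# Conditionals: the right-hand side runs only if the status test passes.
true && echo "ran because true exited with status 0"
false || echo "ran because false exited with a nonzero status"

# Backgrounding: sleep starts, and echo runs without waiting for it.
sleep 1 &
echo "printed right away"
wait    # pause until the background sleep has finished

# A pipeline's exit status is the status of its last command.
false | true
pipeline_status=$?
echo "$pipeline_status"    # prints 0
```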
Features of normal shells, but not sh61:
- var=value. Sets a variable to a value.
- $var. Variable reference. Replaced with the variable’s value. There are several special variables; for instance, $? expands to the numeric exit status of the most recently executed foreground pipeline, and $$ expands to the shell’s own process ID.
- command >> file. Run command with its standard output appending to file. The file is not truncated before the command is run.
- command 2>&1. Run command with its standard error redirected to go to the same place as standard output.
- command 1>&2. Run command with its standard output redirected to go to the same place as standard error. Thus, echo Error 1>&2 prints Error to standard error.
- (command1; command2). Parentheses group commands into a “subshell.” The entire subshell can have redirections, and can have its output put into a pipe.
- command1 $(command2). Command substitution. The shell runs command2, then passes its output as arguments to command1. (Unless the substitution is quoted, the output is split into words, so it can become several arguments.)
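These features can be exercised in an ordinary shell (not sh61) with a sketch along these lines. The variable names are illustrative, and the tr -d ' ' stages only strip the padding that some wc implementations add.

```shell
greeting=hello                # var=value: no spaces around "="
echo "$greeting world"        # prints "hello world"

false || status=$?            # $? holds the exit status of false
echo "$status"                # prints 1

# 2>&1 sends stderr to the same place as stdout, so both lines reach wc.
both=$( (echo out; echo err >&2) 2>&1 | wc -l | tr -d ' ')
echo "$both"                  # prints 2

# A subshell groups two commands so that one pipe receives both outputs.
grouped=$( (echo one; echo two) | wc -l | tr -d ' ')
echo "$grouped"               # prints 2

# Command substitution: the inner pipeline's output becomes an argument
# of the outer echo.
subst=$(echo count: $(echo a b c | wc -w))
echo "$subst"                 # prints "count: 3"
```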
Precedence:
- | (highest)
- &&, ||
- ;, & (lowest)
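The effect of these precedence levels can be observed directly. In the sketch below (variable names are illustrative), each command line would print something different if the operators grouped another way.

```shell
# && and || share one precedence level and group left to right, so this
# parses as (true || echo skipped) && echo ran.
same_level=$(true || echo skipped && echo ran)
echo "$same_level"    # prints "ran"

# | binds tighter than &&, so this parses as false && (echo x | wc -l):
# the pipeline never runs, and the || branch supplies the output.
pipe_first=$(false && echo x | wc -l || echo "pipeline skipped")
echo "$pipe_first"    # prints "pipeline skipped"
```

If || had lower precedence than && (as in C), the first line would print nothing; if && bound tighter than |, the second would print a line count of 0.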
Shell Exercises 1
- Print the contents of the files fork_1.c, fork_2.c, and fork_3.c in the cs61-sections/s10 directory, in order, using a single command line.
- Repeat #1, but store the result in a file called cs61.
- Do #2 again but produce a different command line.
- What is the exit status of true?
- What is the exit status of false?
- What is the exit status of curl http://ipinfo.io/ip?
- What is the exit status of curl cs61://ipinfo.io/ip?
- curl a URL and print Success if the download succeeds or Fail if the download fails.
- Repeat the above, but downloading the URL to a file called ip.
- Count the number of lines in the file words. Hint: use the wc utility.
- Print every unique line in the file words exactly once.
- Count the number of unique lines in the file words.
- Write a command that could help you discover whether a shell really executes the two sides of a pipeline in parallel. Describe the result if (1) the shell executed the left side to completion first (and buffered the output for the right side to read), (2) the shell executed the sides in parallel.
Exercise: Parsing
So how does the shell do all its magic? Let's talk about the syntax for sh61 and some ways you might represent it in a data structure.
sh61 Grammar
This, taken from the problem set, is a grammar representing command lines in sh61:
commandline ::= list
              | list ";"
              | list "&"
list ::= conditional
       | list ";" conditional
       | list "&" conditional
conditional ::= pipeline
              | conditional "&&" pipeline
              | conditional "||" pipeline
pipeline ::= command
           | pipeline "|" command
command ::= [word or redirection]...
redirection ::= redirectionop filename
redirectionop ::= "<" | ">" | "2>"
This is an example of a BNF (Backus-Naur Form) grammar. A BNF grammar gives recursive definitions for a few terms (the "words" of the grammar). The ::= indicates definition (i.e., commandline is defined to be list | list ";" | list "&"). On the definition side, the | is a logical or. For example, in sh61's grammar, a commandline is composed of a list, or a list followed by a semicolon, or a list followed by an ampersand.
You may notice that in some of the later definitions, the term being defined is used in the definition. This recursive definition allows for lists or trees of terms to be chained together. Let's take the definition of list for example:
list ::= conditional
       | list ";" conditional
       | list "&" conditional
This reads "a list is a conditional, or a list followed by a semicolon and then a conditional, or a list followed by an ampersand and then a conditional." But this means that the list is just a bunch of conditionals, linked by semicolons or ampersands! Notice that the other recursive definitions in sh61's grammar also follow this pattern. In other words:
- A list is a series of conditionals, concatenated by ; or &.
- A conditional is a series of pipelines, concatenated by && or ||.
- A pipeline is a series of commands, concatenated by |.
- A redirection is one of <, >, or 2>, followed by a filename.
What about the definition of command? [word or redirection]... seems a bit vague; in this case, you should use your intuition. When you type a command into the terminal, it's just a series of words representing the program name and its arguments, possibly accompanied by some number of redirections.
Representing a Parsed Command Line
We now consider two ways to represent our parsed data in a data structure appropriate for our shell: a tree representation and a list representation. Students often are biased toward the tree representation, which precisely represents the structure of the grammar, but the list representation is in some ways easier to handle! The tradeoff is simplicity vs. execution time: the list representation requires more work to answer certain important questions about commands. (But command lines are small enough in practice that the extra work doesn’t matter.)
Again, the overall structure of a line is:
- A command is composed of words and redirections.
- A pipeline is composed of commands joined by |.
- A conditional is composed of pipelines joined by && or ||.
- A list is composed of conditionals joined by ; or & (and the last command in the list might or might not end with &).
Tree representation
The following tree-formatted data structure precisely represents this grammar structure.
- A command contains its executable, arguments, and any redirections.
- A pipeline is a linked list of commands. Since commands in a pipeline are always joined by |, the linked list contains all the structure we need.
- A conditional is a linked list of pipelines. But since adjacent pipelines can be connected by either && or ||, we need to store whether the link is && or ||.
- A list is a linked list of conditionals, each flagged as foreground (i.e., joined by ;) or background (i.e., joined by &). Note that while the conditional linktype doesn’t matter for the last pipeline in a conditional, the background flag does matter for the last conditional in a list, since the last conditional in a list might be run in the foreground or background.
For instance, consider this command line:
a & b | c && d ; e | f &
which comprises three conditionals, four pipelines, and six commands. (Exercise: What are the four pipelines? What are the three conditionals?)
In tree structure, that would look like this. (The list—the “linked list of conditionals”—is the whole first line.)

(We’re not showing the multiple words and redirections that would be part of each command.)
Questions
- Sketch a set of C structures corresponding to this design.
- How can one traverse your C structures to decide which commands to run at which times?
- How can one traverse them to determine which conditionals should be executed in the foreground or the background?
- Given a command in a pipeline, how can one examine the command C structure to determine whether the command is on the left-hand side of a pipe?
- How about the right-hand side of a pipe?
Flat Linked List
Alternatively, we can create a single linked list of all of the commands. In this case, we also store the connecting operator (one of &, ;, |, &&, or ||). For our sample command line
a & b | c && d ; e | f &
that might look like this:

We now consider how we would parse and traverse this data structure.
Questions
- Write a set of C structures corresponding to this design.
- Given a command structure corresponding to the first command in a conditional, how can shell code determine whether the conditional should be executed in the foreground or the background?
- Given a command structure in a pipeline, how can shell code determine whether the command is on the left-hand side of a pipe?
- Given a command structure in a pipeline, how can shell code determine whether the command is on the right-hand side of a pipe?
- Sketch out code for parsing a command line into these C structures.