Ancient CS 61 Content Warning!!!!!1!!!
This is not the current version of the class.
This site was automatically translated from a wiki. The translation may have introduced mistakes (and the content might have been wrong to begin with).

Introduction

Today's section:

Pull the latest version of the sections repository by entering in the terminal

git pull

The cs61-section/s10 directory contains a few files we'll use throughout the section notes and some sample structures we might use to parse sh61 command lines.

Process I/O

Basics

The three I/O streams that are given by default to every process are stdin, stdout, and stderr.

For example, if we run $ echo foo, the following diagram describes how the echo program interacts with stdin, stdout, and stderr:

Section10EchoFoo.png

Note that in this example, foo is a command-line argument, not read in from stdin.

Redirections

Redirections to redirect a program’s input and output from and to files of our choosing. For example, let's run $ echo foo > temp.txt. If you open up temp.txt, you'll find that it has a single line containing foo. The right angle bracket told our shell to redirect echo's stdout to temp.txt and rewrite it:

Section10EchoFooRedirect.png

If instead we wanted to append to a file, we could use a double right angle bracket: $ echo bar >> temp.txt. If you run that command, you'll find that temp.txt now has a second line, saying bar.

We can also redirect the contents of a file to a program's stdin. Let's run $ wc -w < temp.txt. wc is a common Unix program used for counting words; the -w command-line argument tells it to count the number of words. wc usually takes input directly from the console, but here we have redirected stdin to a file:

Section10wc.png

Pipes

But what if we wanted to direct the output of one file directly to the input of another, without sending the output to a file first? We can do this using an operating system abstraction called pipes. Pipes are a way of linking an I/O stream from one program directly to the I/O stream of another program, in real time. Each pipe has a read end and a write end; characters written to the write end can be immediately read from the read end. Each program interacts with its end of a pipe just like any other file descriptor; it can call read, write, close, etc. When a program calls read on the read end of a pipe, the read call blocks until something is written to the write end of the pipe, at which point the read call returns.

Let's run $ echo "foo bar baz" | wc -w. You should find 3 printed out at the terminal. What's going on here? We have used the | character to tell the shell to create a pipe between the echo and wc programs, and to redirect echo's stdout to the write end of the pipe, and wc's stdin to the read end of the pipe:

Section10EchoFooPipe.png

In particular, the shell has used a system call named dup2 to actually set the file descriptor associated with stdout in echo to the write end of the pipe, and to set the file descriptor associated with stdin in wc to the read end of the pipe. echo and wc have no idea that they aren't writing to and reading from the console!

More about the Shell

Useful Shell Utilities

Here's some common shell utilities that you may find useful in your daily life and for testing your shell:

Shell Program Description
cat Write standard input to standard output.
wc Count lines, words, and characters in standard input, write result to standard output.
head -n N Print first N lines of standard input.
tail -n N Print last N lines of standard input.
echo ARG1 ARG2... Print arguments.
printf FORMAT ARG... Print arguments with printf-style formatting.
true Always succeed (exit with status 0).
false Always fail (exit with status 1).
sort Sort lines in input.
uniq Drop duplicate lines in input (or print only duplicate lines).
tr Change characters; e.g., tr a-z A-Z makes all letters uppercase.
ps List processes.
curl URL Download URL and write result to standard output.
sleep N Pause for N seconds, then exit with status 0.
cut Cut selected portions of each line of a file.

Common Shell Syntax

Features of normal shells and sh61:

Features of normal shells, but not sh61:

Precedence:

Shell Exercises 1

  1. Print the contents of the files fork_1.c, fork_2.c, and fork_3.c in the cs61-sections/s10 directory, in order, using a single command line.
  2. Repeat #1, but store the result in a file called cs61.
  3. Do #2 again but produce a different command line.
  4. What is the exit status of true?
  5. What is the exit status of false?
  6. What is the exit status of curl http://ipinfo.io/ip?
  7. What is the exit status of curl cs61://ipinfo.io/ip?
  8. curl a URL and print Success if the download succeeds or Fail if the download fails.
  9. Repeat the above, but downloading the URL to a file called ip.
  10. Count the number of lines in the file words. Hint: use the wc utility.
  11. Print every unique line in the file words exactly once.
  12. Count the number of unique lines in the file words.
  13. Write a command that could help you discover whether a shell really executes the two sides of a pipeline in parallel. Describe the result if (1) the shell executed the left side to completion first (and buffered the output for the right side to read), (2) the shell executed the sides in parallel.

Solutions here.

Exercise: Parsing

So how does the shell do all its magic? Let's talk about the syntax for sh61 and some ways you might represent it in a data structure.

sh61 Grammar

This, taken from the problem set, is a grammar representing command lines in sh61:

commandline ::= list
          |  list ";"
          |  list "&"
list     ::=  conditional
          |   list ";" conditional
          |   list "&" conditional
conditional ::=  pipeline
          |   conditional "&&" pipeline
          |   conditional "||" pipeline
pipeline ::=  command
          |   pipeline "|" command
command  ::=  [word or redirection]...
redirection  ::=  redirectionop filename
redirectionop  ::=  "<"  |  ">"  |  "2>"

This is an example of a BNF Grammar. A BNF grammar gives recursive definitions for a few terms (the "words" of the grammar). The ::= indicates definition (i.e., commandline is defined to be list | list ";" | list "&". On the definition side, the | is a logical or. For example, in sh61's grammar, a commandline is composed of a list, or a list followed by a semicolon, or a list followed by an ampersand.

You may notice that in some of the later definitions, the term being defined is used in the definition. This recursive definition allows for lists or trees of terms to be chained together. Let's take the definition of list for example:

list     ::=  conditional
          |   list ";" conditional
          |   list "&" conditional

This reads "a list is a conditional, or a list followed by a semicolon and then a conditional, or a list followed by an ampersand and then a conditional." But this means that the list is just a bunch of conditionals, linked by semicolons or ampersands! Notice that the other recursive definitions in sh61's grammar also follow this pattern. In other words:

What about the definition of command? [word or redirection]... seems a bit vague; in this case, you should use your intuition. When you type a command into the terminal, it's just a series of words representing the program name and its arguments, possibly followed by some number of redirection commands.

Representing a Parsed Command Line

We now consider two ways to represent our parsed data in a data structure appropriate for our shell: a tree representation and a list representation. Students often are biased toward the tree representation, which precisely represents the structure of the grammar, but the list representation is in some ways easier to handle! The tradeoff is simplicity vs. execution time: the list representation requires more work to answer certain important questions about commands. (But command lines are small enough in practice that the extra work doesn’t matter.)

Again, the overall structure of a line is:

Tree representation

The following tree-formatted data structure precisely represents this grammar structure.

For instance, consider this command line:

a & b | c && d ; e | f &

which comprises three conditionals, four pipelines, and six commands. (Exercise: What are the four pipelines? What are the three conditionals?)

In tree structure, that would look like this. (The list—the “linked list of conditionals”—is the whole first line.)

ShellExampleTree.png

(We’re not showing the multiple words and redirections that would be part of each command.)

Questions

  1. Sketch a set of C structures corresponding to this design.
  2. How can one traverse your C structures to decide which commands to run at which times?
  3. How can one traverse them to determine which conditionals should be executed in the foreground or the background?
  4. Given a command in a pipeline, how can one examine the command C structure to determine whether the command is on the left-hand side of a pipe?
  5. How about the right-hand side of a pipe?

Flat Linked List

Alternatively, we can create a single linked list of all of the commands. In this case, we also store the connecting operator (one of & ; \| && or \|\|). For our sample command line

a & b | c && d ; e | f &

that might look like this:

ShellExampleList.png

We now consider how we would parse and traverse this data structure.

Questions

  1. Write a set of C structures corresponding to this design.
  2. Given a command structure corresponding to the first command in a conditional, how can shell code determine whether the conditional should be executed in the foreground or the background?
  3. Given a command structure in a pipeline, how can shell code determine whether the command is on the left-hand side of a pipe?
  4. Given a command structure in a pipeline, how can shell code determine whether the command is on the right-hand side of a pipe?
  5. Sketch out code for parsing a command line into these C structures.

Sample solutions here.