In this assignment, you'll leverage fork
, exec
, pipe
, and several other
interesting system calls to build a command shell. You will implement simple
commands, conditional commands (&&
and ||
), pipes, background commands,
and redirections, as well as the incredible cd
command.
- Plan your work schedule! You should be done with Phases 1–3 by Saturday, 11/9. There’s no consequences for being late; just pace yourself.
- Due Thursday 11/14 11:59pm EST (one day later for DCE). (Please indicate your preferred grading commit using the “Grade this commit” button.)
Your shell implements a subset of the bash shell’s syntax, and is generally compatible with bash for the features they share. You may be interested in a tutorial for Unix shells.
Get the code
Start with the cs61-psets
Git repository you used for Problem Set
4 and run git pull handout main
to
merge our code, which is in the pset5
subdirectory, with
your previous work. If you have any “conflicts” from Problem Set 4, resolve them before continuing further. Run git push
to save
your work back to GitHub.
You may also create a new cs61-psets
repository for this assignment. Don’t
forget to enter your repository URL on the grading server.
Shell grammar
There are two main phases to writing a shell. The shell must parse commands from a command line string, and then it must execute those commands.
We’ve completed most of the parsing phase for you. The shell_tokenizer
object parses tokens from a command line, and can differentiate between normal
words like “echo
” and special control operators like “;
” or “>
”. The
shell_parser
object consumes tokens from shell_tokenizer
, but also
understands the shell grammar, and can determine where a
command (or pipeline or conditional) ends.
sh61 command lines obey the following grammar, which we explore in the first process section. Each input
line is a “commandline
” defined as follows:
commandline ::= (empty) | conditional | conditional ";" commandline | conditional "&" commandline conditional ::= pipeline | conditional "&&" pipeline | conditional "||" pipeline pipeline ::= command | pipeline "|" command command ::= word | redirection | command word | command redirection redirection ::= redirectionop filename redirectionop ::= "<" | ">" | "2>"
This grammar says, for example, that the command list “echo foo && echo bar & echo baz
” is parsed and executed as follows:
{ { echo foo } && { echo bar } } & { echo baz }
The &&
is “inside” the background command, so “echo foo && echo bar
” runs
in the background and “echo baz
” runs in the foreground. The overall command
list has two component conditionals, “echo foo && echo bar
” and “echo baz
”; these conditionals divide into three (trivial) pipelines, “echo foo
”,
“echo bar
”, and “echo baz
”; and each of those piplines contains a single
command.
A robust shell will detect errors in its input and handle them gracefully, but for this problem set, we promise that all the inputs we use in tests follow the grammar above.
About BNF grammars
Command execution
The main work of this assignment is implementing the shell.
If you’re confused about a shell feature and the tutorials and manuals don’t
help, run some experiments! The bash
shell, which is default on Docker, is
compatible with our shell. (The zsh
shell, which is default on Mac OS X, is
mostly compatible.)
See the commonly-used programs for testing ideas.
Run your shell by typing ./sh61
and entering commands at the prompt. Exit
your shell by typing Control-D at the prompt; the Control-D will cause
the shell to detect the end-of-file for the shell input.
Phase 1: Simple commands
You can implement features in whatever order you prefer, but our suggested order may help structure your work.
First, support simple commands like “echo foo
” by changing command::run
and run_list
. You’ll use at least the following system calls: fork
,
execvp
, and waitpid
. For guidance, consult:
- the man pages for
fork
,execvp
, andwaitpid
- the sections of the Process control notes that discuss fork, execvp, and
waitpid
- the function definitions in
sh61.hh
- the line notes for
command::run
andrun_list
insh61.cc
Testing note. In this pset, you will both parse parts of a command line, and make
run_list
and friends execute that command line. Most of your bugs will involve problems withrun_list
,command::run
, and helper functions you write, but it’s also possible to misuseshell_parser
—or to have undefined behavior. So, make sure you think through your testing strategy. Consider usingstrace
to trace the operation of your shell, for example! Here are some notes about how.
You may also need to exit a forked copy of the shell (for example, if execvp
fails). To exit a child process, call the _exit
system call. For instance,
call _exit(1)
or, equivalently, _exit(EXIT_FAILURE)
to exit with status 1.
Your code for this problem set should never call the
exit
function.exit
(without the underscore) is a libc library call, not a system call;exit
performs cleanup actions (e.g., messing with open stdio files) that shouldn’t happen in child processes—these actions should only happen in the parent shell. If you callexit
instead of_exit
from child processes, you may see weird behavior where, for example, the parent shell re-executes parts of a command file.
Run make check-simple
to test your work.
Phase 2: Command lists
Support foreground command lists, which are chains of commands linked by ;
(and optionally terminated by ;
). The semicolon runs two commands in
sequence—the first command must complete before the second begins.
This phase will require changes to run_list
, at least. You will start using
more of the functionality of shell_parser
, specifically
shell_parser::first_conditional
and shell_parser::next_conditional
.
Run make check-list
to check your work.
About
shell_parser
The
shell_parser
object understands the shell grammar and offers functions that navigate command lines by conditional, by pipeline, and by command. You’ll use more of its functions in almost every phase.Each
shell_parser
is associated with a contiguous region of the command line. Functions likenext_conditional
andnext_command
advance to the next region of a given type. To navigate within a region—for instance, to navigate over the conditionals in a command line—use functions likefirst_conditional
, which return sub-parsers specialized to the parent region.The important functions of
shell_parser
are:
- Truthiness test (
if (parser)
orwhile (parser)
): Returns true unless parser has reached the end of its parent region.- Tokens:
first_token()
returns ashell_tokenizer
iterating over the tokens in the region. Useshell_tokenizer::type()
,str()
, andnext()
to analyze the returned tokens.- Next operation:
op()
returns the token type, likeTYPE_SEQUENCE
orTYPE_AND
, of the next token within the parent region. (At the end of the parent region, it returnsTYPE_SEQUENCE
.) For example, in the conditionala && b || c
, theop()
s after the three pipelines areTYPE_AND
,TYPE_OR
, andTYPE_SEQUENCE
.- Navigate by command:
first_command()
returns a sub-parser for the first command in the current region;next_command()
advances this parser to the next command within its parent region.- Navigate by pipeline or conditional: Similarly,
first_pipeline()
andnext_pipeline()
navigate by pipelines, andfirst_conditional()
andnext_conditional()
navigate by conditionals.But the best way to see this work is by example.
Simplicity note. As you add more features to your shell, your code will naturally get more complicated. But try to resist unnecessary complication. Don’t copy whole blocks of code—avoid this pattern:
if (feature 1 is enabled) { step 1 of child process creation; step 2 of child process creation; step 3a of child process creation; step 4 of child process creation; step 5a of child process creation; } else if (feature 2 is enabled) { step 1 of child process creation; step 2 of child process creation; step 3b of child process creation; step 4 of child process creation; step 5b of child process creation; } else { step 1a of child process creation; step 2 of child process creation; step 3c of child process creation; step 4b of child process creation; step 5a of child process creation; }
Instead, aim for something like this:
step 1 of child process creation; step 2 of child process creation; if (feature 1 is enabled) { step 3a of child process creation; } else { step 3b of child process creation; } step 4 of child process creation; if (feature 1 is enabled) { step 5a of child process creation; } else if (feature 2 is enabled) { step 5b of child process creation; } ...
The second style is more robust to changes in requirements, because adding another feature requires local changes, as opposed to many distributed changes across slightly-different blocks of code.
Our solutions, with all features enabled, have two lines that call
fork()
, two lines that callwaitpid()
, and one line that callsexecvp()
.
Phase 3: Conditionals
Support conditionals, which are chains of commands linked by &&
and/or
||
. Each of these operators runs two commands, but the second command is run
conditionally, based on the status of the first command. For example:
$ true ; echo print
# The second command always runs, because ';' is an
# unconditional control operator.
print
$ false ; echo print
print
$ true && echo print
# With &&, though, the 2nd command runs ONLY if
# the first command exits with status 0.
# (WIFEXITED(status) && WEXITSTATUS(status) == 0)
print
$ false && echo print
# (prints nothing)
$ true || echo print
# With ||, the 2nd command runs ONLY if the first
# command (1) DOES NOT exit, or (2) exits with status
# nonzero. || is the opposite of &&.
# (prints nothing)
$ false || echo print
print
The &&
and ||
operators have higher precedence than ;
and &
, so
a command list can contain many conditionals. &&
and ||
have the
same precedence and they associate to the left. The exit status of a
conditional is taken from the last command executed in that conditional.
For example, true || false
has status 0 (the exit status of true
)
and true && false
has exit status 1 (the exit status of false
).
To support conditionals, you’ll probably make changes to run_list
and
struct command
.
You’ll also use the WIFEXITED
and
WEXITSTATUS
macros defined in man waitpid
.
Run make check-cond
to check your work.
Phase 4: Pipelines
Support pipelines, which are chains of commands linked by |
. The pipe
operator |
runs two commands, connecting the standard output of the left
command to the standard input of the right command. Unlike conditionals and
lists, the commands in a pipeline run in parallel. The shell starts all
the pipeline’s commands concurrently.
The |
operator has higher precedence than &&
and ||
, so a conditional
can contain several pipelines. The exit status of a pipeline equals the exit
status of its last command.
If a foreground pipeline has two or more commands, your shell must wait at least for the last command to complete. Normal shells, such as Linux
bash
, wait for all commands in the pipeline to complete before moving on (though only the last command’s status matters); your shell may wait for these other commands to complete, but it is also OK to just wait for the last command.
To support pipelines, you’ll need to use some new system calls—namely pipe
,
dup2
, and close
—and change command::run
, run_list
, and struct command
.
See the Process control notes that discuss pipe and pipe in a shell.
Run make check-pipe
to check your work.
Phase 5: Background
Now add support for background commands, such as sleep 1 &
. A command or
conditional chain is “backgrounded” when the parent shell does not wait for it
to complete. Thus, the parent shell can accept and run new command lines even
as background commands execute.
Backgrounding will require changes to run_list
. The &
background operator has the same
precedence as ;
.
Depending on your command representation, backgrounding support may also
require your command::run
function look ahead into later command structures;
see section.
You must support simple background commands, such as sleep 10 &
, and you
must support background conditional chains, such as sleep 10 && echo foo &
.
Run experiments to see how background conditional chains work. For instance,
try:
$ sleep 10 && echo foo & echo bar
You can also use strace
to explore how the default Linux shell handles
background conditional chains. Try:
$ strace -e trace=%process -o strace.out -f sh -c "sleep 10 && echo foo & echo bar"
See our testing notes for information about
process-related strace
s.
Run make check-bg
to test your work.
Background conditional hints
Every conditional chain must be managed by a shell process that collects command exit status values and implements logic for
&&
and||
. But the parent shell process cannot manage a background conditional chain! Background conditional chains execute in parallel, alongside the parent shell (which is busy doing other things). This means a new shell process is required to manage the background conditional chain.Can you think of a convenient system call that can create a clone of the parent shell process, so that the clone could manage a background conditional chain? We sure hope so.
Cloned shell processes, often called subshells, are useful for many advanced shell features, including background conditional chains. A subshell should handle only a limited set of commands and then
_exit
; otherwise it would interfere with the parent shell. Here, a subshell used for a background conditional chain should_exit
after that conditional chain completes.
Phase 6: Zombie processes
Your shell should eventually reap all its zombie processes using
waitpid
. The waitpid
man page contains some helpful information on zombie processes.
Hint: You must reap all zombies eventually, but you don’t need to
reap them immediately. We don’t recommend using signal handlers to
reap zombies, since a signal handler can interfere with the waitpid
calls used to wait for foreground processes to complete. A well-placed
waitpid
loop will suffice to reap zombies; where should it go?
Run make check-zombie
to test your work.
Phase 7: Redirections
Now support redirections, where some of a command’s file descriptors are sent to disk files. You must handle three kinds of redirection:
< filename
: The command’s standard input is taken fromfilename
.> filename
: The command’s standard output is sent tofilename
.2> filename
: The command’s standard error is sent tofilename
.
For instance, echo foo > x
writes foo
into the file named x
.
shell_token_iterator
represents redirect operators, such as <
, >
and 2>
, as
type TYPE_REDIRECT_OP
. Each redirection operator
must be followed by a filename, which is a TYPE_NORMAL
token. You’ll also
change command::run
to set up the redirections, using system calls open
,
dup2
, and close
.
The shell sets up a command’s redirections before executing the command. If a redirection fails (because the file can’t be opened), the shell doesn’t actually run the command. Instead, the child process that would normally have run the command prints an error message to standard error and exits with status 1. Your shell should behave this way too. For example:
$ echo > /tmp/directorydoesnotexist/foo
/tmp/directorydoesnotexist/foo: No such file or directory
$ echo > /tmp/directorydoesnotexist/foo && echo print
/tmp/directorydoesnotexist/foo: No such file or directory
$ echo > /tmp/directorydoesnotexist/foo || echo print
/tmp/directorydoesnotexist/foo: No such file or directory
print
How to figure out the right error message? Try man strerror
!
Hint: Your calls to open
will have different arguments depending on what
type of redirection is used. How to figure out what those arguments are? Well,
you could use the manual page (man 2 open
); or you could use strace
to check the regular
shell’s behavior. Try this:
$ strace -o strace.txt -f sh -c "echo foo > output.txt"
Look in strace.txt
. Which flags were provided to open
(or a variant, such
as openat
) for output.txt
? Try this with different redirection types.
Run make check-redir
to test your work.
Phase 8: The cd
command
Finally, your shell should support the cd directory
command. The cd
command is special; why?
Run make check-cd
to test your work.
(You must support cd
commands that are part of conditionals, as in cd foo && echo bar
, but you do not need to support cd
commands that are part of
a multi-process pipeline.)
Checking your work
Use make check
to check your work; make SAN=1 check
to check your work
with sanitizers; and make SAN=1 LSAN=1 check
to check your work with
sanitizers and memory leak detection.
You may also run make check-TEST
to run a specific test,
or (for example) make check-simple
to run all the SIMPLE
tests. As always,
it can be great to create your own tests!
Extra credit
There are numerous ways you can extend your shell. For instance, you can add support for:
Interruption. See our shell interruption notes.
Complex redirections. Our parsing code understands more redirections than your code is required to support. Add support for more redirections.
Subshells. A subshell adds the following production to the grammar:
command ::= "(" list ")" [redirection]...
This executes the list
in a grouped subshell—that is, a child shell
process. All the commands in the subshell may have their file
descriptors redirected as a group.
Variable substitution. See here.
Control structures. Design and implement analogues of the if
,
for
, and while
control structures common to many programming
languages. For example, your if
structure should execute several
commands in order, but only if some condition is true—for example, only
if a command exits with status 0.
Shell functions. Design and implement a way for shell users to write
their own “functions.” Once a function f
is defined, typing f
at the
command line will execute the function (rather than searching for an executable
named f
). For example, the user might write a function echo_twice
that
printed its arguments twice, by running the echo
command twice. Discuss
how other command line arguments will be passed to the shell function.
Or anything else that strikes your fancy. Read up about existing shells (bash, zsh, dash, Windows PowerShell, etc.) for ideas.
Turnin
Hand in your code by editing README.md
and AUTHORS.md
, committing
your changes, and pushing the result to GitHub. Please indicate which commit is your final submission through your commit message and/or a flag on the grading server.
Common shell utilities
Here are some commonly-installed, commonly-used programs that you can call
directly from the shell. You may find them useful for testing. Documentation
for these programs can be accessed via the man
page, e.g. man cat
, or
often also through a help switch, e.g. cat --help
.
Shell Program | Description |
---|---|
cat |
Write named files (or standard input) to standard output. |
wc |
Count lines, words, and characters in named files (or standard input). |
head -n N |
Print first N lines of standard input. |
head -c N |
Print first N characters of standard input. |
tail -n N |
Print last N lines of standard input. |
echo ARG1 ARG2... |
Print arguments to standard output. |
printf FORMAT ARG... |
Print arguments with printf-style formatting. |
true |
Always succeed (exit with status 0 ). |
false |
Always fail (exit with status 1 ). |
sort |
Sort lines in input. |
uniq |
Drop duplicate lines in input (or print only duplicate lines). |
tr |
Change characters; e.g., tr a-z A-Z makes all letters uppercase. |
ps |
List processes. |
curl URL |
Download URL and write result to standard output. |
sleep N |
Pause for N seconds, then exit with status 0 . |
cut |
Cut selected portions of each line of a file. |
grep PATTERN |
Print lines in named files (or standard input) that match a regular expression PATTERN . |
tac |
Write lines in named files (or standard input) in reverse order to standard output. |
This lab was originally created for CS 61, but every course has its own shell lab.