Introduction
Today's section:
- Reviews
exec
,fork
, anddup2
system calls - Covers piping
Pull the latest version of the sections repository by entering in the terminal
git pull
The cs61-section/s11
directory contains the files for the exercises
throughout this section.
System Calls
fork
pid_t fork(void);
- Creates a new child process
- Copies all the memory of the parent to the child
- Note that in the virtual memory assignment, you will copy all user-accessible, writeable memory in the parent to the child. Most OSs use copy-on-write, where all writeable memory is marked as readonly in both parent and child, and then copied when either parent or child attempts to write to it.
- Copies the parent’s file descriptor table to the child. This means the child has the same files open as the parent. Also, these open files share state, so if the parent seeks in one of these open files, the seek will affect both parent and child.
- Starts child process and returns 0.
- Resumes parent process and returns the PID of the child.
Try to create a basic conditional skeleton for fork
before looking
at the one below.
exec
int execvp(const char \*file, char \*const argv[]);
- Takes in a file (either a path to a binary like “./prog” or something like “ls”, which it will look the path to), and a NULL terminated array of arguments to the program (like main’s argv, with argv[argc] (the end of the array) == NULL)
- Executes the given program, replacing the current memory of the calling program with the memory for the given program
- On success, execvp will not return, because the calling program “turns into” the program being executed.
If a command is passed as an argument to the program, what do we add
to our skeleton to run that command in a child process? Assume that we
are only passing a command with no argument (e.g. ls
but not wc -c asdf.txt
)
waitpid
pid_t waitpid(pid_t pid, int \*status, int options)
- Simple case: waitpid(pid, NULL, 0);
- Returns when the process with process id pid exits
- We can pass in various values in as the last argument to vary the behavior of waitpid. A flag of particular interest may be WNOHANG, which allows waitpid to return immediately without waiting.
- We can also pass in an int status by reference to get various information about the process after it exits. For instance the following will print the child’s exit status (i.e. the return value of main or the number passed to an exit() call:
int status;
waitpid(pid, &status, 0);
// make sure it exited normally (vs e.g. being killed)
if (WIFEXITED(status))
printf(“%d\n”, WEXITSTATUS(status));
Make the parent in our skeleton wait for the child process to finish running and then print "done"
dup2
int dup2(int oldfd, int newfd);
- Makes newfd a copy of oldfd (closing the current newfd if necessary)
- i.e. reading and writing from newfd will read and write from the file referred to by oldfd.
What if we want to redirect the stdin and stdout of the program we are
running in the child process to use an infile and outfile that we pass
as arguments to our program? (e.g. ./prog wc in.txt out.txt
should
output the character, word, and line counts of in.txt to the file
out.txt)
Piping
Every pipe has a 'read' end and a 'write' end. Before you implement your pipe, make sure you know which side of the pipe is being used in each process!
pipe System Call
int pipe(int pipefd[2]);
- Accepts an uninitialized array of 2 ints
- After calling pipe, the first element of the array pipefd[0] will be the “read” end of the pipe, and the second element pipefd[1] will be the “write” end of the pipe. So anything written to pipefd[1] can then be read from pipefd[0].
Instead of an infile and an outfile, we want to take text that will be passed to the parent and pipe it to the stdin for the child process.
Exercise: Process Management
Here's the skeleton of a shell function implement a simple two-command
pipeline, such as cmd1 \| cmd2
void simple_pipe(const char* cmd1, char* const* argv1, const char* cmd2, char* const* argv2) {
int pipefd[2], r, status;
[A]
pid_t child1 = fork();
if (child1 == 0) {
[B]
execvp(cmd1, argv1);
}
assert(child1 > 0);
[C]
pid_t child2 = fork();
if (child2 == 0) {
[D]
execvp(cmd2, argv2);
}
assert(child2 > 0);
[E]
}
And here are some system calls:
[1] close(pipefd[0]);
[2] close(pipefd[1]);
[3] dup2(pipefd[0], STDIN_FILENO);
[4] dup2(pipefd[0], STDOUT_FILENO);
[5] dup2(pipefd[1], STDIN_FILENO);
[6] dup2(pipefd[1], STDOUT_FILENO);
[7] pipe(pipefd);
[8] r = waitpid(child1, &status, 0);
[9] r = waitpid(child2, &status, 0);
Your task is to assign system call IDs, such as "1", to slots, such as "A", to achieve several behaviors, including a correct pipeline and several incorrect pipelines.
- Implement a correct foreground pipeline.
- Implement a pipeline so that, given arguments corresponding to
echo foo \| wc -c
, the wc process reads “foo” from its standard input but does not exit thereafter. - Implement a pipeline so that, given arguments corresponding to
echo foo \| wc -c
, “foo” is printed to the shell’s standard output and the wc process prints “0”. (In a correctly implemented pipeline,wc
would print 4, which is the number of characters in “foo\n”.) - Implement a pipeline that appears to work correctly on
echo foo \| wc -c
, but always blocks forever if the left-hand command outputs more than 65536 characters - Implement a pipeline so that, given arguments corresponding to
echo foo \| wc -c
, bothecho
andwc
report a “Bad file descriptor” error. (This error, which corresponds to EBADF, is returned when a file descriptor is not valid or does not support the requested operation.)
Exercise: The Sieve of Eratosthenes
Today's exercise will walk you through building a pretty cool implementation of a prime number generator. The algorithm that we'll use is called the Sieve of Eratosthenes.
Before we start coding, let's get a feel for how the algorithm works.
Write down the numbers 2 through 25, in order. Repeat the following steps until all numbers are either circled or crossed out.
1. Circle the first number that is neither circled nor crossed out.
2. Cross out all numbers after that number that are multiples of the number you just circled (i.e., when you circle 2, you will cross out all the even numbers).
You should be left with all the prime numbers less than 25 circled.
Now we'll build a program to do this (somewhat inefficiently, but in a particularly entertaining manner, giving you practice with fork, exec, and pipe.
seq: a good start
You'll find that there is a program on your system called seq
that
generates a stream of numbers. Check out the man page for seq
and
prove to yourself that you can create a stream of integers that we use
in our sieve program.
filtermultiples: remove multiples
The first step of the sieve is removing multiples of the first prime, 2.
filtermultiples
is a program that takes in a number as an argument
for which it will remove all multiples of that number from stdin. For
example, running ./filtermultiples 2
and then typing 2 3 4 5 6 7
(pressing enter after each number) will yield
3
5
7
because we are removing all multiples of 2 from the arguments 2 3 4 5 6 7
.
We have provided a skeleton in filtermultiples.c
that will read each
number in one by one. Your job is to write the rest of
filtermultiples.c
so that it outputs the subset of numbers that are
not multiples of the first number read in.
Let's say we want to output all the numbers from 2 to 1000000 that are
not multiples of 2. We don't want to type out every number from 2 to
1000000, so we can use seq
to pipe the numbers 2 through 1000000 to
./filtermultiples 2
and output all of the numbers that are not
divisible by 2 as follows
seq 2 1000000 | ./filtermultiples 2
What if we want to remove all multiples of 2 and 3?
seq 2 1000000 | ./filtermultiples 2 | ./filtermultiples 3
We can keep piping with subsequent primes to remove more and more composite numbers.
Make sure you understand why this works before moving on
sieve: building a simple sieve
Our sieve should output all prime numbers from 2 to 1000000. We have
included a skeleton in sieve.c
that already gets the numbers from 2
through 1000000 for you. It also has the basic sieve structure for
taking the first number that is not a multiple and assigning it as the
next prime number. Your job is to write the function add_filter
that
will take a stream of numbers that are not multiples of the last prime
and output a stream of numbers that are not multiples of the new prime.
Hint: you will need to use pipe
, fork
, dup2
, and exec
!
Think about how you want to redirect stdin and stdout and where you want
them to be redirected.
After writing add_filter
, running ./sieve
should output all
prime numbers from 2 through 1000000!