11/10: The Sieve of Eratosthenes
Today's exercise will walk you through building a pretty cool implementation of a prime number generator. The algorithm that we'll use is called the Sieve of Eratosthenes.
Before we start coding, let's get a feel for how the algorithm works.
Write down the numbers 2 through 25, in order. Repeat the following steps until all numbers are either circled or crossed out.
1. Circle the first number that is neither circled nor crossed out.
2. Cross out all numbers after that number that are multiples of the number you just circled (i.e., when you circle 2, you will cross out all the even numbers).
You should be left with all the prime numbers less than 25 circled.
Now we'll build a program to do this (somewhat inefficiently, but in a particularly entertaining manner, giving you practice with fork, exec (maybe not), and pipe.
seq: a good start
You'll find that there is a program on your system called seq
that
generates a stream of numbers. Check out the man page for seq
and
prove to yourself that you can create a stream of integers that we'll
feed into our sieve program.
sieve: building a simple sieve
Our sieve should read the first number it gets on stdin
, output that
number to stderr
(it should be a prime number), and then repeatedly
read a number from stdin
and if that number is not a multiple of
that first number output it to stdout
. For example, given an input
stream of "2 3 4 5 6 7 8 9" your program should read the 2, and output
any numbers that are not a multiple of 2, producing "3 5 7 9".
We placed the skeleton of a program in sieve.c
in
cs61-exercises/shell3x
. (It contains all the includes that you'll need
for the rest of the exercise.) Complete the code in main to process
standard in as described above. (You should be able to use
scanf/printf
to read and write the stream.)
When this is working, you should get something like this:
shell3x% seq 2 10 | ./sieve
2 3 5 7 9
The 2 should have been written to stderr
while everything else should
have been written to stdout
. You can verify that by doing:
shell3x% seq 2 10 | ./sieve > xxx
In this case, you should see the 2 print out (because we have not
redirected stderr
, but 3 5 7 9 should appear in the file xxx.
Notice that you can pipe sieves together:
shell3x% seq 2 10 | ./sieve | ./sieve
2 3 5 7
It would be pretty tedious if we had to keep manually adding stages to our pipe to produce new prime numbers...
Letting fork and exec do the work for you
Modify your current sieve program so that after it reads the first prime
number and before it begins processing the rest of the input stream, it
forks and execs a new copy of sieve whose stdin
is connected to the
stdout
of the current process. Hint: This is exactly what the programs
pipe2
and pipe3
from today's lecture do.
If you do this correctly, you will automatically create a pipeline with
as many processes as there are prime numbers in the sequence of numbers
you generate using seq
. Each process generates one prime (to stderr
)
and then filters all subsequent multiples of that prime.
Beware: If you do not properly check that you have data in your input stream, it's possible to inadvertently create a forkbomb (so you might want to do this in your virtual machine ...).
Letting fork do this without exec
You may have noticed that you are calling exec
on the exact same
program you are running. Given that, it seems that perhaps we should be
able to implement this using only fork
. Indeed you can. Give that a
try. It's a bit trickier, due to the fact that we create pipes using
file descriptors, but if you are using stdio
, there is data cached
that doesn't disappear when you dup file descriptors.
You might find the following man pages useful:
man fdopen
man fpurge
When you're done
Please complete this survey