From CS61
Jump to: navigation, search

11/10: The Sieve of Eratosthenes

Today's exercise will walk you through building a pretty cool implementation of a prime number generator. The algorithm that we'll use is called the Sieve of Eratosthenes.

Before we start coding, let's get a feel for how the algorithm works.

Write down the numbers 2 through 25, in order. Repeat the following steps until all numbers are either circled or crossed out.

1. Circle the first number that is neither circled nor crossed out.

2. Cross out all numbers after that number that are multiples of the number you just circled (i.e., when you circle 2, you will cross out all the even numbers).

You should be left with all the prime numbers less than 25 circled.

Now we'll build a program to do this (somewhat inefficiently, but in a particularly entertaining manner, giving you practice with fork, exec (maybe not), and pipe.

seq: a good start

You'll find that there is a program on your system called seq that generates a stream of numbers. Check out the man page for seq and prove to yourself that you can create a stream of integers that we'll feed into our sieve program.

sieve: building a simple sieve

Our sieve should read the first number it gets on stdin, output that number to stderr (it should be a prime number), and then repeatedly read a number from stdin and if that number is not a multiple of that first number output it to stdout. For example, given an input stream of "2 3 4 5 6 7 8 9" your program should read the 2, and output any numbers that are not a multiple of 2, producing "3 5 7 9".

We placed the skeleton of a program in sieve.c in cs61-exercises/shell3x. (It contains all the includes that you'll need for the rest of the exercise.) Complete the code in main to process standard in as described above. (You should be able to use scanf/printf to read and write the stream.)

When this is working, you should get something like this:

shell3x% seq 2 10 | ./sieve
2 3 5 7 9

The 2 should have been written to stderr while everything else should have been written to stdout. You can verify that by doing:

shell3x% seq 2 10 | ./sieve > xxx

In this case, you should see the 2 print out (because we have not redirected stderr, but 3 5 7 9 should appear in the file xxx.

Notice that you can pipe sieves together:

shell3x% seq 2 10 | ./sieve | ./sieve 
2 3 5 7

It would be pretty tedious if we had to keep manually adding stages to our pipe to produce new prime numbers...

Letting fork and exec do the work for you

Modify your current sieve program so that after it reads the first prime number and before it begins processing the rest of the input stream, it forks and execs a new copy of sieve whose stdin is connected to the stdout of the current process. Hint: This is exactly what the programs pipe2 and pipe3 from today's lecture do.

If you do this correctly, you will automatically create a pipeline with as many processes as there are prime numbers in the sequence of numbers you generate using seq. Each process generates one prime (to stderr) and then filters all subsequent multiples of that prime.

Beware: If you do not properly check that you have data in your input stream, it's possible to inadvertently create a forkbomb (so you might want to do this in your virtual machine ...).

Letting fork do this without exec

You may have noticed that you are calling exec on the exact same program you are running. Given that, it seems that perhaps we should be able to implement this using only fork. Indeed you can. Give that a try. It's a bit trickier, due to the fact that we create pipes using file descriptors, but if you are using stdio, there is data cached that doesn't disappear when you dup file descriptors.

You might find the following man pages useful:

man fdopen
man fpurge

When you're done

Please complete this survey

Solution video