Problem set 4 – CS 61 2023

In this problem set, you’ll gain experience with caching by writing your own buffered I/O library.

5 pt intermediate check-in: phases 1-4 done by Fri 11/3 (please name your checkin commit "checkin".)
Due TUESDAY 11/7 11:59pm EDT (please indicate which commit is your final submission through your commit message and/or a flag on the grading server.) Note that you have less time for this problem set than you did for previous ones!
Reminder: You may not use code or solutions from previous years.

Get the code

Get our code with

$ git pull; git pull handout main

or, alternately

$ git pull; git pull https://github.com/cs61/cs61-f23-psets.git main

This will merge our Problem Set 4 code with your previous work. If you have any “conflicts” from prior problem sets, resolve them before continuing further. Run git push to save your work back to your personal repository.

You may also create a new cs61-psets repository for this assignment. Don’t forget to enter your repository URL on the grading server.

Goal

Our simple IO61 library performs I/O on files. You will find the code in the pset directory in the file io61.cc. Our version of IO61 is pretty stupid—it uses byte-at-a-time system call I/O and is thus quite slow. Your goal is simple: speed it up using caching.

We also distribute several programs that use IO61. (The grading server has some secret extra programs too, and it has tests that use our handout programs in different ways.)

cat61 reads a file sequentially one byte at a time and writes it to standard output.
blockcat61 is like cat61, but reads the file sequentially one block at a time. Its -b BLOCKSIZE argument sets the block size.
randblockcat61 is like blockcat61, but it chooses a random block size for each read and write. It still reads and writes sequentially (only the block sizes are random). The -b argument is the maximum possible block size. This may very well stress your caching solution.
reverse61 reads a file a byte at a time in reverse and writes the result to standard output.
stride61 reads its input file using a strided access pattern, writing the data it reads sequentially. By strided, we mean that it begins at offset 0, reads BLOCKSIZE bytes, seeks to position STRIDE, reads another block, seeks to position (2 * STRIDE), etc. When it gets to the end of the file, it jumps back to the first bytes not yet read and repeats the pattern. The -b and -s arguments set the block size and stride length, respectively.
reordercat61 reads its input file one block at a time in random order and writes each block to the output file at the correct position. The result is a copy of the input file, but created non-sequentially.
scattergather61 can take two or more input files and two or more output files. It reads blocks (or lines) from the input files in round-robin order, and writes them to the output files in round-robin order. This will test your ability to handle multiple input and output files.

And there are others.

You will introduce caching to the io61_file abstraction and use your cache to speed up IO61 operations. We’re giving you tons of freedom to implement the cache as you like. You may even use memory-mapped I/O*, prefetching system calls like madvise or posix_fadvise, or multiple threads or processes (although none of these are required). But you may not use another buffered I/O library or caching I/O library.

Your library should:

Return correct results for all access patterns and file types, including multiple files, very large files, non-seekable files, files that return short reads or accept short writes, mixtures of calls (for instance, alternating io61_read and io61_readc), and random accesses.
Perform as well as stdio on sequential reads and writes for any files, including very large files (such as files that don’t fit in memory) and non-seekable files.
Improve on stdio’s performance for some non-sequential access patterns, such as simple strides or reverse access.

Code will be graded based on performance on the grading server.

All your code for the main pset should fit in io61.cc.

*Note that memory-mapped I/O is not always possible, so your old block cache should be used for pipe files and other non-mappable files. This should make your code beat stdio on some non-sequential I/O tests!

Evaluation

We will evaluate you based on your code’s performance relative to a version of IO61 that uses stdio, as well as relative to the performance of other students' code. That version of IO61 that uses stdio is provided for you in stdio-io61.cc. (You can build a stdio test with make stdio-cat61 and so forth.)

Run make check to check your current implementation on a battery of tests and print summary statistics at the end. The tests are divided into groups:

The correctness tests (tests C1, C2, etc.) stress-test your code for common errors. Performance on these tests is not important. As always, add assertions to your code, and we’ll give extra credit for tests of your own.
The medium-file sequential performance tests (tests MSEQ*) perform sequential I/O on 5MiB files.
The medium-file non-sequential performance tests (tests MNONSEQ*) perform non-sequential I/O (e.g., reverse, random, or stride) on 5MiB files.
The large-file sequential performance tests (tests LSEQ*) perform sequential I/O on 64MiB files.
The large-file non-sequential performance tests (tests LNONSEQ*) perform sequential I/O on 64MiB files.
The secret tests (tests SECRET*) are only on the grading server.

Run make check-TESTS (e.g., make check-mseq or make check-mseq1-3) to run selected tests.

We may update the tests during the pset release period. We may also run additional tests during grading. In all instances, correctness matters most. A program that runs quickly but incorrectly is worse than a program that runs slowly but correctly.

This problem set is worth 105 points.

5 pts are given for the intermediate check-in.
78 pts are given based on your performance relative to stdio.
12 points are “mastery” points, which are given based on your performance relative to other students. You get mastery points for a test when 2 or more of your runs for that test are in the fastest 10% of runs over the class.
10 points are for style.

Roadmap

It’s easy in this problem set to design something too complex and get yourself stuck. Avoid that problem by tackling the simple cases first. Here’s a recommended roadmap.

Phase 1

You can improve the way io61_read and io61_write work without introducing a cache. Use strace to investigate the operation of blockcat61. Look at the blockcat61 code; what system calls do you think it should make? Then run strace -o strace.out ./blockcat61 blockcat61.cc. (That command line tells strace to write its output to strace.out, so examine strace.out with a command like less strace.out or through VS Code. If you're having trouble running strace sure that you're in Docker and that you've previously run the command make.) What system calls does blockcat61 actually make?

This will improve your performance on the block I/O tests (e.g., make check-mseq2). Also check that your code still passes the correctness tests (make check-c).

As you work on improving the performance of your code on sequential access patterns, you may want to ignore non-sequential access patterns. This will mean your code will produce incorrect results for non-sequential tests like C12, C13, and C14. That’s OK as long as you fix it later!

Phase 2

Implement a single-slot cache buffer for sequential reads. This will hold bytes [N, N+B) of the file, where B is some largish number (try different numbers). Any read request that lies within that range of bytes can be satisfied without making a system call. Check your work by examining strace output.

This will improve your performance on byte I/O tests (e.g., make check-mseq1).

After you've implemented your cache buffer for reading, consider a special case for reading one character at a time from your cache in io61_readc. Can we avoid calling io61_read if we only want to read a single byte?

Phase 3

Extend your single-slot cache buffer to also support sequential writes.

This will improve your performance on byte I/O tests. If you do this well, you should now match or beat stdio’s performance on all sequential I/O tests!

Phase 4

Fix your code to handle seeks correctly (but not necessarily in a high-performance way).

This should make your code produce correct results for all tests.

Phase 5

Change your code to continue handling seeks correctly, but with good performance for reverse61. For ideas, try running strace on the stdio-reverse61 variant. What does it look like stdio is doing?

This should make your code both roughly equal stdio’s performance and produce correct results on all tests.

Phase 6

Finally, it’s time to outperform stdio on some tests! Perhaps a bigger cache buffer will outperform stdio; perhaps memory-mapped files will help. If you have worked through the previous phases, you will have ideas of your own.

Hints and troubleshooting

Read our file descriptors notes if you are confused about file descriptor system calls.

You will likely need to change most of the functions in io61.cc.

Stdio is a well-written package! It is OK if you can’t always beat it, especially for sequential I/O.

Write your own tests! This will help you shake out bugs in your code, particularly correctness bugs. What diabolical things can you think of to try? You can add new tests pretty easily; just edit GNUmakefile and check.pl.

If a program using your library produces incorrect results, it can be hard to figure out whether the problem is with reads, writes, or both. For instance, perhaps your read cache is wrong, so the program is reading incorrect data; or perhaps the write cache is wrong, so the the program is trying to write correct data, but the data passed through to the kernel via system calls is wrong. To distinguish these scenarios, you can use strace to check things like file positions. You can also use the read61, write61, blockread61, and blockwrite61 programs. These behave like cat61 and blockcat61, but they use stdio for half their tasks. The read variants use IO61 for reading and stdio for writing, whereas the write variants use IO61 for writing and stdio for reading. If cat61 and write61 produce bad results but read61 on the same workload produces correct results, you can be pretty sure the problem is with writes.

You may make a couple assumptions:

You may assume that any file descriptor passed to io61_fdopen has its file pointer set to 0 (the beginning of the file).
You don’t need to set the file pointer after every call to io61_seek.

Add assertions in your code, since they can help you check correctness. However, assertions do take time to check, which could leave your code at a disadvantage relative to stdio. We will check the performance of your code with assertions disabled by running make NDEBUG=1 check-TEST.

If your code is slow, consider (1) the system calls it makes and (2) its CPU performance. For system calls, use strace to check: is a mistake in your code accidentally causing expensive system calls, such as many read calls that read one byte each? For CPU performance, consider whether there are functions that can avoid some extra work. It might pay off to have specialized io61_readc and io61_writec calls, for example.

Make targets

Your code must work correctly with sanitizers turned on (though it will not be very fast). Run make SAN=1 check to check for errors.

To get an strace log for your code on a specific test, use make STRACE=1 check-TEST. The strace output will be stored in strace.out.

To run without running the stdio tests, use make NOSTDIO=1 check.

To run with one trial rather than multiple trials, use make TRIALS=1 check.

To compile without optimization (which might help you debug), run make O=0.

To disable assertions, use make NDEBUG=1 check.

Extra Credit

More tests. Write your own tests that cover access patterns or scenarios that are different from our tests.
Associative cache. If you have additional time and feel ambitious, you can implement an associative cache that holds multiple blocks and works for (a) non-mappable files or (b) very large memory-mapped files. Something like this will get the highest performance. It is difficult and rewarding to implement an associative cache. You’ll learn a lot! But design carefully so you don't make your other code more difficult to read and maintain. In fact, you might try working on a separate branch initially.

Turnin

You will turn in your code by pushing your git repository to github.com and updating the grading server with your repository.

Please flag the commit on the grading server that you'd like to be graded.

Don’t forget to fill out README.md and AUTHORS.md.

This pset was originally created for CS61.

Problem set 4: Stdio