Building a Read Cache
- Explain how read performance changes in the presence of a cache
- Observe the effects of the different caches available in libraries and the operating system
- Build a simple read cache.
- Comfortably tackle Assignment 3
On Thursday you observed the effects that caching has when we write files. Today, we will first observe the effects of caching on reads, and then we'll build our own simple cache. The techniques and even some of the code you write today should help you get started with Assignment 3.
Today's code is in cs61-exercises/l10. When you first type
make you should see all the r0N benchmarks build, but
you'll get complaints that r01-cache and randcache don't build. That's fine.
You will find a collection of read benchmarks named r01-r09. Each one will read
data and print out performance numbers. (Ignore
randread for now.)
Use diff to determine how the benchmarks differ.
(You will undoubtedly want to compare byte benchmarks to byte benchmarks and
block benchmarks to block benchmarks.)
Then predict how the performance of each benchmark will compare to the others
given the differences you've observed -- force yourself to make comparisons between
different variations of byte benchmarks, between byte and block benchmarks, and between
any others where you might reasonably be able to form a decent hypothesis.
Write down your predictions in a file named analysis.txt.
Finally, run each benchmark, recording the final performance.
IMPORTANT: You should run
sudo ./dropcaches before each test.
This program causes the operating system to clear out its file cache, so that
each benchmark starts in the same state. What do you suppose would happen if
you didn't do this? (Put that answer in your
analysis.txt file too.)
Based on the results you obtained above, formulate hypotheses about how the stdio and file system buffer caches work. (You need to write only a couple of sentences, but indicate which benchmark results you used to form your hypotheses.)
Building a Cache
Next, we're going to see if you can build a cache that will make r01-byte.c perform as well as r05-stdiobyte.c.
If you are feeling confident about all this material and want a real challenge,
stop reading here and go to town building a cache.
The only change you may make to r01-byte.c is to change the
read(fd, buf, 1) and
open("data", O_RDONLY) calls
to call your routines.
OK, that's a bit of an overstatement -- you may also change the error message
and add include statements.
Your read equivalent must return the next byte in the file.
You should build your cache so that it can effectively handle the workload that r01-byte produces.
Don't worry if you don't feel prepared to tackle this without any guidance -- that's what we are here for. You will be able to do this -- just follow our advice below. Pay attention to the methodology we use walking through this implementation. At each step, we strive to have a working system that you can test to make sure it's doing what you think. We encourage you to approach your problem sets this way. All too often, it is tempting to write a large pile of code and only when you've written more code than you can possibly keep in your brain are you ready to build it and see if it works. Instead, try making small incremental changes that you can test as you go.
You will notice that we've created two files for you.
In these files, we've declared two functions,
file_cache_open and cache_file_read, that replace the
open and read system calls.
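As a rough sketch (the exact signatures here are an assumption on our part, modeled on open(2) and read(2) -- check the provided files for the real ones), the declarations and do-nothing starter bodies might look like this:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical sketch of the two provided functions, which mirror
// open(2) and read(2).

// Open `path` and set up any cache state for the returned descriptor.
int file_cache_open(const char *path, int flags);

// Like read(2), but (eventually) serviced from our cache.
ssize_t cache_file_read(int fd, void *buf, size_t count);

// The starter versions do nothing useful yet: open just forwards to
// open(2), and read always reports 0 bytes -- which is why the
// unmodified r01-cache reads 0 bytes.
int file_cache_open(const char *path, int flags) {
    return open(path, flags);
}

ssize_t cache_file_read(int fd, void *buf, size_t count) {
    (void) fd; (void) buf; (void) count;
    return 0;   // stub: no data yet
}
```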
We suggest:
- Recording the performance you get before modifying anything.
- Modifying r01-cache.c to call the caching routines instead of the system calls.
- Building r01-cache and making sure it runs (since the cache routines currently do nothing, you'll get 0 bytes read -- that's fine).
Let's start by making our caching version behave identically to the
original program. How would you do that?
Implement that and make sure you get nearly the same performance from r01-cache as you did from r01-byte.
(Don't forget to run
dropcaches before running any benchmark.)
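One way to make the caching version behave identically is to forward every call straight to the corresponding system call. A minimal sketch follows; the passthrough_selfcheck helper is our own hypothetical test, not part of the exercise:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Pass-through sketch: behave exactly like the uncached program by
// forwarding each call to the corresponding system call.
int file_cache_open(const char *path, int flags) {
    return open(path, flags);
}

ssize_t cache_file_read(int fd, void *buf, size_t count) {
    return read(fd, buf, count);
}

// Hypothetical self-check: write a known file, then read it back one
// byte at a time through the pass-through functions.
// Returns 1 on success, 0 on failure.
int passthrough_selfcheck(void) {
    char path[] = "/tmp/cacheXXXXXX";
    int wfd = mkstemp(path);
    if (wfd < 0) return 0;
    if (write(wfd, "hello", 5) != 5) return 0;
    close(wfd);

    int fd = file_cache_open(path, O_RDONLY);
    if (fd < 0) return 0;
    char out[6] = {0};
    for (int i = 0; i < 5; i++)
        if (cache_file_read(fd, &out[i], 1) != 1) return 0;
    close(fd);
    unlink(path);
    return strcmp(out, "hello") == 0;
}
```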
Now, what would we really like to happen when a program calls our cache_file_read routine?
We'll begin by trying to get
r01-cache to achieve the same performance we saw from r05-stdiobyte.
It would seem that achieving that performance is going to require that we
make fewer calls to read.
Making fewer calls to
read requires that we read more bytes each
time, and reading more bytes each time suggests that we need a place to
store bytes that we've read but not yet returned to the caller of cache_file_read.
OK, this seems to suggest that we're going to need a data structure that
we associate with an open file, and we're going to need a way to associate
that data structure with the file.
Recall that file descriptors are small integers. You may assume that your
cache will never be called with a file descriptor greater than 10 (but
you should still code defensively against bad inputs -- check out the
relevant manual pages).
TODO: Design a data structure that you'll use to keep track of data that
you've read but not yet returned to the caller. Place your data structure declaration and
definition in cache.c. Be sure to design a way to
find the data structure corresponding to a particular file when
cache_file_read is called.
(Hint: recall that we said you may assume that you'll never get a file descriptor larger than 10.)
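For example, one plausible design (an assumption on our part, not the only correct one) indexes an array of slots directly by file descriptor, since file descriptors are small integers:

```c
#include <assert.h>
#include <stddef.h>

// One plausible design: since we're told fd <= 10, keep an array of
// cache slots indexed directly by file descriptor.
// (MAXFD and BLOCKSIZE are our own illustrative choices.)
#define MAXFD      10
#define BLOCKSIZE  4096

typedef struct cache_slot {
    int    in_use;             // is this fd open through our cache?
    char   buf[BLOCKSIZE];     // bytes read from the file, not yet returned
    size_t nvalid;             // how many bytes of buf hold file data
    size_t next;               // index of the next byte to hand out
} cache_slot;

static cache_slot slots[MAXFD + 1];

// Look up the slot for fd, defensively rejecting bad descriptors.
cache_slot *slot_for_fd(int fd) {
    if (fd < 0 || fd > MAXFD || !slots[fd].in_use)
        return NULL;
    return &slots[fd];
}

// Initialize the slot for a newly opened fd; returns -1 on bad input.
int slot_mark_open(int fd) {
    if (fd < 0 || fd > MAXFD)
        return -1;
    slots[fd].in_use = 1;
    slots[fd].nvalid = slots[fd].next = 0;
    return 0;
}
```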
Make sure that your declarations are correct by confirming that
r01-cache still builds.
Next you will need to create and/or initialize your data structures during open. The precise
code you write will vary depending on your system design, but you should
now be ready to write the complete implementation of file_cache_open.
TODO: You guessed it, write
file_cache_open. Make sure your program
r01-cache still builds and runs.
You're almost there!
Now you have to do the actual caching.
This means that inside of
file_cache_read, you'll want to:
- Check whether you have cached data you can immediately return.
- If you do not have data to immediately return, you'll need to perform a
read to get some data.
- You'll want to update your data structures.
- You'll want to return the data to the caller.
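Putting those four steps together, a single-slot implementation might look roughly like the following. BLOCKSIZE, the slot layout, and the single_slot_selfcheck helper are all our own assumptions, not the official solution:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MAXFD     10
#define BLOCKSIZE 4096

static struct {
    int    in_use;
    char   buf[BLOCKSIZE];
    size_t nvalid;   // bytes of buf filled from the file
    size_t next;     // next byte of buf to hand to the caller
} slot[MAXFD + 1];

int file_cache_open(const char *path, int flags) {
    int fd = open(path, flags);
    if (fd < 0 || fd > MAXFD) {
        if (fd > MAXFD) close(fd);   // refuse fds we can't track
        return -1;
    }
    slot[fd].in_use = 1;
    slot[fd].nvalid = slot[fd].next = 0;
    return fd;
}

// Single-slot cached read, following the four steps above.
ssize_t cache_file_read(int fd, void *buf, size_t count) {
    if (fd < 0 || fd > MAXFD || !slot[fd].in_use)
        return -1;
    // 1. Check for cached data; 2. if none, refill with one big read(2).
    if (slot[fd].next == slot[fd].nvalid) {
        ssize_t n = read(fd, slot[fd].buf, BLOCKSIZE);
        if (n <= 0) return n;        // EOF or error
        // 3. Update our bookkeeping.
        slot[fd].nvalid = (size_t) n;
        slot[fd].next = 0;
    }
    // 4. Return cached data (possibly fewer than `count` bytes).
    size_t avail = slot[fd].nvalid - slot[fd].next;
    if (count > avail) count = avail;
    memcpy(buf, slot[fd].buf + slot[fd].next, count);
    slot[fd].next += count;
    return (ssize_t) count;
}

// Hypothetical self-check: read a small file byte by byte through the
// cache. Returns 1 on success, 0 on failure.
int single_slot_selfcheck(void) {
    char path[] = "/tmp/ssXXXXXX";
    int wfd = mkstemp(path);
    if (wfd < 0) return 0;
    if (write(wfd, "abcdef", 6) != 6) return 0;
    close(wfd);
    int fd = file_cache_open(path, O_RDONLY);
    if (fd < 0) return 0;
    char out[7] = {0};
    for (int i = 0; i < 6; i++)
        if (cache_file_read(fd, &out[i], 1) != 1) return 0;
    int at_eof = cache_file_read(fd, out, 1) == 0;
    close(fd);
    unlink(path);
    return at_eof && strcmp(out, "abcdef") == 0;
}
```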
TODO: Go for it! See if you can write file_cache_read.
When you have something you think is right, compare its performance to r05-stdiobyte.
How did you do? Can you explain any differences you observed?
You might find the UNIX
time command useful in interpreting
your data. (You might want to consult the man page to understand
the different numbers that time reports.)
What you've probably just built is a single slot cache. That is, you have only a single cache entry (per file, perhaps). Even so, you should find that it improves performance significantly.
Congratulations! When you begin assignment 3, think back to this exercise -- you may reuse any code or ideas you used here, just be sure to document any collaboration that occurred.
A Multi-Slot Cache
There are many access patterns that your single slot cache may not be
able to handle well. For example, the program randread
randomly reads a file in blocksize units, but you should find that its performance
is not as good as that of the sequential block benchmarks.
Although this program does not read its blocks in sequential order, it
generates the block numbers to read according to a skewed distribution, which
means that some blocks are much more likely to be read than others.
As the workload becomes more skewed, a cache should become more effective
(because you increase the likelihood that you access a block you accessed
recently).
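We don't know randread's exact distribution, but a toy generator like the following (entirely our own illustration: the hot-set scheme and the 10% figure are assumptions) gives the flavor of a skewed workload:

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical sketch of a skewed block-number generator: with
// probability p percent, pick a block from a small "hot" set; otherwise
// pick uniformly among all blocks. Larger p means more skew, which
// means a cache holding the hot blocks hits more often.
int skewed_block(int nblocks, double p) {
    int hot = nblocks / 10 + 1;            // roughly 10% of blocks are hot
    if ((double) rand() / RAND_MAX * 100.0 < p)
        return rand() % hot;               // a hot block
    return rand() % nblocks;               // any block
}
```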
TODO: Your job is to implement a new function,
cache_read_block, that improves the performance of randread.
Your cache should perform better than the current
randread for all skew values.
We suggest copying randread.c into a new file,
randcache.c, and making your
modifications there; that way you'll be able to compare performance easily.
You can change various aspects of the workload on the command line:
- file indicates the input file from which to read,
- ops is the number of blocks to read,
- skew (p) is a number indicating how skewed the distribution of block numbers is. The default value for p is 75.0; increasing p makes the distribution more skewed, and decreasing it makes it less skewed.
If you change the blocksize, you may need to change your cache block size as well.
Your cache may be no larger than 100 KB.
See if you can get performance better than the current randread.
You will need to augment your current data structures to:
- Keep more than one block in cache
- Identify each block that you do cache
- Remember the location of the file pointer (the location of the
next byte to read), in case one intersperses calls to
cache_file_read with calls to cache_read_block.
You will need to modify your existing code to
- Use the current location of the file pointer.
You will need to implement cache_read_block.
The data that you keep to identify a block in the cache is called
a cache tag.
We encourage you to make your cache tag the address of the first byte
of a block.
Although the cache_read_block function is defined in
terms of blocks, and you'll therefore always generate addresses that
match a tag exactly, this will not always be the case.
If a program continues to use the cache_file_read interface,
you may need to locate a position in the file other than the first
one in a cache block.
There are ways to make this simple and efficient.
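For instance, a lookup over an array of tagged blocks might look like the sketch below. The slot count and block size here are illustrative choices of ours (16 x 4 KB = 64 KB, inside the 100 KB budget), and cache_install is a hypothetical helper for filling a slot:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define NSLOTS    16
#define BLOCKSIZE 4096

// Multi-slot sketch: each slot is identified by a tag, the file offset
// of the first byte of the block it holds.
typedef struct {
    int   valid;
    off_t tag;                 // file offset of byte 0 of the cached block
    char  data[BLOCKSIZE];
} cache_block;

static cache_block cache[NSLOTS];

// Find the slot holding `offset`, if any: compute the tag for the block
// containing offset, then scan for a matching valid slot.
cache_block *cache_lookup(off_t offset) {
    off_t tag = offset & ~(off_t)(BLOCKSIZE - 1);
    for (int i = 0; i < NSLOTS; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return &cache[i];
    return NULL;    // miss: caller must read the block and pick a victim
}

// Hypothetical helper: mark slot i as holding the block with this tag.
cache_block *cache_install(int i, off_t tag) {
    if (i < 0 || i >= NSLOTS) return NULL;
    cache[i].valid = 1;
    cache[i].tag = tag;
    return &cache[i];
}
```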
We're going to go out on a limb here and assume that you'll cache things that are a power of 2 in size. Imagine that your block size is 16 bytes (this is probably not a great cache block size). Write, in binary, the starting offsets of the first four blocks of a file. What do you notice?
When we have cache blocks that are a power of 2 size and we want to locate a given
offset, we can view the offset as having two parts: a block number and an offset
within a block.
Continuing our 16-byte example, let's say that we want to find the block
containing the offset 52.
First we write 52 in binary: 110100.
We can then consider the low 4 bits the offset into a cache block and the
high 4 bits a block number.
Therefore we would expect to find offset 52 in block number 3 at offset 4.
You will undoubtedly find it handy to define some helper functions or macros (#define statements) that let you convert from file offsets to block offsets and from file offsets to cache tags. To convert from a file offset to a block offset, we want to consider only the low order bits. You might notice that subtracting 1 from the blocksize produces a number with ones in exactly those low-order bits (we sometimes call this value a mask, because it lets us mask out certain bits). What logical operation could you use to produce just the low order bits from a file position? What other logical operation(s) might you use to convert from a file offset to a cache tag (a value with 0's in all the low order bits)?
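Answering those questions, the helpers might look like this: bitwise AND with the mask extracts the block offset, and AND with the mask's complement produces the tag. (The macro names and the 16-byte block size are ours, chosen to match the worked example above.)

```c
#include <assert.h>

#define BLOCKSIZE 16   // power of 2; 16 matches the worked example above

// Low-order bits: the offset within a cache block.
// BLOCKSIZE - 1 == 15 == 0b1111 is the mask.
#define BLOCK_OFFSET(pos) ((pos) & (BLOCKSIZE - 1))

// High-order bits: the cache tag (all low-order bits cleared).
#define BLOCK_TAG(pos)    ((pos) & ~(BLOCKSIZE - 1))

// The block number is the tag shifted down, i.e. the offset divided
// by the block size.
#define BLOCK_NUMBER(pos) ((pos) / BLOCKSIZE)
```

Continuing the example, offset 52 gives block offset 4, tag 48, and block number 3.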
Once you've got your cache working and your timing results look good for
the caching version of randread,
modify your cache to collect statistics.
Provide a function
cache_stats that prints out
the cache hit rate; call this function at the end of your program.
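A minimal sketch of such statistics follows; the cache_note_lookup helper is our own invention, and you would wire it into your lookup path however fits your design:

```c
#include <assert.h>
#include <stdio.h>

// Hit-rate bookkeeping sketch: bump one counter on every lookup and
// another on every hit, then report the ratio at the end of the run.
static long lookups, hits;

// Call this once per cache lookup, passing whether it was a hit.
void cache_note_lookup(int hit) {
    lookups++;
    if (hit)
        hits++;
}

double cache_hit_rate(void) {
    return lookups ? (double) hits / (double) lookups : 0.0;
}

void cache_stats(void) {
    printf("cache: %ld lookups, %ld hits, hit rate %.1f%%\n",
           lookups, hits, 100.0 * cache_hit_rate());
}
```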
Please fill out this survey.