In today’s section, we’ll walk through a selection of our storage exercises, which are problems taken from prior exams. Also take the opportunity to ask questions about the problem set and class material.
Simultaneously enrolled students are responsible for attending section and self-reporting their attendance at www.tinyurl.com/cs61sections.
All exercises assume a 64-bit x86-64 architecture unless otherwise stated.
The main subjects of these exercises are:
- IO-19: File system calls.
- IO-20: Types of cache.
- IO-4: Reading straces.
- IO-14: Single-slot cache implementation.
- IO-17: Reference strings and hit rates.
IO-19. File system calls
QUESTION IO-19A. A program makes these system calls:
int fd = open("f.txt", O_WRONLY | O_CREAT | O_TRUNC);
ssize_t nw = write(fd, "CS121 is awesome!", 17); // returned 17
What following series of system calls would ensure that, after all system
calls complete, the file f.txt
contains the text “CS 61 is terrible
”
(without the quotation marks)? Minimize the number of bytes written.
QUESTION IO-19B. Which of the following file access patterns might have similar
output from the strace
utility? List all that apply or say “none.”
- Sequential byte writes using stdio
- Sequential byte writes using
mmap
- Sequential byte writes using system calls
QUESTION IO-19C. Which of the following file access patterns might have similar
output from the strace
utility? List all that apply or say “none.”
- Sequential byte writes using stdio
- Sequential block writes using stdio
- Sequential byte writes using system calls
- Sequential block writes using system calls
QUESTION IO-19D. Which of the following file access patterns might have similar
output from the strace
utility? List all that apply or say “none.”
- Reverse-sequential byte writes using stdio
- Reverse-sequential block writes using stdio
- Reverse-sequential byte writes using system calls
- Reverse-sequential block writes using system calls
IO-20. Caches Big, Fast, and Cheap
QUESTION IO-20A. We discussed several kinds of computer storage in class, including:
- Drives (disks and SSDs)
- primary Memory
- processor Cache memory
- Registers
Put those storage technologies in order by latency, slowest first. (An answer might be “DMCR”.)
QUESTION IO-20B. Put those storage technologies in order by cost per byte, cheapest first.
QUESTION IO-20C. Put those storage technologies in order by capacity in bytes on a typical computer, smallest first.
QUESTION IO-20D. Which storage technology serves as the underlying storage for each of the following caches? If unsure, explain briefly.
- Buffer cache
- Stdio cache
- Processor cache
QUESTION IO-20E. True or false? Given a cache with two or more blocks, implementing it as a fully associative cache would always produce as good or better hit rates than a direct-mapped implementation.
QUESTION IO-20F. True or false? Prefetching bytes from a file on disk into the buffer cache can cause the buffer cache to become incoherent.
IO-4. I/O caching and strace
Elif Batuman is investigating several program executables left behind by her
ex-roommate Fyodor. She knows these executables perform byte-at-a-time
I/O, but she’s not sure what kinds of caches they use. To find out, she runs
each executable under strace
in the following way:
strace -o strace.txt ./EXECUTABLE files/text1meg.txt > files/out.txt
Help her figure out properties of these programs based on their system call traces.
QUESTION IO-4A. Program ./mysterya
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x8193000
brk(0x81b5000) = 0x81b5000
read(3, "A", 1) = 1
write(1, "A", 1) = 1
read(3, "\n", 1) = 1
write(1, "\n", 1) = 1
read(3, "A", 1) = 1
write(1, "A", 1) = 1
read(3, "'", 1) = 1
write(1, "'", 1) = 1
read(3, "s", 1) = 1
write(1, "s", 1) = 1
...
List at least one option in each column.
|
|
|
|
QUESTION IO-4B. Program ./mysteryb
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x96c5000
brk(0x96e6000) = 0x96e6000
read(3, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 2048) = 2048
write(1, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 2048) = 2048
read(3, "kad\nAkron\nAkron's\nAl\nAl's\nAla\nAl"..., 2048) = 2048
write(1, "kad\nAkron\nAkron's\nAl\nAl's\nAla\nAl"..., 2048) = 2048
...
List at least one option in each column.
|
|
|
|
QUESTION IO-4C. Program ./mysteryc
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x9064000
brk(0x9085000) = 0x9085000
fstat64(3, {st_mode=S_IFREG|0664, st_size=1048576, ...}) = 0
lseek(3, 1046528, SEEK_SET) = 1046528
read(3, "ingau\nRheingau's\nRhenish\nRhianno"..., 2048) = 2048
write(1, "oR\ntlevesooR\ns'yenooR\nyenooR\ns't"..., 2048) = 2048
lseek(3, 1044480, SEEK_SET) = 1044480
read(3, "Quinton\nQuinton's\nQuirinal\nQuisl"..., 2048) = 2048
write(1, "ehR\neehR\naehR\ns'hR\nhR\nsdlonyeR\ns"..., 2048) = 2048
lseek(3, 1042432, SEEK_SET) = 1042432
read(3, "emyslid's\nPrensa\nPrensa's\nPrenti"..., 2048) = 2048
write(1, "\ns'nailitniuQ\nnailitniuQ\nnniuQ\ns"..., 2048) = 2048
lseek(3, 1040384, SEEK_SET) = 1040384
read(3, "Pindar's\nPinkerton\nPinocchio\nPin"..., 2048) = 2048
write(1, "rP\ndilsymerP\ns'regnimerP\nregnime"..., 2048) = 2048
...
List at least one option in each column.
|
|
|
|
QUESTION IO-4D. Program ./mysteryd
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x9a0e000
brk(0x9a2f000) = 0x9a2f000
fstat64(3, {st_mode=S_IFREG|0664, st_size=1048576, ...}) = 0
lseek(3, 1048575, SEEK_SET) = 1048575
read(3, "o", 2048) = 1
lseek(3, 1048574, SEEK_SET) = 1048574
read(3, "Ro", 2048) = 2
lseek(3, 1048573, SEEK_SET) = 1048573
read(3, "\nRo", 2048) = 3
...
lseek(3, 1046528, SEEK_SET) = 1046528
read(3, "ingau\nRheingau's\nRhenish\nRhianno"..., 2048) = 2048
write(1, "oR\ntlevesooR\ns'yenooR\nyenooR\ns't"..., 2048) = 2048
lseek(3, 1046527, SEEK_SET) = 1046527
read(3, "eingau\nRheingau's\nRhenish\nRhiann"..., 2048) = 2048
lseek(3, 1046526, SEEK_SET) = 1046526
read(3, "heingau\nRheingau's\nRhenish\nRhian"..., 2048) = 2048
...
List at least one option in each column.
|
|
|
|
QUESTION IO-4E. Program ./mysterye
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x93e5000
brk(0x9407000) = 0x9407000
read(3, "A", 1) = 1
read(3, "\n", 1) = 1
read(3, "A", 1) = 1
...
read(3, "A", 1) = 1
read(3, "l", 1) = 1
write(1, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 1024) = 1024
read(3, "t", 1) = 1
read(3, "o", 1) = 1
read(3, "n", 1) = 1
...
List at least one option in each column.
|
|
|
|
QUESTION IO-4F. Program ./mysteryf
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x9281000
brk(0x92a3000) = 0x92a3000
read(3, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 4096) = 4096
write(1, "A", 1) = 1
write(1, "\n", 1) = 1
write(1, "A", 1) = 1
...
write(1, "A", 1) = 1
write(1, "l", 1) = 1
read(3, "ton's\nAludra\nAludra's\nAlva\nAlvar"..., 4096) = 4096
write(1, "t", 1) = 1
write(1, "o", 1) = 1
write(1, "n", 1) = 1
...
List at least one option in each column.
|
|
|
|
IO-14. Cache code
Several famous musicians have just started working on the standard I/O problem set. They share the following code for their read-only, sequential, single-slot cache:
struct io61_file {
int fd;
unsigned char buf[4096];
size_t pos; // position of next character to read in `buf`
size_t sz; // number of valid characters in `buf`
};
int io61_readc(io61_file* f) {
if (f->pos >= f->sz) {
f->pos = f->sz = 0;
ssize_t nr = read(f->fd, f->buf, sizeof(f->buf));
if (nr <= 0) {
f->sz = 0;
return -1;
} else {
f->sz = nr;
}
}
int ch = f->buf[f->pos];
++f->pos;
return ch;
}
But they have different io61_read
implementations. Donald
(Lambert)’s is:
ssize_t io61_read(io61_file* f, char* buf, size_t sz) {
return read(f->fd, buf, sz);
}
Solange (Knowles)’s is:
ssize_t io61_read(io61_file* f, char* buf, size_t sz) {
for (size_t pos = 0; pos < sz; ++pos, ++buf) {
*buf = io61_readc(f);
}
return sz;
}
Caroline (Shaw)’s is:
ssize_t io61_read(io61_file* f, char* buf, size_t sz) {
if (f->pos >= f->sz) {
return read(f->fd, buf, sz);
} else {
int ch = io61_readc(f);
if (ch < 0) {
return 0;
}
*buf = ch;
return io61_read(f, buf + 1, sz - 1) + 1;
}
}
You are testing each of these musicians’ codes by executing a sequence
of io61_readc
and/or io61_read
calls on an input file and
printing the resulting characters to standard output. There are no
seeks, and your test programs print until end of file, so your tests’
output should equal the input file’s contents.
You should assume for these questions that no read
system call
ever returns -1.
QUESTION IO-14A. Describe an access pattern—that is, a sequence of
io61_readc
and/or io61_read
calls (with lengths)—for which
Donald’s code can return incorrect data.
QUESTION IO-14B. Which of these musicians’ codes can generate an output file with incorrect length?
QUESTION IO-14C. Give a sequence of io61 calls for which Solange’s code will outperform Donald’s, or vice versa, and explain briefly.
QUESTION IO-14D. Suggest a small change (≤10 characters) to Caroline’s code that could make it perform at least as well as both Solange’s and Donald’s codes on all access patterns. Explain briefly.
IO-17. Reference strings and hit rates
QUESTION IO-17A. Write a purely-sequential reference string containing at least five accesses.
QUESTION IO-17B. What is the hit rate for this reference string? Tell us the eviction algorithm and number of slots you’ve chosen.
Parts C and D concern this ten-element reference string:
1 2 1 2 3 4 1 5 1 1
We consider executing this reference string starting with different cache contents.
QUESTION IO-17C. A three-slot LRU cache processes this reference string and observes a 70% hit rate. What are the initial contents of the cache?
QUESTION IO-17D. A three-slot FIFO cache with initial contents 4, 1, and 2 (in slots A, B, and C, respectively) processes the reference string and observes a 60% hit rate. Which slot was next up for eviction when the reference string began? (Assume the slots were initially filled in alphabetical order.)
The eviction algorithms we saw in class are entirely reactive: they only insert a block when that block is referenced. This limits how well the cache can perform. A read cache can also be proactive by inserting blocks before they’re needed, possibly speeding up later accesses. This is the essence of prefetching.
In a proactive caching model, the cache can evict and load two or more blocks per access in the reference string. A prefetching policy decides which additional, non-accessed blocks to load.
QUESTION IO-17E. Describe an access pattern for which the following prefetching policy would be effective: When filling a cache with block A, also load block A+1 into another slot.
QUESTION IO-17F. Write a reference string and name an eviction policy and/or cache size for which this prefetching policy would be less effective (have a lower hit rate) than no prefetching at all.