# Storage 1: Caches

First part of lecture: Buffer overflow

## Caches

A cache is a small amount of fast storage used to speed up access to larger, slower storage.

A cache basically works by storing copies of data whose primary home is on slower storage. A program can access data faster when a it’s located “nearby,” in fast storage.

Caches abound in computer systems; sometimes it seems like caching is the only performance-improving idea in systems. Processors have caches for primary memory. The operating system uses most of primary memory as a cache for disks and other stable storage devices. Running programs reserve some of their private memory to cache the operating system’s cache of the disk.

When a program processes a file, the actual computation is done using registers, which are caching values from the processor cache, which is caching data from the running program’s memory, which is caching data from the operating system, which is caching data from the disk. And modern disks contain caches inside their hardware too!

Caches also abound in everyday life. Imagine how life would differ without, say, food storage—if every time you felt hungry you had to walk to a farm and eat a carrot you pulled out of the dirt. Your whole day would be occupied with finding and eating food! Instead, your refrigerator (or your dorm’s refrigerator) acts as a cache for your neighborhood grocery store, and that grocery store acts as a cache for all the food producers worldwide. Caching food is an important aspect of complex animals’ biology too. Your stomach and intestines and fat cells act as caches for the energy stored in the food you eat, allowing you to spend time doing things other than eating. And your colon and bladder act as caches for the waste products you produce, allowing you to travel far from any toilet.

## File I/O

The problem set focuses on file I/O—that is, input and output (“I” and “O”) for files. You’re likely familiar with files already. The stdin, stdout, and stderr objects in C are references to files, as are C++’s std::cin, std::cout, and std::cerr. C library functions like fopen, fread, fprintf, fscanf, and fclose all perform operations on files. These library routines are part of a package called standard I/O or stdio for short. They are implemented in terms of lower-level abstractions; the Linux/Mac OS X versions are called file descriptors, which are accessed using system calls such as open, read, and write. Although file descriptors and system calls are often hidden from programmers by stdio, we can also access them directly.

We used simple I/O benchmark programs to evaluate the real-world impact of caches. Specifically, we ran many programs that write data to disk—my laptop’s SSD drive, via my virtual machine. These programs accomplish the same goal, but in several different ways.

• w01-syncbyte writes the file (1) one byte at a time, (2) using direct system calls (write), (3) synchronously to the disk.

“Synchronously” means that each write operation completes only when the data has made its way through all layers of caching out to the true disk hardware. (At least that’s what it should mean!) We request synchronous writes by opening the file with the O_SYNC option to open.

On my laptop, w01-syncbyte can write about 2,400 bytes per second (it varies).

• w03-byte writes the file (1) one byte at a time, (2) using direct system calls (write), (3) asynchronously—in other words, allowing the operating system to use its caches.

On my laptop, w03-byte can write about 870,000 bytes per second—360x more than w01-syncbyte!

• w05-stdiobyte writes the file (1) one byte at a time, (2) using stdio library function calls (fwrite), (3) asynchronously.

On my laptop, w05-stdiobyte can write about 52,000,000 bytes per second—22,000x more than w01-syncbyte!

Even though all three programs write one byte at a time, caching makes the fastest one 22,000x faster. The w05-stdiobyte program uses two distinct layers of software cache: a stdio cache, managed by the stdio library inside the program, uses caching to reduce the cost of system calls; and the buffer cache, managed by the operating system inside the kernel, uses caching to reduce the cost of disk access. The combination of these is 22,000x.

We can get even faster, though, if we write more than one byte at a time, which is sort of like implementing an application-level cache for the data we plan to write.

• w02-syncblock writes the file (1) 512 bytes at a time, (2) using system calls, (3) synchronously to the disk. On my laptop, it can write about 1,300,000 bytes per second.

• w04-block writes the file (1) 512 bytes at a time, (2) using system calls, (3) asynchronously. On my laptop, it can write about 200,000,000 bytes per second.

• w06-stdioblock writes the file (1) 512 bytes at a time, (2) using stdio, (3) asynchronously. On my laptop, it can write about 270,000,000 bytes per second—112,500x more than w01-syncbyte!

## Synchronous writes in depth

What kinds of I/O properties make caches necessary?

We looked first at w01-syncbyte. Here’s what happens at a high level when that application writes bytes to the file.

1. The application makes a write system call asking the operating system to write a byte.

2. The operating system performs this write to its cache. But it also observes that the written file descriptor requires synchronous writes, so it begins the process of writing to the drive—in my case, an SSD.

3. This drive cannot write a single byte at a time! Most storage technologies other than primary memory read and write data in contiguous blocks, typically 4,096 or 8,192 bytes big. So for every 1-byte application write, the operating system must write an additional 4,095 surrounding bytes, just because the disk forces it to!

That 4,096-byte write is also slow, because it induces physical changes. The job of a disk drive is to preserve data in a stable fashion, meaning that the data will outlast accidents like power loss. On today’s technology, the physical changes that make data stable take more time than simple memory writes.

But that’s not the worst part. SSD drives have a strange feature where data can be written in 4KiB or 8KiB units, but erasing already-written data requires much larger units—typically 64KiB or even 256KiB big! The large erase, plus other SSD features like “wear leveling,” may cause the OS and drive to move all kinds of data around: our 1-byte application write can cause 256KiB or more traffic to the disk. This explosion is called write amplification.

4. So the system call only returns after 4–256KiB of data is written to the disk.

## Buffer cache

We can speed up this process using caches, and the first cache we “turned on” was the buffer cache. This is a cache in memory of file data managed by the operating system kernel. (The kernel is the operating system program that runs with full privilege over the operation of the machine. The kernel responds to system calls from application programs and makes calls out to hardware as required.)

In today’s machines and operating systems, the buffer cache ends up occupying much of the machine’s memory. It can hold data for many files, and for many drives and sources of storage, simultaneously.

Here’s what happens at a high level when w03-byte writes bytes to the file.

1. The application makes a write system call asking the operating system to write a byte (as before).

2. The operating system performs this write to the buffer cache.

3. The operating system immediately returns to the application!

That is, the application gets control back before the data is written to disk. The data is written to disk eventually, either after some time goes by or because the buffer cache runs out of room, but such later writes can be overlapped with other processing.

## Stdio cache

But w03-byte is still much slower than it needs to be, because it performs many expensive operations—namely system calls. These operating system invocations are slow because they require the processor to save its state, switch contexts to the kernel, process arguments, and then switch back. Another layer of caching, the stdio cache, can get rid of these system calls. The stdio cache is a cache in application memory for the buffer cache (which is in kernel memory).

Here’s what happens at a high level when w05-stdiobyte writes bytes to the file.

1. The application makes an fwrite library function call asking the stdio library to write a byte.

2. The library performs this write to its cache and immediately returns to the application.

That is, the application gets control back even before a system call is made. The data is written out of the stdio cache using a system call eventually, either when the stdio cache runs out of room, it is explicitly flushed, or the program exits.