Lecture 10 thoughts
This is a generic description of the operation of a cache. Though closely related to the presentation in CS:APP2e, our presentation gives the material a more generic, software-based spin.
Parameters and terms
A cache is fast local storage that holds copies of data housed on slower primary storage. Caches are used to speed up future requests.
Caches are ubiquitous in computer systems—hardware and software—and also in life. We’ll especially focus on three caches:
buffer cache This software cache is managed by the operating system. It speeds up access to stable storage (disks).
stdio cache (application file cache) This software cache is managed by applications and/or application libraries. It speeds up access to the buffer cache.
processor cache This hardware cache is managed by the processor and memory system. It speeds up access to primary memory (DRAM).
There are lots of other caches too—browsers manage object caches (for images and other page “assets,” like scripts); the DNS subsystem manages mappings of site names (like “www.google.com”) to IP addresses (like 74.125.226.210).
Caches can be described in terms of a few parameters.
block The transfer unit between the cache and primary storage. Each block has an address on primary storage.
block size B The size (in bytes) of a block. For processor caches, this is often 64 or 128; for the buffer cache, it is 4096.
slot A space in the cache that can hold a block. A slot in a processor cache is also called a line.
number of slots S The number of slots in the cache.
cache size C The total size of the cache (in bytes). C = B × S.
This assumes that the cache deals with fixed-size blocks. Although our focus caches do have fixed-size blocks, not all caches do. For instance, browser caches usually deal with page-sized units.
We might model the cache as a C structure like this:
typedef struct cache_slot {
    address_type addr; // the primary-storage address contained in this slot, or NULL if slot is empty
    bool dirty;        // 1 if this slot’s data has been updated relative to primary storage
    char* data;        // cached data (size B)
} cache_slot;

cache_slot cache[S];
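To make the model concrete, here is a minimal compilable sketch of the same structure. The values of B and S, the choice of `uintptr_t` for `address_type`, the `EMPTY_ADDR` marker (standing in for NULL, since address 0 may be a valid block), and the helper functions are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { B = 64, S = 8 };            // block size and slot count (assumed values)
#define EMPTY_ADDR UINTPTR_MAX     // assumed "empty" marker, standing in for NULL

typedef uintptr_t address_type;    // assumed representation of a storage address

typedef struct cache_slot {
    address_type addr;   // primary-storage address held here, or EMPTY_ADDR
    bool dirty;          // true if data is newer than primary storage
    char* data;          // cached data (B bytes)
} cache_slot;

cache_slot cache[S];

// Allocate B zeroed bytes for every slot and mark each slot empty and clean.
// Returns false if an allocation fails.
bool cache_init(void) {
    for (size_t s = 0; s < S; ++s) {
        cache[s].addr = EMPTY_ADDR;
        cache[s].dirty = false;
        cache[s].data = calloc(B, 1);
        if (!cache[s].data)
            return false;
    }
    return true;
}

// True if slot s currently holds no block.
bool slot_is_empty(size_t s) {
    assert(s < S);
    return cache[s].addr == EMPTY_ADDR;
}

// Total cache size in bytes: C = B × S.
size_t cache_size(void) {
    return (size_t) B * S;
}
```

With these assumed values, `cache_size()` is 64 × 8 = 512 bytes.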
There’s one more parameter.
associativity E In some caches, a block with address a can be cached in only a subset of the cache’s slots. For instance, maybe a block with an even-numbered address could only use even-numbered slots. The associativity parameter E determines how many slots are usable per address. We will consider two values for E. If E = S (and E > 1), then any address can use any slot: the cache is fully flexible and we call it fully associative. If E = 1, then any address is matched to one unique slot, and we call the cache direct-mapped. The book discusses intermediate values for E in the context of processor caches. (The book also gives S a different meaning.)
A couple more terms are useful for discussing caches.
hit An access (a read or write) is a hit when the cache can satisfy the access without contacting primary storage. This means that the requested address a equaled cache[s].addr for some s. Cache hits are good since they’re fast.
miss An access that’s not a hit is a miss. Misses are bad since they require contacting primary storage, which is slow.
clean A cache slot is clean when its data has not been updated relative to the version on primary storage. In a read-only cache (such as a browser’s object cache), every slot is clean.
dirty A cache slot is dirty when its data has been updated (by a write operation) relative to the version on primary storage.
eviction or flushing When a block is removed from the cache, we say it is evicted. When an evicted block is written back to primary storage (because it was dirty), we also say the block was flushed.
We can divide cache misses into rough categories.
cold miss A miss is cold if the requested address was never referenced before (or, more generally, if the address was referenced at most “a long time ago,” so a reasonable observer would expect the cache to have evicted the address). Any cache, no matter how big, will suffer cold misses when an experiment begins.
capacity miss A miss is a capacity miss if the cache just wasn’t big enough. In other words, the request’s address was referenced recently, but the cache had to evict the corresponding block because it ran out of room.
conflict miss A miss is a conflict miss if it happened because the cache wasn’t associative enough. Fully-associative caches never have conflict misses.
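One way to see the difference between these categories is to run the same access trace through a direct-mapped cache and a fully-associative cache with the same number of slots. The sketch below does that under some assumptions: addresses are block numbers, the direct-mapped cache puts block a in slot a % S, and the fully-associative cache evicts the least recently used slot. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { S = 4 };            // number of slots (assumed value)
#define EMPTY (-1L)        // marks an empty slot

// Count misses for a direct-mapped cache (E = 1): block a may only use slot a % S.
size_t misses_direct_mapped(const long* trace, size_t n) {
    long slot[S];
    for (size_t s = 0; s < S; ++s)
        slot[s] = EMPTY;
    size_t misses = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(trace[i] >= 0);
        size_t s = (size_t) trace[i] % S;
        if (slot[s] != trace[i]) {
            ++misses;
            slot[s] = trace[i];      // evict whatever was there
        }
    }
    return misses;
}

// Count misses for a fully-associative cache (E = S) with LRU eviction.
size_t misses_fully_associative(const long* trace, size_t n) {
    long slot[S];
    size_t last_use[S];              // time of most recent access; 0 = never used
    for (size_t s = 0; s < S; ++s) {
        slot[s] = EMPTY;
        last_use[s] = 0;
    }
    size_t misses = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(trace[i] >= 0);
        size_t found = S, victim = 0;
        for (size_t s = 0; s < S; ++s) {
            if (slot[s] == trace[i])
                found = s;
            if (last_use[s] < last_use[victim])
                victim = s;          // an unused slot, else the least recently used
        }
        if (found == S) {
            ++misses;
            found = victim;
            slot[found] = trace[i];
        }
        last_use[found] = i + 1;
    }
    return misses;
}
```

On the trace 0, 4, 0, 4, 0, 4 with S = 4, the direct-mapped cache misses all six times because blocks 0 and 4 compete for slot 0. The fully-associative cache misses only on the first two accesses, which are cold misses. The four extra misses in the direct-mapped cache are conflict misses.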
Some more terms based on these:
cold cache A cache is cold if it contains no blocks that are useful for the current workload—for instance, if the workload is reading a file that was never read before (or, more generally, that wasn’t read “recently”).
warm cache A warm cache, in contrast, is useful for the current workload.
A cache’s contents should equal the data on primary storage, and if the cache allows writes, then writes to cached data should eventually be pushed (“flushed”) to primary storage. However, sometimes, primary storage can change underneath the cache, and not all caches react to these updates the same way.
cache coherence A cache is coherent when data in the cache always reflects the most recent updates to the primary storage. Modern processor caches are coherent relative to primary memory, and the OS buffer cache is coherent relative to local disks (because the OS mediates all access to local disks). The stdio cache, however, is not coherent.
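The stdio case can be observed directly. The following sketch, which assumes a POSIX system and a writable /tmp, reads one byte through stdio (which on typical implementations pulls the whole small file into the stdio cache), changes the file through a file descriptor, and then reads again through stdio. The function name and the scratch file path are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   // for mkstemp, pread, pwrite
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Returns 1 if stdio served stale bytes after the file changed underneath it,
// 0 if it saw the new bytes, and -1 if the experiment could not be set up.
int stdio_sees_stale_data(void) {
    char path[] = "/tmp/stdio_coherence_XXXXXX";   // hypothetical scratch file
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    FILE* f = fopen(path, "r");
    int result = -1;
    if (f && write(fd, "AAAA", 4) == 4) {
        int first = fgetc(f);     // fills the stdio cache from the buffer cache
        // Change the file through the descriptor, bypassing f's stdio cache.
        ssize_t nw = pwrite(fd, "BBBB", 4, 0);
        int second = fgetc(f);    // expected to be served from the stdio cache
        char direct[4];
        ssize_t nr = pread(fd, direct, 4, 0);      // asks the OS, not stdio
        if (first == 'A' && nw == 4 && nr == 4
            && memcmp(direct, "BBBB", 4) == 0)
            result = (second == 'A');
    }
    if (f)
        fclose(f);
    close(fd);
    unlink(path);
    return result;
}
```

The direct `pread` sees the new contents because the buffer cache is coherent. The second `fgetc` is expected to return the old byte, because nothing tells the stdio cache that the file changed.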
Finally, this is a useful term for a bad situation.
thrashing A cache is thrashing when it repeatedly loads and evicts the same data.
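A classic way to provoke thrashing is to scan repeatedly over a working set that is just slightly larger than the cache. The sketch below simulates that with a fully-associative cache that evicts the least recently used slot. The LRU policy, the slot limit, and the function name are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_SLOTS = 16 };   // upper bound on simulated cache size (assumed)

// Simulate `rounds` sequential passes over blocks 0..nblocks-1 through a
// fully-associative LRU cache with `nslots` slots. Returns the number of hits.
size_t cyclic_scan_hits(size_t nslots, size_t nblocks, size_t rounds) {
    assert(nslots > 0 && nslots <= MAX_SLOTS);
    long slot[MAX_SLOTS];
    size_t last_use[MAX_SLOTS];      // 0 = never used
    for (size_t s = 0; s < nslots; ++s) {
        slot[s] = -1;                // empty
        last_use[s] = 0;
    }
    size_t hits = 0, now = 0;
    for (size_t r = 0; r < rounds; ++r) {
        for (size_t b = 0; b < nblocks; ++b) {
            ++now;
            size_t found = nslots, victim = 0;
            for (size_t s = 0; s < nslots; ++s) {
                if (slot[s] == (long) b)
                    found = s;
                if (last_use[s] < last_use[victim])
                    victim = s;      // an unused slot, else the least recently used
            }
            if (found == nslots) {   // miss: load b over the victim
                found = victim;
                slot[found] = (long) b;
            } else {
                ++hits;
            }
            last_use[found] = now;
        }
    }
    return hits;
}
```

With 4 slots and 4 blocks, ten passes give 36 hits out of 40 accesses (only the first pass misses). With 4 slots and 5 blocks, the same ten passes give 0 hits: each block is evicted just before it is needed again.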
Policies
A cache’s operation depends on several policies whose implementations vary from cache to cache. Choosing good policies is hard and interesting! It’s kind of what pset 2 is about. Here are three. Real systems implement policies like these, but often real systems’ policies are complicated and use additional information from the environment to help them make the best choices.
placement_policy(address a) Returns the slots that might contain address a. Returns a set of E slot indexes. For a fully-associative cache, it returns all slot indexes. A software cache might use hash indexing here.
eviction_policy(set<slot_index> P) Given a set of slots, chooses a slot to evict. Returns a single slot index that’s a member of P. A smart policy will return an empty slot (cache[s].addr == NULL) when possible.
prefetch_policy(address a) Returns a set of addresses that should be prefetched near a. A good prefetch policy will speed up future reads, but won’t evict useful data from the cache. Often the best prefetch policy uses information about recent accesses and/or application hints to make its choices.
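Here is one simple way the three policies above could look in C. This is a sketch under several assumptions, not how any particular system does it: addresses are block numbers, placement uses a modulus as a trivial hash, eviction is least-recently-used, and prefetch is plain sequential readahead. The `last_use` field, the `EMPTY` marker, the values of S and E, and `cache_clear` are additions for this example. The placement function works for any E that divides S, including E = 1 (direct-mapped) and E = S (fully associative).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { S = 8, E = 2 };      // slots and associativity (assumed values; E divides S)
#define EMPTY (-1L)         // marks an empty slot

typedef long address_type;  // block number

typedef struct cache_slot {
    address_type addr;       // EMPTY if the slot is unused
    bool dirty;
    unsigned long last_use;  // LRU bookkeeping (not part of the original model)
} cache_slot;

cache_slot cache[S];

// Mark every slot empty.
void cache_clear(void) {
    for (size_t s = 0; s < S; ++s) {
        cache[s].addr = EMPTY;
        cache[s].dirty = false;
        cache[s].last_use = 0;
    }
}

// Placement: address a may live in the E consecutive slots of group a % (S / E).
// Writes E slot indexes to out.
void placement_policy(address_type a, size_t out[E]) {
    assert(a >= 0);
    size_t first = ((size_t) a % (S / E)) * E;
    for (size_t i = 0; i < E; ++i)
        out[i] = first + i;
}

// Eviction: prefer an empty slot in P; otherwise pick the least recently used.
size_t eviction_policy(const size_t P[], size_t n) {
    assert(n > 0);
    size_t victim = P[0];
    for (size_t i = 0; i < n; ++i) {
        if (cache[P[i]].addr == EMPTY)
            return P[i];
        if (cache[P[i]].last_use < cache[victim].last_use)
            victim = P[i];
    }
    return victim;
}

// Prefetch: sequential readahead of the n blocks that follow a.
// Writes n addresses to out and returns n.
size_t prefetch_policy(address_type a, address_type out[], size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a + 1 + (address_type) i;
    return n;
}
```

For instance, with S = 8 and E = 2, address 5 maps to group 5 % 4 = 1, so its candidate slots are 2 and 3. Sequential readahead is a reasonable default for file scans, but it is only a guess; a policy that tracks recent accesses can do better.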
Algorithms
data read(address a) {
    // Return the existing cached version, if any.
    P = placement_policy(a);
    foreach slot s ∈ P {
        if (cache[s].addr == a)
            goto done; // hit!
    }
    // otherwise, miss!

    // `a` is not in cache; look for something to evict.
    s = eviction_policy(P);
    if (cache[s].dirty)
        write_block(cache[s].addr, cache[s].data);

    // Read current block for `a` into the cache.
    read_block(a, cache[s].data);
    cache[s].addr = a;
    cache[s].dirty = 0;

done:
    // Maybe prefetch some nearby data.
    foreach address a' ∈ prefetch_policy(a)
        read_in_the_background(a'); // start prefetching that address
    return cache[s].data;
}
void write(address a, data_modification mod) {
    // Find the existing cached version, if any.
    P = placement_policy(a);
    foreach slot s ∈ P {
        if (cache[s].addr == a)
            goto done; // hit!
    }
    // otherwise, miss!

    // `a` is not in cache; look for something to evict.
    s = eviction_policy(P);
    if (cache[s].dirty)
        write_block(cache[s].addr, cache[s].data);

    // Maybe read current block for `a` into the cache.
    if (mod does not completely replace the block)
        read_block(a, cache[s].data);
    cache[s].addr = a;

done:
    apply mod to cache[s].data;
    cache[s].dirty = 1;
}
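The read and write algorithms can be turned into a small runnable C sketch. This version makes several simplifying assumptions: the cache is fully associative, eviction is round-robin once no slot is empty, primary storage is simulated by an in-memory array, a modification is "overwrite len bytes at offset off", and prefetching is left out. All names other than `read_block` and `write_block` are made up for this example. The counters make it possible to check when the slow storage is actually touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { B = 4, S = 2, NBLOCKS = 8 };  // tiny assumed sizes, easy to trace by hand
#define EMPTY (-1L)                  // marks an empty slot

typedef long address_type;           // block number on primary storage

typedef struct cache_slot {
    address_type addr;               // EMPTY if the slot is unused
    bool dirty;
    char data[B];
} cache_slot;

static cache_slot cache[S];
static char storage[NBLOCKS][B];     // simulated primary storage
static size_t next_victim;           // round-robin eviction cursor
size_t storage_reads, storage_writes;   // count the slow operations

static void read_block(address_type a, char* data) {
    memcpy(data, storage[a], B);
    ++storage_reads;
}

static void write_block(address_type a, const char* data) {
    memcpy(storage[a], data, B);
    ++storage_writes;
}

void cache_init(void) {
    for (size_t s = 0; s < S; ++s) {
        cache[s].addr = EMPTY;
        cache[s].dirty = false;
    }
    next_victim = 0;
    storage_reads = storage_writes = 0;
}

// Return the slot holding a (setting *hit), or free up a slot for it.
static size_t find_or_evict(address_type a, bool* hit) {
    assert(a >= 0 && a < NBLOCKS);
    size_t s, empty = S;
    for (s = 0; s < S; ++s) {
        if (cache[s].addr == a) {
            *hit = true;
            return s;
        }
        if (cache[s].addr == EMPTY && empty == S)
            empty = s;
    }
    *hit = false;
    s = empty;
    if (s == S) {                    // no empty slot: evict round-robin
        s = next_victim;
        next_victim = (next_victim + 1) % S;
    }
    if (cache[s].addr != EMPTY && cache[s].dirty)
        write_block(cache[s].addr, cache[s].data);   // flush before reuse
    cache[s].addr = EMPTY;
    cache[s].dirty = false;
    return s;
}

// Read: returns a pointer to the B cached bytes of block a.
const char* cache_read(address_type a) {
    bool hit;
    size_t s = find_or_evict(a, &hit);
    if (!hit) {
        read_block(a, cache[s].data);   // fetch the requested block
        cache[s].addr = a;
    }
    return cache[s].data;
}

// Write: overwrite len bytes at offset off within block a.
void cache_write(address_type a, size_t off, const char* src, size_t len) {
    assert(off <= B && len <= B - off);
    bool hit;
    size_t s = find_or_evict(a, &hit);
    if (!hit) {
        if (len < B)                    // partial write needs the old contents
            read_block(a, cache[s].data);
        cache[s].addr = a;
    }
    memcpy(cache[s].data + off, src, len);
    cache[s].dirty = true;
}
```

Note that a write that replaces a whole block never reads primary storage, and that a dirty block reaches primary storage only when it is evicted.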