Storage 3: Eviction policies and coherence

Overview

In this lecture, we present some eviction policies and anomalies, and discuss different types of file.

Full lecture notes on storage — Textbook readings

Cache slots and access patterns

A cache slot is a region of the cache that can cache a contiguous range of data from underlying storage
- Example for stdio cache: A buffer caching a range of file data
- Example for processor cache: A block of cache memory caching a range of physical memory
An access pattern is a sequence of accesses to underlying storage
- Example for stdio cache: read 3 bytes; read 1 byte; seek to offset 12; …
- Example for processor cache: read 1 byte at address A; read 8 bytes at address B; …

Hits and misses

A hit is an access that can be satisfied without contacting underlying storage
A miss is an access that must contact underlying storage
- A read miss requires filling the cache from underlying storage
- A write miss may require flushing data from the cache to underlying storage
The hit rate is the percentage of accesses that are hits
- Bigger hit rates are better

How many slots should a cache have?

One (single-slot cache) or more than one (multi-slot cache)?
What access patterns work well on single-slot caches?
- Repeated access
- Sequential access
- Reverse-sequential-access?
What access patterns work poorly on single-slot caches?
- Random access
- Strided access
Which of the specific caches we’ve discussed should use multiple slots?

Multi-slot cache abstraction

Many aspects of multi-slot cache design are independent of addresses and the type of underlying data
Abstract these away into a reference string
- A list of addresses, each representing one access
Multi-slot cache diagram: reference string on top, each slot is a row
- Column gives a snapshot of the cache’s contents after that access

	1	2	3	4	1	2	5	1	2	3	4	5
A	①	1	1	④	4	4	⑤	5	5	5	5	5
B		②	2	2	①	1	1	1	1	③	3	3
C			③	3	3	②	2	2	2	2	④	4

FIFO, 3 slots, hit rate 3/12

Associativity

Some multi-slot caches restrict how slots are assigned
Direct-mapped cache
- An address A can fit in exactly one slot S(A)
Fully associative cache
- Any address can fit in any slot
- Software must choose a slot
Set-associative cache
- An address A can fit in one of a subset of slots SS(A)

Eviction

Freeing a slot for reuse is called eviction
An eviction policy chooses which slot to evict
- For instance, if new data needs to be cached
- Direct-mapped caches have a trivial eviction policy
A good eviction policy gets rid of data that isn’t needed in the near future

Eviction and writes

A write cache may have dirty slots
- A dirty slot contains data newer than underlying storage
To evict a dirty slot, it must be flushed first
- Write modified data to underlying storage

Eviction policies for fully-associative caches

When a slot is needed, evict the slot that:

FIFO (First In, First Out)
- …was least recently filled (or assigned)
Are there others?

Eviction policies for fully-associative caches

When a slot is needed, evict the slot that:

FIFO (First In, First Out)
- …was least recently filled (or assigned)
LIFO (Last In, First Out)
- …was most recently filled
LRU (Least Recently Used)
- …was least recently used
MRU (Most Recently Used)
- …was most recently used
LFU (Least Frequently Used)
- …was used least often
Priority, aging, …

Question

Can you come up with reference strings that work well for each of those policies?

The optimal eviction algorithm

Bélády’s Optimal Algorithm
- When a slot is needed, evict the slot that:
- …contains the data that will be used farthest in the future

Example

	1	2	3
A	①
B		②
C			③

Optimal

	1	2	3	4	1	2	5	1	2	3	4	5
A	①				1			1		③
B		②				2			2		④
C			③	④			⑤					5

Optimal, 3 slots, hit rate 5/12

When can you predict the future?

More slots means better hit rate!

	1	2	3	4	1	2	5	1	2	3	4	5
A	①			④			⑤					5
B		②			①			1		③
C			③			②			2		④

FIFO, 3 slots, hit rate 3/12

	1	2	3	4
A	①
B		②
C			③
D				④

FIFO, 4 slots, hit rate ?

	1	2	3	4	1	2	5	1	2	3	4	5
A	①				1		⑤				④
B		②				2		①				⑤
C			③						②
D				④						③

FIFO, 4 slots, hit rate 2/12

Well, maybe not.
This is called Bélády’s anomaly

More slots means better hit rate for LRU

	1	2	3	4	1	2	5	1	2	3	4	5
A	①			④			⑤			③
B		②			①			1			④
C			③			②			2			⑤

LRU, 3 slots, hit rate 2/12

	1	2	3	4	1	2	5	1	2	3	4	5
A	①				1			1				⑤
B		②				2			2
C			③				⑤				④
D				④						③

LRU, 4 slots, hit rate 4/12

Cache correctness

Caches aim for transparency
- Access results are independent of cache
- Cached reads return same data as direct reads
- Cached writes perform same ultimate modifications as direct writes
A fully transparent cache is called coherent
Not every cache is coherent!
- Processor cache is coherent
- Buffer cache is coherent
- Standard I/O cache…?

`coherence.cc`

Forking, piping, and stdio

Streams vs. random-access files

The Unix file descriptor abstraction models many sources of data as file descriptors
- Unix’s big idea: Everything is a file
But not all sources of data are the same
- Compare a stone tablet and a speech
Random-access file
- Has a file position
- Seekable
- Usually mappable
- Finite
- Example: Files on drives
Stream
- Does not have a file position
- Not seekable (lseek returns -1)
- Not mappable
- Possibly infinite
- Example: Typed characters, program output

Special files

/dev/null: Contains nothing
- Any read returns end of file
- Any write succeeds
/dev/urandom: Contains random data
- Any read returns random bytes
/dev/zero: Contains zero bytes
- Any read returns all zeros

Streams

program1 | program2
Network connections

Write semantics

Always safe to read more data than requested
Not necessarily safe to write more data than requested!
- Safe with coherent caches, not safe with incoherent caches
- ./w-stdiobyterev vs. ./r-stdiobyterev

Memory mapping

mmap system call
r-mmapbyte, w-mmapbyte