Overview
In this lecture, we present some eviction policies and anomalies, and discuss
different types of file.
Full lecture notes on storage — Textbook readings
Cache slots and access patterns
- A cache slot is a region of the cache that can cache a contiguous
range of data from underlying storage
- Example for stdio cache: A buffer caching a range of file data
- Example for processor cache: A block of cache memory caching a range of physical memory
- An access pattern is a sequence of accesses to underlying storage
- Example for stdio cache: read 3 bytes; read 1 byte; seek to offset 12; …
- Example for processor cache: read 1 byte at address A; read 8 bytes at address B; …
Hits and misses
- A hit is an access that can be satisfied without contacting underlying storage
- A miss is an access that must contact underlying storage
- A read miss requires filling the cache from underlying storage
- A write miss may require flushing data from the cache to underlying storage
- The hit rate is the percentage of accesses that are hits
- Bigger hit rates are better
How many slots should a cache have?
- One (single-slot cache) or more than one (multi-slot cache)?
- What access patterns work well on single-slot caches?
- Repeated access
- Sequential access
- Reverse-sequential-access?
- What access patterns work poorly on single-slot caches?
- Random access
- Strided access
- Which of the specific caches we’ve discussed should use multiple slots?
Multi-slot cache abstraction
- Many aspects of multi-slot cache design are independent of addresses and the
type of underlying data
- Abstract these away into a reference string
- A list of addresses, each representing one access
- Multi-slot cache diagram: reference string on top, each slot is a row
- Column gives a snapshot of the cache’s contents after that access
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
1 |
1 |
④ |
4 |
4 |
⑤ |
5 |
5 |
5 |
5 |
5 |
B |
|
② |
2 |
2 |
① |
1 |
1 |
1 |
1 |
③ |
3 |
3 |
C |
|
|
③ |
3 |
3 |
② |
2 |
2 |
2 |
2 |
④ |
4 |
FIFO, 3 slots, hit rate 3/12
Associativity
- Some multi-slot caches restrict how slots are assigned
- Direct-mapped cache
- An address A can fit in exactly one slot S(A)
- Fully associative cache
- Any address can fit in any slot
- Software must choose a slot
- Set-associative cache
- An address A can fit in one of a subset of slots SS(A)
Eviction
- Freeing a slot for reuse is called eviction
- An eviction policy chooses which slot to evict
- For instance, if new data needs to be cached
- Direct-mapped caches have a trivial eviction policy
- A good eviction policy gets rid of data that isn’t needed in the near future
Eviction and writes
- A write cache may have dirty slots
- A dirty slot contains data newer than underlying storage
- To evict a dirty slot, it must be flushed first
- Write modified data to underlying storage
Eviction policies for fully-associative caches
When a slot is needed, evict the slot that:
- FIFO (First In, First Out)
- …was least recently filled (or assigned)
- Are there others?
Eviction policies for fully-associative caches
When a slot is needed, evict the slot that:
- FIFO (First In, First Out)
- …was least recently filled (or assigned)
- LIFO (Last In, First Out)
- …was most recently filled
- LRU (Least Recently Used)
- MRU (Most Recently Used)
- LFU (Least Frequently Used)
- Priority, aging, …
Question
- Can you come up with reference strings that work well for each of those
policies?
The optimal eviction algorithm
- Bélády’s Optimal Algorithm
- When a slot is needed, evict the slot that:
- …contains the data that will be used farthest in the future
Example
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
|
|
|
|
|
|
|
|
|
B |
|
② |
|
|
|
|
|
|
|
|
|
|
C |
|
|
③ |
|
|
|
|
|
|
|
|
|
Optimal
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
|
1 |
|
|
1 |
|
③ |
|
|
B |
|
② |
|
|
|
2 |
|
|
2 |
|
④ |
|
C |
|
|
③ |
④ |
|
|
⑤ |
|
|
|
|
5 |
Optimal, 3 slots, hit rate 5/12
When can you predict the future?
More slots means better hit rate!
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
④ |
|
|
⑤ |
|
|
|
|
5 |
B |
|
② |
|
|
① |
|
|
1 |
|
③ |
|
|
C |
|
|
③ |
|
|
② |
|
|
2 |
|
④ |
|
FIFO, 3 slots, hit rate 3/12
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
|
|
|
|
|
|
|
|
|
B |
|
② |
|
|
|
|
|
|
|
|
|
|
C |
|
|
③ |
|
|
|
|
|
|
|
|
|
D |
|
|
|
④ |
|
|
|
|
|
|
|
|
FIFO, 4 slots, hit rate ?
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
|
1 |
|
⑤ |
|
|
|
④ |
|
B |
|
② |
|
|
|
2 |
|
① |
|
|
|
⑤ |
C |
|
|
③ |
|
|
|
|
|
② |
|
|
|
D |
|
|
|
④ |
|
|
|
|
|
③ |
|
|
FIFO, 4 slots, hit rate 2/12
- Well, maybe not.
- This is called Bélády’s anomaly
More slots means better hit rate for LRU
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
④ |
|
|
⑤ |
|
|
③ |
|
|
B |
|
② |
|
|
① |
|
|
1 |
|
|
④ |
|
C |
|
|
③ |
|
|
② |
|
|
2 |
|
|
⑤ |
LRU, 3 slots, hit rate 2/12
|
1 |
2 |
3 |
4 |
1 |
2 |
5 |
1 |
2 |
3 |
4 |
5 |
A |
① |
|
|
|
1 |
|
|
1 |
|
|
|
⑤ |
B |
|
② |
|
|
|
2 |
|
|
2 |
|
|
|
C |
|
|
③ |
|
|
|
⑤ |
|
|
|
④ |
|
D |
|
|
|
④ |
|
|
|
|
|
③ |
|
|
LRU, 4 slots, hit rate 4/12
Cache correctness
- Caches aim for transparency
- Access results are independent of cache
- Cached reads return same data as direct reads
- Cached writes perform same ultimate modifications as direct writes
- A fully transparent cache is called coherent
- Not every cache is coherent!
- Processor cache is coherent
- Buffer cache is coherent
- Standard I/O cache…?
coherence.cc
Forking, piping, and stdio
Streams vs. random-access files
- The Unix file descriptor abstraction models many sources of data as file
descriptors
- Unix’s big idea: Everything is a file
- But not all sources of data are the same
- Compare a stone tablet and a speech
- Random-access file
- Has a file position
- Seekable
- Usually mappable
- Finite
- Example: Files on drives
- Stream
- Does not have a file position
- Not seekable (
lseek
returns -1)
- Not mappable
- Possibly infinite
- Example: Typed characters, program output
Special files
/dev/null
: Contains nothing
- Any read returns end of file
- Any write succeeds
/dev/urandom
: Contains random data
- Any read returns random bytes
/dev/zero
: Contains zero bytes
- Any read returns all zeros
Streams
program1 | program2
- Network connections
Write semantics
- Always safe to read more data than requested
- Not necessarily safe to write more data than requested!
- Safe with coherent caches, not safe with incoherent caches
./w-stdiobyterev
vs. ./r-stdiobyterev
Memory mapping
mmap
system call
r-mmapbyte
, w-mmapbyte