Storage 3: The stdio cache
Cache Organization
A cache is a temporary store that improves application performance by holding the data most likely to be accessed in the near future. A cache moves data from slow storage to faster storage.
- Hit: the desired data is in the cache
- Miss: the desired data is not in the cache
- Hit rate: # hits / # accesses
Locality of reference
The goal of different caching schemes is to maximize the hit rate. The central idea in caching is determining what pieces of data should be in the cache at what time. A useful concept is access pattern locality.
- Temporal locality: accesses are likely repeated closely in time
- Spatial locality: accesses are likely to have close addresses
A reference string with high temporal locality might be 1, 1, 1, 1, 1
because items that you just accessed are accessed again in the near
future. A reference string with high spatial locality might be 1, 2, 3, 4, 5
because items you just accessed are close to items you will
access in the future (the numbers are consecutive).
Cache Associativity
A cache is structured in terms of blocks and slots. A block is the unit of data transfer, measured in bytes. In the stdio cache and the buffer cache, the block size is 4KiB (4096 bytes). A slot is a space in the cache that can hold exactly one block. You can think of a block as a book and a slot as a spot on a bookshelf.
- Fully associative: Any block can use any slot in the cache. Any book can be shelved anywhere on the bookshelf.
- Direct mapped: Each block can only use one slot. Each book has an assigned spot on the bookshelf but multiple books may be assigned to the same spot.
- Single slot: A single-slot cache is always direct mapped, since every block maps to the one slot.
- Set associative: The cache is divided into sets of N slots each. The block-to-set mapping is direct mapped, but within its set a block can use any slot (each set is fully associative).
The direct mapped and fully associative caches are extreme versions of the set associative cache: a direct mapped cache is a set associative cache with N = 1, and a fully associative cache is a set associative cache where N equals the total number of slots in the cache.
Stdio Cache
The stdio cache is managed by the application. It speeds up access to the buffer cache, which is managed by the operating system. The stdio cache has a block size of 4KiB. We can immediately see the benefit of the cache in the following example. We would like to execute these instructions:
seek(4095)
read(1)
seek(4094)
read(1)
seek(4093)
read(1)
With the help of the stdio cache, the number of calls to read() can be drastically reduced, because the bytes requested by the following seeks are already in the cache.
seek(0)
read(4096) // bytes [0, 4096) are in the cache
seek(4094)
seek(4093)
In an unaligned cache we see the following behavior.
seek(4095)
read(4096) // bytes [4095, 8191) are in the cache
seek(4094)
read(4096) // bytes [4094, 8190) are in the cache
seek(4093)
read(4096) // bytes [4093, 8189) are in the cache
Alignment is very important for cache performance. An aligned cache groups data into 4KiB blocks that start at multiples of the block size, which exploits spatial locality: nearby bytes land in the same block, so one block read serves many accesses.
Access Patterns
A machine will typically assume a sequential access pattern. A common
cache technique is to automatically prefetch data that is sequentially
downstream of a recent data access, anticipating access to downstream
data in the near future. However, there are cases where a sequential
access pattern does not make sense. In these cases, the computer may be
reading data that the user never accesses. The posix_fadvise
function allows you to tune cache behavior to perform better on
nonsequential access patterns.
int posix_fadvise(int fd, off_t offset, off_t len, int advice)
The posix_fadvise function lets a program declare its intended access pattern. The kernel uses this information to optimize caching for that file. The last argument is a flag specifying how the program expects to access the file's data.
POSIX_FADV_NORMAL // no advice
POSIX_FADV_SEQUENTIAL
POSIX_FADV_RANDOM
POSIX_FADV_NOREUSE // the data will only be accessed once
POSIX_FADV_WILLNEED // the data will be accessed in the near future, read specified data into cache
POSIX_FADV_DONTNEED // the data will not be accessed in the near future; free cache pages holding this data
Cache Coherence
A cache is coherent if the data inside the cache reflects the most recent updates to the underlying storage. A cache can become incoherent because data in the cache is not immediately flushed to the backing store, or because the backing store changes behind the cache's back. The stdio cache is not coherent: if you read from a block while simultaneously overwriting the contents of that block, subsequent calls to read will return the stale cached data even though the data in the file has changed.
- Write back cache: Writes in cache are not immediately sent to backing storage. The cache may contain the most recent version of the data.
- Write through cache: Writes are propagated to both the cache and the underlying storage. All data is safely stored on persistent storage.
References
- Your friend strace
- We often run strace as strace -o FILE COMMAND...
- Julia Evans's strace zine!
- File descriptor notes