Overview
In this lecture, we describe access costs, metrics, and benefits of caching, and continue our exploration of the default standard I/O cache.
Cache abstraction
Storage access costs
- Caches optimize access to storage
- Storage accesses are reads and writes
- Read: copy N units of data from storage
- Write: copy N units of data to storage
- Break down the cost of a read or write operation
Cost model
- A storage access involves two costs
- Per-request cost R (measured in units of time)
- Per-unit cost U (measured in units of time per unit of data)
- R and U differ from operation to operation
- Cost of an access operation accessing N units of data: C = R + NU
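A worked example with hypothetical numbers: suppose R = 5 ms per request and U = 4 ns per byte. Then reading N = 4096 bytes costs C = 5 ms + 4096 × 4 ns ≈ 5.016 ms, so nearly all of the cost is the per-request cost R.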
Question 1
- What are some examples of per-request and per-unit costs in storage access operations?
Metrics: Latency and throughput
- Latency: The time it takes to access a unit of data
- Measured in seconds
- Smaller numbers are better
- Throughput: The rate at which data can be accessed
- Measured in units per second (e.g., bytes per second)
- Larger numbers are better
Relationships between latency and throughput
- For some storage technologies, latency = 1/throughput
- True random access
- For others, these metrics can diverge
- Hard disk drive throughput: ~1–250 MiB/s
- SSD (flash drive) throughput: ~50–3000 MiB/s
High-latency, high-throughput I/O
Latency, throughput, and access costs
- C = R + NU
- When will 1/latency be close to throughput?
- When R ≪ U
- When might 1/latency and throughput diverge?
- When R ≫ U
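To see why, consider a hypothetical sequential workload. A one-unit access has latency R + U, so 1/latency = 1/(R + U), while an N-unit access achieves throughput N/(R + NU), which approaches 1/U as N grows. When R ≪ U, both values are close to 1/U; when R ≫ U, batched throughput (≈ 1/U) can far exceed 1/latency (≈ 1/R).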
Reducing access costs
- Reduce number of requests
- Avoid redundant requests
- Do the same work in fewer requests
Cache optimizations
- Batching (read or write)
- Prefetching (read)
- Write coalescing (write)
- Parallel access (read or write)
Batching
- Combine multiple requests into one request
- Example: Read 4096 bytes in 1 request instead of 4096 one-byte requests (see the sketch below)
- Reduces the total per-request cost: one R instead of 4096
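A minimal sketch of the difference, using raw system calls (the output file names are hypothetical; error checking omitted). Both programs store the same 4096 bytes, but the first pays the per-request cost R 4096 times and the second pays it once.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

int main() {
    char buf[4096] = {0};

    // Unbatched: 4096 one-byte requests, each paying R.
    int fd1 = open("out1.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    for (size_t i = 0; i != sizeof buf; ++i) {
        (void) write(fd1, &buf[i], 1);
    }
    close(fd1);

    // Batched: one request; total cost ≈ R + 4096 × U.
    int fd2 = open("out2.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    (void) write(fd2, buf, sizeof buf);
    close(fd2);
}
```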
Prefetching
- Fetch data before it is needed
- Example: Assume sequential access and read more data than user requested
- Ideally, cache reads underlying storage before user reads from cache
- Reduces number of requests
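A minimal read-ahead sketch, assuming sequential access (the ByteReader type and its members are illustrative, not a real API). Each system call fetches a full 4096-byte block, so most next() calls are served from memory without touching storage.

```cpp
#include <cstddef>
#include <unistd.h>

struct ByteReader {
    int fd;              // an open file descriptor
    char buf[4096];      // one-block read-ahead cache
    size_t pos = 0, len = 0;

    // Return the next byte of the file, or -1 at end of file.
    int next() {
        if (pos == len) {
            // Buffer exhausted: prefetch a whole block in one request,
            // even though the caller asked for only one byte.
            ssize_t n = read(fd, buf, sizeof buf);
            if (n <= 0) {
                return -1;
            }
            len = (size_t) n;
            pos = 0;
        }
        return (unsigned char) buf[pos++];
    }
};
```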
Write coalescing
- Do not write data that will be overwritten later
- Example: Assume a cache line is updated multiple times
- Think of a local variable updated in a loop, or multiple local variables in the same cache line updated in close proximity
- Ideally, cache writes underlying storage after user writes to cache
- Reduces number of requests
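A one-block write-back cache sketch (all names are illustrative). Updates modify only the in-memory copy, so repeated set() calls, including updates to the same bytes, coalesce into a single write at flush time.

```cpp
#include <cstddef>
#include <unistd.h>

struct BlockCache {
    int fd;                   // an open file descriptor
    char block[4096] = {0};   // cached copy of the file's first block
    bool dirty = false;       // has the cached copy been modified?

    // Update one byte of the cached block; no storage access yet.
    void set(size_t offset, char c) {
        block[offset % sizeof block] = c;
        dirty = true;
    }

    // Write the block back: one request covers every set() since the
    // last flush, and only the final value of each byte is written.
    void flush() {
        if (dirty) {
            (void) pwrite(fd, block, sizeof block, 0);
            dirty = false;
        }
    }
};
```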
Parallel access
- Perform accesses in parallel with other work
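A double-buffering sketch, assuming a sequential read workload (the input file name is hypothetical). A helper thread fetches the next block while the main thread processes the current one, so I/O time overlaps with computation time.

```cpp
#include <cstdio>
#include <fcntl.h>
#include <thread>
#include <unistd.h>

int main() {
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    char bufs[2][4096];
    ssize_t n = read(fd, bufs[0], sizeof bufs[0]);
    size_t total = 0;
    int cur = 0;
    while (n > 0) {
        ssize_t next_n = 0;
        // Start reading the next block while this one is processed.
        std::thread fetcher([&] {
            next_n = read(fd, bufs[1 - cur], sizeof bufs[0]);
        });
        for (ssize_t i = 0; i != n; ++i) {
            total += (unsigned char) bufs[cur][i];  // stand-in for real work
        }
        fetcher.join();
        n = next_n;
        cur = 1 - cur;
    }
    printf("sum of bytes: %zu\n", total);
    close(fd);
}
```

Spawning a thread per block is itself a per-request cost; real systems would use a persistent prefetch thread or asynchronous I/O, but the overlap idea is the same.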
Question 2
- How might these optimization strategies hurt performance?
- How might these optimizations change semantics (produce different observable results)?
Cache correctness
- Caches aim for transparency
- Result of accesses should be the same despite presence of cache
- Cached reads return same data as direct reads
- Cached writes perform same ultimate modifications as direct writes
- A fully transparent cache is called coherent
- Not every cache is coherent!
- Processor cache is coherent
- Buffer cache is coherent
- Standard I/O cache…?
coherence.cc
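A sketch in the spirit of that question (an assumption about the demo, not the course's coherence.cc): a write buffered by stdio is invisible to a direct read of the same file until the buffer is flushed, so the standard I/O cache can appear incoherent.

```cpp
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

int main() {
    FILE* f = fopen("coherence-demo.txt", "w");   // hypothetical file name
    fputs("hello", f);        // lands in the stdio buffer, not the file

    int fd = open("coherence-demo.txt", O_RDONLY);
    char buf[16];
    ssize_t n = read(fd, buf, sizeof buf);
    printf("before fflush: read %zd bytes\n", n); // likely 0

    fflush(f);                // push buffered data to the kernel
    n = pread(fd, buf, sizeof buf, 0);
    printf("after fflush: read %zd bytes\n", n);  // likely 5

    close(fd);
    fclose(f);
}
```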
Exploring performance of file access
w-osbyte vs. w-osblock
w-stdiobyte vs. w-stdioblock
Exploring performance of access patterns
r-osbyte vs. r-osbyterev
r-stdiobyte vs. r-stdiobyterev