Overview
In this lecture, we introduce the memory hierarchy and discuss how files are read and written.
Full lecture notes on storage — Textbook readings
File I/O via system calls
- The C++ standard defines library functions for file access
  - C-like calls: `fopen`, `fread`, `fwrite`, `printf`, `stdin`, `stdout`, …
  - C++-like object interfaces: `std::iostream`, `std::cin`, `std::cout`, …
- These functions are implemented in terms of system calls
  - `open(filename, mode)`: Open a file; returns a file descriptor
    - `man 2 open` (system calls are defined in manual section 2)
  - `read(fd, buf, sz)`: Read up to `sz` bytes from `fd` into `buf`; return number of bytes read
  - `write(fd, buf, sz)`: Write up to `sz` bytes from `buf` into `fd`; return number of bytes written
Write data to a file one byte at a time
    size_t n = 0;
    while (n < size) {
        // fputc returns the byte written, or EOF on error
        int ch = fputc(buf[0], f);
        if (ch == EOF) {
            perror("write");
            exit(1);
        }
        ++n;
    }
- `fputc(buf[0], f)` ≈ `fwrite(buf, 1, 1, f)`, with different error return convention
  - Check the manual!
Write data to a file one byte at a time II
    size_t n = 0;
    while (n < size) {
        // write returns the number of bytes written, or -1 on error
        ssize_t r = write(fd, buf, 1);
        if (r != 1) {
            perror("write");
            exit(1);
        }
        ++n;
    }
Write data to memory one byte at a time
- ?
Question 1
- How long will it take each program to write data one byte at a time?
- Will the results differ by operating system? How much?
- Why is `membyte` faster than `osbyte`?
- Why is `stdiobyte` faster than `osbyte`?
- Why is `osbyte` faster than `syncbyte`?
  - `w-syncbyte.cc` opens the file in a mode where the kernel immediately completes every write
  - The `write` system call does not return until the drive’s data is modified
Storage hierarchy
- Also called “memory hierarchy”
Expense of storage
- Expense is a major factor in the construction of computer systems
- Faster technologies are often more expensive
- Older, slower technologies are often cheaper
- The relative and absolute expenses of storage technologies may be the single thing about computer systems that has changed the most dramatically over time
Question 2
- How much more did a megabyte of memory cost in 1955 relative to now?
- How much more did a megabyte of hard disk storage cost in 1955 relative to now?
Historical costs of storage
Absolute costs ($/MB):
Year | Memory (DRAM) | Flash/SSD | Hard disk |
---|---|---|---|
~1955 | $411,000,000 | | $6,230 |
1970 | $734,000 | | $260.00 |
1990 | $148.20 | | $5.45 |
2003 | $0.09 | $0.305 | $0.00132 |
2010 | $0.019 | $0.00244 | $0.000073 |
2018 | $0.0059 | $0.00015 | $0.000020 |
Relative costs (relative to hard disk storage in 2018):
Year | Memory | Flash/SSD | Hard disk |
---|---|---|---|
~1955 | 20,500,000,000,000 | | 312,000,000 |
1970 | 36,700,000,000 | | 13,000,000 |
1990 | 7,400,000 | | 270,000 |
2003 | 4,100 | 15,200 | 66 |
2010 | 950 | 122 | 3.6 |
2018 | 295 | 7.5 | 1 |
(Processor speed has also increased a lot—from 0.002 MIPS in 1951 to 750,000 MIPS now—but this is “only” 375,000,000x.)
Caching
- Cache: a small amount of fast storage used to speed up access to slower storage
- Computer systems are rife with caches
- One of the most important patterns you learn in this class
- Registers as a cache for primary memory
- Memory as a cache for flash/hard disk
- Life is rife with caches
Investigating the standard I/O cache
strace
- Powerful Linux program that snoops on another program’s system calls
- Prints a human-readable summary of those system calls to a file
- `strace -o strace.out PROGRAM ARGUMENTS…`
  - `-f`: follow forks (subprocesses)
  - `-s N`: Print more bytes of string arguments/return values
  - `man strace` for more