Lecture 18 Thoughts
This large table shows results for the CS61 I/O benchmarks. Most reported
numbers are medians of five trials, and most experiments read or wrote
10MB (10 << 20 bytes) of data. Two significant digits are given. I ran
the experiments on a recent (2011) iMac and an oldish (2007) ThinkPad
T61. Both had hard disks.
Google spreadsheet with this data (you can download, copy, make graphs, etc.)
Test | Description | Rate (byte/sec, iMac) | Rate (byte/sec, ThinkPad) |
---|---|---|---|
w01-syncbyte | block size 1, sequential, syscalls, O_SYNC | 2,600 | 27 |
w02-syncblock | block size 512, sequential, syscalls, O_SYNC | 750,000 | 14,000 |
w02-syncblock4096 | block size 4096, sequential, syscalls, O_SYNC | 16,000,000 | 110,000 |
w03-byte | block size 1, sequential, syscalls (buffer cache) | 900,000 | 480,000 |
w04-block | block size 512, sequential, syscalls | 320,000,000 | 160,000,000 |
w04-block4096 | block size 4096, sequential, syscalls | 980,000,000 | 450,000,000 |
w05-stdiobyte | block size 1, sequential, stdio | 53,000,000 | 31,000,000 |
w06-stdioblock | block size 512, sequential, stdio | 790,000,000 | 420,000,000 |
w06-stdioblock4096 | block size 4096, sequential, stdio | 970,000,000 | 450,000,000 |
r01-byte | block size 1, sequential, syscalls | 2,200,000 | 1,400,000 |
r02-block | block size 512, sequential, syscalls | 980,000,000 | 410,000,000 |
r02-block4096 | block size 4096, sequential, syscalls | 4,000,000,000 | 640,000,000 |
r04-stdiobyte | block size 1, sequential, stdio | 180,000,000 | 70,000,000 |
r05-stdioblock | block size 512, sequential, stdio | 3,100,000,000 | 940,000,000 |
r05-stdioblock4096 | block size 4096, sequential, stdio | 3,400,000,000 | 1,400,000,000 |
r06-stridebyte | block size 1, stride 2^20, syscalls | 1,400,000 | 940,000 |
r07-strideblock | block size 512, stride 2^20, syscalls | 640,000,000 | 330,000,000 |
r07-strideblock4096 | block size 4096, stride 2^20, syscalls | 3,100,000,000 | 560,000,000 |
r08-stridestdiobyte | block size 1, stride 2^20, stdio | 900,000 | 610,000 |
r09-stridestdioblock | block size 512, stride 2^20, stdio | 410,000,000 | 230,000,000 |
r09-stridestdioblock4096 | block size 4096, stride 2^20, stdio | 2,600,000,000 | 800,000,000 |
r10-stridestdiomulti | block size 1, stride 2^20, stdio, 10 files (one per stride) | 63,000,000 | 20,000,000 |
r11-mmapbyte | block size 1, sequential, memory-mapped I/O | 620,000,000 | 310,000,000 |
r12-mmapblock | block size 512, sequential, memory-mapped I/O | 2,200,000,000 | 840,000,000 |
r12-mmapblock4096 | block size 4096, sequential, memory-mapped I/O | 2,300,000,000 | 620,000,000 |
r13-mmapstridebyte | block size 1, stride 2^20, memory-mapped I/O | 160,000,000 | 200,000,000 |
Single-byte block size
In class, we focused on selected comparisons. For instance, here are all the results for byte-at-a-time I/O, in three groups (writes, sequential reads, and strided reads).
Test | Description | Rate (byte/sec, iMac) | Rate (byte/sec, ThinkPad) |
---|---|---|---|
w01-syncbyte | block size 1, sequential, syscalls, O_SYNC | 2,600 | 27 |
w03-byte | block size 1, sequential, syscalls (buffer cache) | 900,000 | 480,000 |
w05-stdiobyte | block size 1, sequential, stdio | 53,000,000 | 31,000,000 |
r01-byte | block size 1, sequential, syscalls | 2,200,000 | 1,400,000 |
r04-stdiobyte | block size 1, sequential, stdio | 180,000,000 | 70,000,000 |
r11-mmapbyte | block size 1, sequential, memory-mapped I/O | 620,000,000 | 310,000,000 |
r06-stridebyte | block size 1, stride 2^20, syscalls | 1,400,000 | 940,000 |
r08-stridestdiobyte | block size 1, stride 2^20, stdio | 900,000 | 610,000 |
r10-stridestdiomulti | block size 1, stride 2^20, stdio, 10 files (one per stride) | 63,000,000 | 20,000,000 |
r13-mmapstridebyte | block size 1, stride 2^20, memory-mapped I/O | 160,000,000 | 200,000,000 |
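To make the syscalls-versus-stdio rows concrete, here is a minimal sketch of byte-at-a-time writing done both ways. It is not the benchmark source: the function names `write_bytes_syscall` and `write_bytes_stdio`, the byte value written, and the file mode are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: one write() system call per byte. If use_sync is
// nonzero, the file is opened with O_SYNC, which asks the OS not to return
// from write() until the data has been transferred to the storage device.
// Returns the number of bytes written, or -1 on error.
long write_bytes_syscall(const char* path, long n, int use_sync) {
    assert(n >= 0);
    int flags = O_WRONLY | O_CREAT | O_TRUNC | (use_sync ? O_SYNC : 0);
    int fd = open(path, flags, 0666);
    if (fd < 0) {
        return -1;
    }
    long done = 0;
    while (done < n && write(fd, "6", 1) == 1) {
        ++done;
    }
    if (close(fd) != 0) {
        return -1;
    }
    return done;
}

// Hypothetical helper: one fputc() per byte. stdio collects bytes in a
// user-level buffer and issues far fewer write() system calls.
// Returns the number of bytes written, or -1 on error.
long write_bytes_stdio(const char* path, long n) {
    assert(n >= 0);
    FILE* f = fopen(path, "w");
    if (!f) {
        return -1;
    }
    long done = 0;
    while (done < n && fputc('6', f) != EOF) {
        ++done;
    }
    if (fclose(f) != 0) {
        return -1;
    }
    return done;
}
```

Timing each function over the same byte count should show the same ordering as the write rows in the table, though absolute rates depend on the machine and its disk.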
4096-byte blocks
Test | Description | Rate (byte/sec, iMac) | Rate (byte/sec, ThinkPad) |
---|---|---|---|
w02-syncblock4096 | block size 4096, sequential, syscalls, O_SYNC | 16,000,000 | 110,000 |
w04-block4096 | block size 4096, sequential, syscalls | 980,000,000 | 450,000,000 |
w06-stdioblock4096 | block size 4096, sequential, stdio | 970,000,000 | 450,000,000 |
r02-block4096 | block size 4096, sequential, syscalls | 4,000,000,000 | 640,000,000 |
r05-stdioblock4096 | block size 4096, sequential, stdio | 3,400,000,000 | 1,400,000,000 |
r12-mmapblock4096 | block size 4096, sequential, memory-mapped I/O | 2,300,000,000 | 620,000,000 |
r07-strideblock4096 | block size 4096, stride 2^20, syscalls | 3,100,000,000 | 560,000,000 |
r09-stridestdioblock4096 | block size 4096, stride 2^20, stdio | 2,600,000,000 | 800,000,000 |
Running your own tests
For running experiments it’s convenient to have programs that take
options. We’ve provided these too, in the cs61-lectures/l18
directory: the writer, reader, and memreader programs. Their
options are:
-n SIZE
Number of bytes to read or write (defaults to 10 << 20).
-b BLOCKSIZE
Block size in bytes (defaults to 1, i.e., byte-at-a-time I/O).
-s STRIDE
Stride in bytes. Defaults to 0, i.e., sequential I/O. If you set -s equal to -b, this has the same effect as sequential I/O, but makes the same number of fseek or lseek calls as a strided access pattern. You can use this to distinguish the cost of the additional system calls from the cost of the strided access pattern.
-S USESTDIO
1 to use stdio, 0 to use syscalls (the default). (writer and reader only)
-y USESYNC
1 to open the file with O_SYNC. 0 is the default. (writer only)
-p PRINTFREQUENCY
Print a progress message every PRINTFREQUENCY bytes. Defaults to 0, which means don’t print a message until the test completes.
FILENAME
File to read or write (defaults to data).
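The interaction between block size and stride can be sketched in a few lines of C. This is not the reader program itself: the function name `strided_read_file`, the 4096-byte buffer limit, and the exact visiting order (offsets 0, stride, 2*stride, ..., then block, block + stride, ...) are assumptions. The sketch also does not model a stride of 0; it requires the stride to be a multiple of the block size.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: reads the first `size` bytes of `path` in
// `block`-byte units, calling lseek() before every read(). Assumed order:
// offsets 0, stride, 2*stride, ..., then block, block+stride, ...
// Returns the number of bytes read (less than `size` if end of file is
// reached early), or -1 on error.
long strided_read_file(const char* path, size_t size, size_t block,
                       size_t stride) {
    char buf[4096];
    assert(block > 0 && block <= sizeof(buf));
    assert(stride >= block && stride % block == 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    long total = 0;
    for (size_t start = 0; start < stride && start < size; start += block) {
        for (size_t pos = start; pos < size; pos += stride) {
            size_t want = size - pos < block ? size - pos : block;
            if (lseek(fd, (off_t) pos, SEEK_SET) == (off_t) -1) {
                close(fd);
                return -1;
            }
            ssize_t r = read(fd, buf, want);
            if (r < 0) {
                close(fd);
                return -1;
            }
            total += r;
            if ((size_t) r < want) {  // hit end of file; stop early
                close(fd);
                return total;
            }
        }
    }
    close(fd);
    return total;
}
```

When the stride equals the block size, the outer loop runs once and the offsets are 0, block, 2*block, and so on: the data is read in sequential order, but every read is still preceded by an lseek call, which is the comparison the -s option description suggests.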
Here are the l18 command lines corresponding to l17 programs. The l18 programs are more complicated and can thus run slower than the l17 programs. (This is particularly clear for byte-at-a-time memory-mapped I/O.)
l17 program | l18 command line |
---|---|
w01-syncbyte | ./writer -y 1 |
w02-syncblock | ./writer -b 512 -y 1 |
w02-syncblock4096 | ./writer -b 4096 -y 1 |
w03-byte | ./writer |
w04-block | ./writer -b 512 |
w04-block4096 | ./writer -b 4096 |
w05-stdiobyte | ./writer -S 1 |
w06-stdioblock | ./writer -S 1 -b 512 |
w06-stdioblock4096 | ./writer -S 1 -b 4096 |
r01-byte | ./reader |
r02-block | ./reader -b 512 |
r02-block4096 | ./reader -b 4096 |
r04-stdiobyte | ./reader -S 1 |
r05-stdioblock | ./reader -S 1 -b 512 |
r05-stdioblock4096 | ./reader -S 1 -b 4096 |
r06-stridebyte | ./reader -s 1048576 |
r07-strideblock | ./reader -s 1048576 -b 512 |
r07-strideblock4096 | ./reader -s 1048576 -b 4096 |
r08-stridestdiobyte | ./reader -S 1 -s 1048576 |
r09-stridestdioblock | ./reader -S 1 -s 1048576 -b 512 |
r09-stridestdioblock4096 | ./reader -S 1 -s 1048576 -b 4096 |
r10-stridestdiomulti | N/A |
r11-mmapbyte | ./memreader |
r12-mmapblock | ./memreader -b 512 |
r12-mmapblock4096 | ./memreader -b 4096 |
r13-mmapstridebyte | ./memreader -s 1048576 |
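For readers unfamiliar with memory-mapped I/O, the following is a minimal sketch of reading a file one byte at a time through a mapping. It is not the memreader source: the function name `mmap_sum_bytes` and the choice to sum the bytes (so the loop has an observable result) are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: maps `path` read-only and sums its bytes one at a
// time. After mmap() succeeds, each byte access is an ordinary memory load;
// no read() system call is made per byte.
// Returns 0 on success, -1 on error (including an empty file).
int mmap_sum_bytes(const char* path, unsigned long* sum_out) {
    assert(sum_out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t size = (size_t) st.st_size;
    unsigned char* data = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (data == MAP_FAILED) {
        return -1;
    }
    unsigned long sum = 0;
    for (size_t i = 0; i < size; ++i) {
        sum += data[i];
    }
    munmap(data, size);
    *sum_out = sum;
    return 0;
}
```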