2012/Lecture18

Computer Science 61 and E61
Systems Programming and Machine Organization
This is the 2012 version of the course.

Lecture 18 Thoughts

This large table shows results for the CS61 I/O benchmarks. Most reported numbers are medians of five trials, and most experiments read or wrote 10 MB (10 << 20 bytes) of data. Rates are given to two significant digits. I ran the experiments on a recent (2011) iMac and an oldish (2007) ThinkPad T61. Both had hard disks.
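
To make the reporting concrete, here is a minimal sketch of how a rate such as those below could be derived from five timed trials. The function names (median_seconds, rate_bytes_per_sec) are hypothetical and are not taken from the course code. It assumes each trial is timed in seconds and that the median is taken over the times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: orders doubles ascending. */
static int compare_doubles(const void *a, const void *b) {
    double x = *(const double *) a, y = *(const double *) b;
    return (x > y) - (x < y);
}

/* Median of n trial times in seconds. Sorts the array in place.
   (Hypothetical helper, not part of the l17/l18 programs.) */
double median_seconds(double *trials, size_t n) {
    assert(n > 0);
    qsort(trials, n, sizeof(double), compare_doubles);
    if (n % 2 == 1)
        return trials[n / 2];
    return (trials[n / 2 - 1] + trials[n / 2]) / 2.0;
}

/* Transfer rate in bytes per second for nbytes moved in `seconds`. */
double rate_bytes_per_sec(size_t nbytes, double seconds) {
    assert(seconds > 0.0);
    return (double) nbytes / seconds;
}
```

For example, rate_bytes_per_sec((size_t) 10 << 20, median_seconds(times, 5)) gives the rate for a 10 MB experiment. With an odd number of trials, the median of the rates equals the rate computed from the median time.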

Google spreadsheet with this data (you can download, copy, make graphs, etc.)

Test | Description | Rate (bytes/sec, iMac) | Rate (bytes/sec, ThinkPad)
w01-syncbyte | block size 1, sequential, syscalls, O_SYNC | 2,600 | 27
w02-syncblock | block size 512, sequential, syscalls, O_SYNC | 750,000 | 14,000
w02-syncblock4096 | block size 4096, sequential, syscalls, O_SYNC | 16,000,000 | 110,000
w03-byte | block size 1, sequential, syscalls (buffer cache) | 900,000 | 480,000
w04-block | block size 512, sequential, syscalls | 320,000,000 | 160,000,000
w04-block4096 | block size 4096, sequential, syscalls | 980,000,000 | 450,000,000
w05-stdiobyte | block size 1, sequential, stdio | 53,000,000 | 31,000,000
w06-stdioblock | block size 512, sequential, stdio | 790,000,000 | 420,000,000
w06-stdioblock4096 | block size 4096, sequential, stdio | 970,000,000 | 450,000,000
r01-byte | block size 1, sequential, syscalls | 2,200,000 | 1,400,000
r02-block | block size 512, sequential, syscalls | 980,000,000 | 410,000,000
r02-block4096 | block size 4096, sequential, syscalls | 4,000,000,000 | 640,000,000
r04-stdiobyte | block size 1, sequential, stdio | 180,000,000 | 70,000,000
r05-stdioblock | block size 512, sequential, stdio | 3,100,000,000 | 940,000,000
r05-stdioblock4096 | block size 4096, sequential, stdio | 3,400,000,000 | 1,400,000,000
r06-stridebyte | block size 1, stride 2^20, syscalls | 1,400,000 | 940,000
r07-strideblock | block size 512, stride 2^20, syscalls | 640,000,000 | 330,000,000
r07-strideblock4096 | block size 4096, stride 2^20, syscalls | 3,100,000,000 | 560,000,000
r08-stridestdiobyte | block size 1, stride 2^20, stdio | 900,000 | 610,000
r09-stridestdioblock | block size 512, stride 2^20, stdio | 410,000,000 | 230,000,000
r09-stridestdioblock4096 | block size 4096, stride 2^20, stdio | 2,600,000,000 | 800,000,000
r10-stridestdiomulti | block size 1, stride 2^20, stdio, 10 files (one per stride) | 63,000,000 | 20,000,000
r11-mmapbyte | block size 1, sequential, memory-mapped I/O | 620,000,000 | 310,000,000
r12-mmapblock | block size 512, sequential, memory-mapped I/O | 2,200,000,000 | 840,000,000
r12-mmapblock4096 | block size 4096, sequential, memory-mapped I/O | 2,300,000,000 | 620,000,000
r13-mmapstridebyte | block size 1, stride 2^20, memory-mapped I/O | 160,000,000 | 200,000,000

Single-byte block size

In class, we focused on selected comparisons. For instance, here are all the results for byte-at-a-time I/O, in three groups (writes, sequential reads, and strided reads).

Test | Description | Rate (bytes/sec, iMac) | Rate (bytes/sec, ThinkPad)
w01-syncbyte | block size 1, sequential, syscalls, O_SYNC | 2,600 | 27
w03-byte | block size 1, sequential, syscalls (buffer cache) | 900,000 | 480,000
w05-stdiobyte | block size 1, sequential, stdio | 53,000,000 | 31,000,000
r01-byte | block size 1, sequential, syscalls | 2,200,000 | 1,400,000
r04-stdiobyte | block size 1, sequential, stdio | 180,000,000 | 70,000,000
r11-mmapbyte | block size 1, sequential, memory-mapped I/O | 620,000,000 | 310,000,000
r06-stridebyte | block size 1, stride 2^20, syscalls | 1,400,000 | 940,000
r08-stridestdiobyte | block size 1, stride 2^20, stdio | 900,000 | 610,000
r10-stridestdiomulti | block size 1, stride 2^20, stdio, 10 files (one per stride) | 63,000,000 | 20,000,000
r13-mmapstridebyte | block size 1, stride 2^20, memory-mapped I/O | 160,000,000 | 200,000,000
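
The three write variants in this group differ only in how each byte reaches the file. The following is a minimal sketch of that difference, not the actual w01/w03/w05 source: the function names and the byte value written are assumptions made for illustration. The system call version issues one write(2) per byte (optionally with O_SYNC), while the stdio version calls fputc, which collects bytes in a user-level buffer and makes far fewer system calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write n bytes one at a time with write(2): one system call per byte.
   If use_sync is nonzero, the file is opened with O_SYNC, so each write
   also waits for the data to reach the storage device.
   Returns the number of bytes written, or -1 if the file cannot be opened. */
long write_bytes_syscall(const char *path, size_t n, int use_sync) {
    assert(path != NULL);
    int flags = O_WRONLY | O_CREAT | O_TRUNC | (use_sync ? O_SYNC : 0);
    int fd = open(path, flags, 0666);
    if (fd < 0)
        return -1;
    size_t done = 0;
    while (done < n && write(fd, "6", 1) == 1)
        ++done;
    close(fd);
    return (long) done;
}

/* Write n bytes one at a time with stdio: fputc appends to a buffer in
   the process, and the library flushes it with occasional larger writes.
   Returns the number of bytes written, or -1 on open or flush failure. */
long write_bytes_stdio(const char *path, size_t n) {
    assert(path != NULL);
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    size_t done = 0;
    while (done < n && fputc('6', f) != EOF)
        ++done;
    if (fclose(f) != 0)
        return -1;
    return (long) done;
}

/* Size of the file at path in bytes, or -1 on error. */
long file_size(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long) st.st_size;
}
```

Timing write_bytes_syscall(path, n, 1), write_bytes_syscall(path, n, 0), and write_bytes_stdio(path, n) for the same n should show the same ordering as the w01, w03, and w05 rows, though absolute rates depend on the machine and disk.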

4096-byte blocks

Test | Description | Rate (bytes/sec, iMac) | Rate (bytes/sec, ThinkPad)
w02-syncblock4096 | block size 4096, sequential, syscalls, O_SYNC | 16,000,000 | 110,000
w04-block4096 | block size 4096, sequential, syscalls | 980,000,000 | 450,000,000
w06-stdioblock4096 | block size 4096, sequential, stdio | 970,000,000 | 450,000,000
r02-block4096 | block size 4096, sequential, syscalls | 4,000,000,000 | 640,000,000
r05-stdioblock4096 | block size 4096, sequential, stdio | 3,400,000,000 | 1,400,000,000
r12-mmapblock4096 | block size 4096, sequential, memory-mapped I/O | 2,300,000,000 | 620,000,000
r07-strideblock4096 | block size 4096, stride 2^20, syscalls | 3,100,000,000 | 560,000,000
r09-stridestdioblock4096 | block size 4096, stride 2^20, stdio | 2,600,000,000 | 800,000,000

Running your own tests

For running experiments it’s convenient to have programs that take options. We’ve provided these too, in the cs61-lectures/l18 directory: the writer, reader, and memreader programs. Their options are:

-n SIZE
Number of bytes to read or write (defaults to 10 << 20).
-b BLOCKSIZE
Block size in bytes (defaults to 1, i.e., byte-at-a-time I/O).
-s STRIDE
Stride in bytes. Defaults to 0, i.e., sequential I/O. If you set -s equal to -b, the access pattern is the same as sequential I/O, but the program makes the same number of fseek or lseek calls as it would for a strided access pattern. You can use this to distinguish the cost of the additional system calls from the cost of the strided access pattern itself.
-S USESTDIO
1 to use stdio, 0 to use syscalls (the default). (writer and reader only)
-y USESYNC
1 to open the file with O_SYNC. 0 is the default. (writer only)
-p PRINTFREQUENCY
Print a progress message every PRINTFREQUENCY bytes. Defaults to 0, which means don’t print a message until the test completes.
FILENAME
File to read or write (defaults to data).
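
To illustrate the -s option, here is a minimal sketch of a strided read loop built on lseek and read. It is not the reader program's source: the struct, the function name, and in particular the rule for wrapping back to the start of the file are assumptions made for illustration. Every block is preceded by an lseek, so when the stride equals the block size the offsets are sequential but the number of seeks is unchanged.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Result of a strided read (hypothetical type, for illustration only). */
struct stride_result {
    size_t bytes;            /* bytes actually read */
    size_t seeks;            /* number of lseek calls made */
    unsigned long checksum;  /* sum of all bytes read */
};

/* Read `total` bytes from fd in blocks of `block` bytes, seeking before
   every block and advancing the offset by `stride` each time. On reaching
   the end of the file, wrap to the next unread column of blocks.
   ASSUMPTION: this wrap rule visits every byte exactly once only when
   file_size is a multiple of stride and stride is a multiple of block. */
struct stride_result strided_read(int fd, size_t file_size, size_t total,
                                  size_t block, size_t stride) {
    assert(block > 0 && block <= 4096 && stride > 0);
    struct stride_result res = {0, 0, 0};
    unsigned char buf[4096];
    size_t pos = 0;
    while (res.bytes < total) {
        if (lseek(fd, (off_t) pos, SEEK_SET) == (off_t) -1)
            break;
        ++res.seeks;
        ssize_t r = read(fd, buf, block);
        if (r <= 0)
            break;
        for (ssize_t i = 0; i < r; ++i)
            res.checksum += buf[i];
        res.bytes += (size_t) r;
        pos += stride;
        if (pos >= file_size)
            pos = pos % stride + block;  /* wrap to the next column */
    }
    return res;
}
```

Calling strided_read with stride equal to block and again with a large stride, using the same block size, performs the same number of lseek and read calls in both runs, so any difference in time comes from the access pattern rather than from the extra system calls.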

Here are the l18 command lines corresponding to the l17 programs. The l18 programs are more complicated and can thus run slower than the l17 programs. (This is particularly clear for byte-at-a-time memory-mapped I/O.)

l17 program | l18 command line
w01-syncbyte | ./writer -y 1
w02-syncblock | ./writer -b 512 -y 1
w02-syncblock4096 | ./writer -b 4096 -y 1
w03-byte | ./writer
w04-block | ./writer -b 512
w04-block4096 | ./writer -b 4096
w05-stdiobyte | ./writer -S 1
w06-stdioblock | ./writer -S 1 -b 512
w06-stdioblock4096 | ./writer -S 1 -b 4096
r01-byte | ./reader
r02-block | ./reader -b 512
r02-block4096 | ./reader -b 4096
r04-stdiobyte | ./reader -S 1
r05-stdioblock | ./reader -S 1 -b 512
r05-stdioblock4096 | ./reader -S 1 -b 4096
r06-stridebyte | ./reader -s 1048576
r07-strideblock | ./reader -s 1048576 -b 512
r07-strideblock4096 | ./reader -s 1048576 -b 4096
r08-stridestdiobyte | ./reader -S 1 -s 1048576
r09-stridestdioblock | ./reader -S 1 -s 1048576 -b 512
r09-stridestdioblock4096 | ./reader -S 1 -s 1048576 -b 4096
r10-stridestdiomulti | N/A
r11-mmapbyte | ./memreader
r12-mmapblock | ./memreader -b 512
r12-mmapblock4096 | ./memreader -b 4096
r13-mmapstridebyte | ./memreader -s 1048576