Lecture 1 Notes
Here are some results from running the vector and list examples from class on problems of various sizes.
Problem
First, insert N random integers into a sorted data structure. Then, N times, select and remove the integer at a randomly-chosen sort position. I.e., if we select position 0, that means remove the smallest integer in the data structure.
Data structures
"List" is a singly-linked list, "vector" a contiguously-allocated vector, and "tree" a balanced binary search tree (a left-leaning red-black tree, to be precise) with additional information to allow us to select items by position. We didn't discuss the tree in class.
Both insertion and removal on the list and vector data structures are O(N)-time operations. The list has one O(N)-time operation per insertion or removal (namely, traversing to the desired position), while the vector appears to have two or three (traversing to the desired position, optionally expanding space for the vector, and then shifting items around). The tree has only O(log N)-time operations, which will matter a lot for larger N.
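To make the list's single O(N) operation concrete, here is a hedged sketch (illustrative, not the class code) of remove-at-position for a singly-linked list: one O(N) traversal to reach the position, then an O(1) unlink, with no shifting afterward.

```cpp
#include <cassert>
#include <cstddef>

struct node {
    int value;
    node* next;
};

// Remove and return the value at sort position pos from the list whose
// head pointer is *head. Uses a pointer-to-pointer cursor so position 0
// (the head) needs no special case.
int list_remove_at(node** head, size_t pos) {
    node** cursor = head;
    for (size_t i = 0; i < pos; ++i)      // the one O(N) step: traversal
        cursor = &(*cursor)->next;
    node* victim = *cursor;
    int value = victim->value;
    *cursor = victim->next;               // O(1) unlink; nothing shifts
    delete victim;
    return value;
}
```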
Code for all these is available to anyone via Git at http://code.seas.harvard.edu/cs61/cs61-lectures. The experiment.pl script produced the graph above (run it as make; perl experiment.pl | gnuplot > output.png).
Results
The lines below are the result of running each program at each size for between 6 and 10 trials; the minimum and maximum trials were discarded. The machine is a fairly recent iMac with 16 GB of RAM. Note that both axes use log scale. Smaller values are better (since the experiment took less time).
The vector beats the list for every N tested (except N=10, which looks like an anomaly), often by a wide margin. The vector is fastest overall for N≤2000; beyond that, the tree's O(log N) scaling decisively wins.

Zooming in to small values of N:

Questions
- The "vector" line above doesn't use the sum trick from class. Although the remove phase's code looks like it contains two O(N) phases (one to traverse to the position and one to shift the data), in fact the compiler optimizes the traversal phase to a single piece of arithmetic. Try downloading the code yourself, then write code so that the compiler really does access vector data during the traversal phase. What code did you have to change, and how? (Some compilers are smarter than others, so you might have to do pretty complicated stuff!) What difference did it make to performance?
- The vector's insertion phase could potentially be sped up using binary search. Try implementing this. What difference did it make to performance?
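One way to sketch the binary-search idea from the second question (a sketch, not the course's solution) uses std::lower_bound. Note that each insertion is still O(N) overall because the insert itself shifts up to N elements; only the traversal phase drops to O(log N).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Insert x into sorted vector v, finding the insertion point with binary
// search (O(log N) comparisons). The insert call still shifts elements,
// so the whole operation remains O(N).
void insert_sorted_binary(std::vector<int>& v, int x) {
    auto it = std::lower_bound(v.begin(), v.end(), x);
    v.insert(it, x);
}
```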
Credit
I read about an experiment like this in Bjarne Stroustrup’s article "Software Development for Infrastructure."