Lecture 1 Notes: Introduction
Course Information
CS61: Computer Systems & Machine Organization
Website: cs61.seas.harvard.edu
Goals of the course
Ultimately, this course aims to show students how to program while focusing on the following traits:
- Time (speed)
- Memory
- Performance
Programming Problem
Problem
You are given an input stream of random numbers and an input stream of positions. The random numbers are placed into a sorted array. For each position i, output the number at index i of the sorted array (0-based, ascending order), remove that number from the array, and repeat with the next position.
- Inputs
- Input stream of random numbers, placed into a sorted array.
- Positions.
- Output
- For each position i, the number at index i of the sorted array.
Example
- Inputs
- stream into array
- 5 9 0 4 -----------> sortedArray = [0, 4, 5, 9]
- positions
- 3 1 0 0
- Outputs
- remove sortedArray[3] = 9; sortedArray = [0, 4, 5]
- remove sortedArray[1] = 4; sortedArray = [0, 5]
- remove sortedArray[0] = 0; sortedArray = [5]
- remove sortedArray[0] = 5; sortedArray = []
The final output is 9 4 0 5.
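The example above can be traced with a small C sketch. The function names (sorted_insert, remove_at, run_problem), the int element type, and the caller-provided scratch buffer are assumptions made for illustration; they are not the course's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert value into the sorted prefix a[0..n-1] and return the new length.
   The caller must provide room for n + 1 elements. */
size_t sorted_insert(int *a, size_t n, int value) {
    size_t pos = 0;
    while (pos < n && a[pos] < value) {   /* find the first entry >= value */
        ++pos;
    }
    /* shift entries pos..n-1 one slot to the right */
    memmove(&a[pos + 1], &a[pos], (n - pos) * sizeof(int));
    a[pos] = value;
    return n + 1;
}

/* Remove and return a[i]; later entries shift left by one.
   The caller then treats the array as n - 1 elements long. */
int remove_at(int *a, size_t n, size_t i) {
    assert(i < n);
    int value = a[i];
    memmove(&a[i], &a[i + 1], (n - i - 1) * sizeof(int));
    return value;
}

/* Build the sorted array from nums[0..n-1] in scratch, then apply
   positions[0..n-1], writing each removed value to out. */
void run_problem(const int *nums, const size_t *positions, size_t n,
                 int *scratch, int *out) {
    size_t len = 0;
    for (size_t k = 0; k < n; ++k) {
        len = sorted_insert(scratch, len, nums[k]);
    }
    for (size_t k = 0; k < n; ++k) {
        out[k] = remove_at(scratch, len, positions[k]);
        --len;
    }
}
```

Calling run_problem with the numbers 5 9 0 4 and the positions 3 1 0 0 fills out with 9 4 0 5, matching the trace above.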
What do we need?
Array
[]
[5]
[5, 9]
[0, 5, 9]
[0, 4, 5, 9]
Operations:
- expand size
- cost: O(n) where n is the length of the array
- shift entries
- cost: O(n)
Final size: N. Each time we run out of room, we expand by doubling the capacity of the array:
0 -> 1 -> 2 -> 4 -> 8 -> ...
How much work did we do to add N items?
O(1) + O(2) + O(4) + O(8) + ... + O(2 ^ floor(log2(N - 1))) = O(N), because the terms form a geometric series whose sum is less than 2N.
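To check the sum numerically, here is a minimal sketch that counts the element copies a doubling array performs while N items are appended. The function name copies_for_appends is hypothetical, and the sketch only counts copies; it does not allocate anything.

```c
#include <assert.h>
#include <stddef.h>

/* Count the element copies made while n items are appended one at a time.
   Capacity grows 0 -> 1 -> 2 -> 4 -> 8 -> ..., and each expansion copies
   every element stored at that moment. */
size_t copies_for_appends(size_t n) {
    size_t size = 0, capacity = 0, copies = 0;
    for (size_t k = 0; k < n; ++k) {
        if (size == capacity) {
            copies += size;                          /* move the old contents */
            capacity = capacity ? 2 * capacity : 1;  /* double (or start at 1) */
        }
        ++size;
        assert(size <= capacity);                    /* invariant after an append */
    }
    return copies;
}
```

For N = 1000 the expansions copy 1 + 2 + 4 + ... + 512 = 1023 elements, which is below 2N, so the total work to add N items stays O(N).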
List
head -> NULL (empty list)
head -> 5 -> NULL
head -> 5 -> 9 -> NULL
head -> 0 -> 5 -> 9 -> NULL
head -> 0 -> 4 -> 5 -> 9 -> NULL
Add to the list in order:
- Adjust pointers to insert into a list
- Cost: O(n) work to traverse list (needed for accessing a position within a list, and insertion)
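A minimal sketch of the list operations, assuming a singly linked list of int values. The names node, list_add, and list_remove_at are hypothetical and may differ from the code shown in lecture. Both functions walk the list one pointer at a time, which is where the O(n) traversal cost comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Insert value into an ascending list.
   Returns 0 on success, -1 if allocation fails. */
int list_add(node **head, int value) {
    node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->value = value;
    node **link = head;
    /* stop as soon as we find the proper place for the insertion */
    while (*link != NULL && (*link)->value < value) {
        link = &(*link)->next;
    }
    n->next = *link;   /* adjust pointers to splice the node in */
    *link = n;
    return 0;
}

/* Remove the node at 0-based position pos and return its value. */
int list_remove_at(node **head, size_t pos) {
    node **link = head;
    for (size_t k = 0; k < pos; ++k) {   /* one memory access per step */
        assert(*link != NULL);
        link = &(*link)->next;
    }
    assert(*link != NULL);
    node *victim = *link;
    int value = victim->value;
    *link = victim->next;
    free(victim);
    return value;
}
```

Adding 5, 9, 0, 4 and then removing positions 3, 1, 0, 0 returns 9, 4, 0, 5 and leaves the list empty.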
Traversal and its performance
In the add function, look at the while loop’s condition. The program stops as soon as we find the proper place to perform the insertion.
Performance of traversal using an array (vector):
- 50,000: 1 sec
- 100,000: 5 sec
Performance of traversal using a list:
- 50,000: 9 sec
- 100,000: 41 sec
Why is the array (vector) implementation so much faster than the linked list version?
Vector removal code also uses a traversal like the linked list implementation, but the compiler may remove the traversal loop, since it’s smart enough to recognize that the loop is useless with an array. How can I tell if the compiler performed the optimization?
The best way to figure this out is to disassemble the binary that the compiler produced and look at the machine instructions. If the machine code still contains the instructions for the while loop, then the compiler did not perform the optimization.
Running objdump -d on the compiled binary prints its contents as disassembled machine instructions.
Loops are compiled into jump (branch) instructions, so if the disassembly of the removal code contains no jumps, the compiler must have performed the optimization and removed the loop that increments cur_position.
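For illustration, a loop of the kind discussed here might look like the sketch below. The function name vector_locate and its parameters are assumptions, not the lecture's code; only the variable name cur_position comes from the notes. The loop computes nothing that pos does not already provide, so an optimizing compiler may replace it with a plain return of pos. Whether it does depends on the compiler and its flags, which is why inspecting the disassembly is the reliable check.

```c
#include <assert.h>
#include <stddef.h>

/* Walk to position pos one step at a time, the way a list traversal would.
   With an array the walk is pointless: the result always equals pos. */
size_t vector_locate(size_t size, size_t pos) {
    assert(pos < size);
    size_t cur_position = 0;
    while (cur_position != pos) {
        ++cur_position;
    }
    return cur_position;
}
```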
In this code, the vector is a structure with three components:
- size - number of elements in the vector
- capacity - size of the array holding the vector
- data
- pointer to an array of capacity items, of which the first size items are in use
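The three components can be sketched as a C struct. The int element type and the helper vector_remove_at are assumptions added for illustration. The helper shows why no traversal is needed: the address of the element at pos is computed directly from data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct vector {
    size_t size;       /* number of elements in the vector */
    size_t capacity;   /* size of the array holding the vector */
    int *data;         /* array of capacity items; the first size are in use */
} vector;

/* Remove and return the element at 0-based position pos. */
int vector_remove_at(vector *v, size_t pos) {
    assert(pos < v->size);
    int value = v->data[pos];   /* address is data + pos: no traversal */
    /* shift the entries after pos one slot to the left: O(n) */
    memmove(&v->data[pos], &v->data[pos + 1],
            (v->size - pos - 1) * sizeof(int));
    --v->size;
    return value;
}
```

The shift is still O(n), but it moves through consecutive addresses rather than following one pointer per element.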
Even though the compiler did not optimize away the loop, the vector (array) implementation is still much faster than the linked list version.
Story Time
Smoking hot genius and the elderly librarian.
- smoking hot genius = CPU
- elderly librarian = memory controller
- the CPU can’t move close to memory, so he relies on the elderly librarian
Memory is accessed via addresses (numbers, pointers).
- linked list only knows the address of the first number
- memory accesses are very expensive in time!!!!!
The vector (array) implementation is a lot faster because it uses fewer memory accesses (we can get the address of the data to remove by simple pointer arithmetic from the start of the array).
The elderly librarian notices when the CPU is asking for memory at sequential addresses:
- the librarian places the data, and extra data nearby, into a cache close to CPU
- the CPU can quickly access the returned items without further memory accesses
Syllabus
Grade Breakdown
- Midterm - 20%
- Final - 25%
- Assignments - 45%
- Scribe notes/Piazza - 10%
Policies
This is the first systems programming course. The goal is for everybody to become a high-craft programmer who can write high-performance code.
There will be a midterm and a final, both of which are open-book and open-computer (no Internet, though). There are ~6 programming assignments, most of which will be in C. The first assignment will be given out Thursday (9/6). The last few psets may optionally be done in pairs. Assignments will probably be due Thursday at midnight.
You will have 72 “late hours” to use on assignments over the course of the semester. Any assignment turned in late after all the late hours have been used will incur a significant penalty every 4 hours until the assignment’s grade is an F. Note that turning in a very late assignment is better than not turning in an assignment at all.
The TFs have been taking notes while we have been talking :)
Sections will be held weekly, starting next week (9/11). They are not mandatory, but are definitely encouraged.
Students will be expected to take lecture notes (aka “scribe notes”) for at least one lecture. This is part of the participation grade.
For more information, see the course website.