Overview
This course investigates how systems software works: what makes programs work fast or slow, and how properties of the machines we program impact the programs we write. We discuss both general ideas and specific tools, and take an experimental approach.
Class outline
- Data representation
- How do computers represent different kinds of information?
- How do data representation choices impact performance and correctness?
- Assembly & machine programming
- What kind of language is understood by computer processors?
- How is code you write translated to code a processor runs?
- Kernel programming
- How do hardware and software defend against bugs and attacks?
- How are operating systems interfaces implemented?
- Storage & caching
- What kinds of computer data storage are available, and how do they perform?
- How can we improve the performance of a system that stores data?
- Process management
- How can programs on the same computer cooperate and interact?
- What kinds of operating systems interfaces are useful?
- Concurrency
- How can a single program safely use multiple processors?
- How can multiple computers safely interact over a network?
Your work
- Six problem sets
- Midterm and final
- Section
- Starting mid-next week
- Attendance checked for simultaneously-enrolled students
Your grade
- Rough breakdown: >50% assignments, <35% tests, 15% participation
- Course grading: A means mastery
Collaboration
Discussion, collaboration, and the exchange of ideas are essential to doing academic work, and to engineering. You are encouraged to consult with your classmates as you work on problem sets. You are welcome to discuss general strategies for solutions as well as specific bugs and code structure questions, and to use Internet resources for general information.
However, the work you turn in must be your own—the result of your own efforts. You should understand your code well enough that you could replicate your solution from scratch, without collaboration.
In addition, you must cite any books, articles, online resources, and so forth that helped you with your work, using appropriate citation practices; and you must list the names of students with whom you have collaborated on problem sets and briefly describe how you collaborated. (You do not need to list course staff.)
Our programming language
We use the C++ programming language in this class.
C++ is a boring, old, and unsafe programming language, but boring languages are underrated. C++ offers several important advantages for this class, including ubiquitous availability, good tooling, the ability to demonstrate impactful errors, and a good standard library of data structures.
Pset 0 links to several C++ tutorials and references, and to a textbook.
Today
- Objects and values
- A performance mystery
A simple program
#include <cstdio>
int main() {
int x = 0;
fprintf(stdout, "%d\n", x);
x = x + 1;
fprintf(stdout, "%d\n", x);
}
An equivalent program
#include <cstdio>
int main() {
fprintf(stdout, "%d\n", 0);
fprintf(stdout, "%d\n", 1);
}
Derpsy doodle
#include <cstdio>
int main() {
fprintf(stdout, "%d\n", 0);
0 = 0 + 1;
fprintf(stdout, "%d\n", 0);
}
Primitive types, values, and objects
-
A type defines a set of related values in a programming language.
-
A primitive type is irreducible, meaning its values aren’t composed of smaller values.
- Different programming languages have different primitive types.
- In C++, the primitive types include integers, like 0 and 1; floating
point numbers, like 0.5 and
INFINITY
; booleanstrue
andfalse
; and pointers, which we’ll see later.
-
An object is a value stored in memory.
- The standard says “a region of data storage in the execution environment, the contents of which can represent values”.
- Objects, unlike values, can change.
- Because, unlike values, they are present somewhere in the real physical world!
Where’s x
?
From https://www.ifixit.com/News/46884/m1-macbook-teardowns-something-old-something-new
Computer addresses
- Computer software in execution deals with zeroes and ones: bits
- Bits are grouped into groups of eight called bytes
- A byte is a kind of integer with a value between 0 and 255
0b00000000
is the byte0
0b00000001 == 1
is the byte1
- And the bytes are stored in memory
- In the physical world
- Memory is made up of billions and billions of transistors—“MOSFETs” (metal–oxide–semiconductor field-effect transistors: don’t ask)
- We need a way to refer to specific bytes of memory
- So we can refer to an object—or change it
- Which MOSFETs contain the bytes that make up
x
?
- Each byte of memory has an address
- Which is stored as another integer!
Finding x
#include <cstdio>
int main() {
int x = 0;
fprintf(stdout, "%d %p\n", x, &x);
x = x + 1;
fprintf(stdout, "%d %p\n", x, &x);
}
Finding x
with hexdump
#include <cstdio>
#include "hexdump.hh"
int main() {
int x = 0;
fprintf(stdout, "%d\n", x);
hexdump_object(x);
x = x + 1;
fprintf(stdout, "%d\n", x);
hexdump_object(x);
}
A performance mystery
Sorting integers
$ cd cs61-lectures/datarep1
$ make
$ ./intgen | python3 intsort.py
0
1
2
3
4
5
6
7
8
9
intsort.py
import sys
ls = []
for line in sys.stdin:
val = int(line)
i = 0
while i < len(ls) and ls[i] < val:
i += 1
ls.insert(i, val)
for v in ls:
sys.stdout.write("{}\n".format(v))
Questions
- Does this work?
- How would you test it?
- How fast is it?
intsort.cc
#include <cstdio>
#include <cassert>
#include <list>
#include <vector>
#include "hexdump.hh"
int main() {
std::list<int> ls;
// read integers from stdin, storing them in sorted order
int val;
while (fscanf(stdin, "%d", &val) == 1) {
auto it = ls.begin();
while (it != ls.end() && *it < val) {
++it;
}
ls.insert(it, val);
}
// print integers in sorted order
for (auto v : ls) {
fprintf(stdout, "%d\n", v);
}
}
intsort.cc
mark 2
- std::list<int> ls;
+ std::vector<int> ls;