Overview
We summarize some lessons from last time, then investigate alignment, compound type layout, and performance.
Does every object always need an address?
- No
- The language standard says that every object is located in a contiguous region of storage, and has an address
- But the as-if rule says the compiler can implement the program
however it wants, as long as the execution is indistinguishable!
- This rule is what makes optimizations possible
- So when the compiler can tell that an address isn’t needed, it doesn’t assign one
- The programmer can force address assignment with
&
Does every different object need a different address?
- Yes…
- Except an object’s storage can be reused after the object’s lifetime
Lifetime and lifetime errors
- Every C++ object has a lifetime
- It is illegal to refer to an object outside its lifetime
- Lifetime errors exemplify a terrible class of error: violations of memory safety
- When a program violates memory safety, it commits the mortal sin of undefined behavior
- What happens next? Anything!
- Maybe a sanitizer error (
stack-use-after-scope
,heap-use-after-free
)! (Best case) - Maybe immediate crash (segmentation fault)!
- Maybe it keeps going as if nothing happened!
- Maybe it transfers all data on the computer to Palantir!
datarep-addr/badlifetime.cc
Are different address ranges used for different purposes?
- Yes
- Lifetimes fall into a few categories called storage durations
- Static storage duration: The object survives for the length of the program
- Automatic storage duration: The object survives until its containing function returns (or the containing block ends)
- Dynamic storage duration: The object survives until it is explicitly deleted (
malloc
andfree
, ornew
anddelete
)
- Objects with different storage durations are stored in regions of memory called segments
Segment Storage duration Object declaration Code (aka text) Static Global (read-only data) Data Static Global (read/write data) Heap Dynamic Dynamic Stack Automatic Local - Can a segment run out of space?
datarep-addr/outofstack.cc
,datarep-addr/outofheap.cc
More questions
- Are there any constraints on the addresses for particular objects?
- How are complex objects represented in memory?
- Do addresses have any consequences for performance?
- Let’s design experiments to test some hypotheses!
Alignment
- Each C++ type has an alignment
- On x86-64, either 1, 2, 4, 8, or 16
- Every object of that type has an address that is a multiple of the type’s alignment
- May require padding
intsort.cc
#include <cstdio>
#include <cassert>
#include <list>
#include "print_bytes.hh"
int main() {
std::list<int> ints;
// read integers from stdin, storing them in sorted order
int input;
while (fscanf(stdin, "%d", &input) == 1) {
auto it = ints.begin();
while (it != ints.end() && *it < input) {
++it;
}
ints.insert(it, input);
}
// print integers in sorted order
for (auto& value : ints) {
printf("%d\n", value);
}
}
intsort.cc
mark 2
- std::list<int> ls;
+ std::vector<int> ls;
What happened????
Here are some results from several runs of ./intgen -n 100000 | time ./intsort > /dev/null
(on my desktop in Docker):
Data structure | Elapsed times | ||
---|---|---|---|
std::list |
27.48 sec | 27.45 sec | 27.42 sec |
std::vector |
0.73 sec | 0.73 sec | 0.75 sec |
An investigation
-
Let’s print out the integers’ addresses as well!
for (auto& value : ints) { print_object(value); }
-
Sample start of output for
std::list
and./intgen -n 50000
:5626'6ed5'09d0 00 00 00 00 5626'6eda'8d50 01 00 00 00 5626'6ed5'6b90 02 00 00 00 5626'6edd'18f0 03 00 00 00 5626'6ec6'f090 04 00 00 00
-
Sample start of output for
std::vector
and./intgen -n 50000
:7fe0'eb6c'6010 00 00 00 00 7fe0'eb6c'6014 01 00 00 00 7fe0'eb6c'6018 02 00 00 00 7fe0'eb6c'601c 03 00 00 00 7fe0'eb6c'6020 04 00 00 00
Analysis
std::vector
elements’ memory addresses are contiguous: next to one another, no gapsstd::list
elements’ memory address are not contiguous: big jumps, bigger the more integers there are- Contiguous memory is faster to access!
- Cache memory (unit 4)
- But why do
std::list
addresses bounce around so much?- Discuss!
Insertion order
Data structure | Insertion order | Elapsed times | ||
---|---|---|---|---|
std::list |
-r (random) |
27.48 sec | 27.45 sec | 27.42 sec |
std::vector |
-r |
0.73 sec | 0.73 sec | 0.75 sec |
std::list |
-u (increasing) |
7.43 sec | 7.43 sec | 7.36 sec |
std::vector |
-u |
1.16 sec | 1.14 sec | 1.15 sec |
std::list |
-d (decreasing) |
0.01 sec | 0.01 sec | 0.01 sec |
std::vector |
-d |
0.35 sec | 0.33 sec | 0.34 sec |
- The elements of a
std::list
are laid out in memory in increasing order by insertion time!- This means that when the input integers arrive in random order, traversing the list to find an insertion position involves many jumps to increasingly far-away points in memory—very bad for performance.
- But when input integers arrive in sequential order, then adjacent integers have nearby addresses—good for performance!
- Best for performance, though, is when no traversal is necessary at all (
-d
).
- The elements of a
std::vector
are laid out in memory in increasing order by position.- This requires moving elements around when inserting new elements, but we make up for that with efficient memory traversal.
- Only when inserting in descending order does
std::list
win.
- Can we do better still?