Overview
We explore library specifications and pointer, address, and integer representations, and introduce integer undefined behavior.
Arena allocator review
- An arena allocator is an object whose purpose is to handle the allocation and deallocation of other objects
- Arena allocators can improve performance relative to general-purpose
allocators
- Example: many short-term allocations of small objects
Example
struct node_arena {
std::vector<node*> scratch_;
node* allocate() {
node* n;
if (this->scratch_.empty()) {
n = new node;
} else {
n = this->scratch_.back();
this->scratch_.pop_back();
}
return n;
}
void deallocate(node* n) {
this->scratch_.push_back(n);
}
};
1.92x faster on Mac OS X (1.13s → 0.59s), 1.25x on Docker (0.51s → 0.41s)
Libraries and abstraction boundaries
- A library implements a contract called a specification
- A specification imposes requirements on both the users of the library and the implementers of the library
- If the user obeys their requirements, then the library must obey its requirements
- It can be written in a formal language or in precise human language
- Good software engineering practice
- An allocator is a simple library; how would you write its specification?
struct node_arena {
node* allocate();
void deallocate(node* n);
~node_arena();
};
Allocator specification
struct node_arena {
// Return a pointer to a freshly allocated `node`.
node* allocate();
// Deallocate `n`, which must refer to a live `node` that was
// returned by a previous call to `allocate()`.
void deallocate(node* n);
// Free any storage used for allocated `node`s. All `node`s allocated
// by this arena must be passed to `deallocate` before the arena
// is destroyed.
~node_arena();
};
Specifications are partial
struct node_arena {
node* allocate() {
return new node;
}
void deallocate(node* n) {
delete n;
}
}
struct node_arena {
std::vector<node*> scratch_;
node* allocate() {
node* n;
if (scratch_.empty()) {
n = new node;
} else {
n = scratch_.back();
scratch_.pop_back();
}
return n;
}
void deallocate(node* n) {
scratch_.push_back(n);
}
~node_area() { ... }
}
struct node_arena {
node* allocate() {
return new node[10];
}
void deallocate(node* n) {
delete[] n;
}
}
Per-allocation statistics?
-
Goal: Count how often each node is reused
- Maybe some nodes are used more often than others
void print_uses() { for (auto n : ???) { printf("%zu uses @%p\n", ???, n); } }
How to store per-allocation information
- Store information in an external data structure
- Example:
std::map<node*, size_t> uses_
- Advantage: Simple to think about
- Disadvantage: More expensive
- Example:
- Store information internal to each allocation
- Allocate more space than user requested
- Use the extra space for internal purposes
- Advantage: Speed
- Disadvantage: Complexity
Internal metadata
struct statnode {
node real_node;
size_t uses = 0;
};
std::vector<statnode*> scratch_;
node* allocate() {
statnode* n;
if (this->scratch_.empty()) {
n = new statnode;
} else {
n = this->scratch_.back();
this->scratch_.pop_back();
}
++n->uses;
return (node*) n;
}
void deallocate(node* n) {
this->scratch_.push_back((statnode*) n);
}
void print_uses() {
for (auto n : this->scratch_) {
printf("%zu uses @%p\n", n->uses, n);
}
}
Why does this work?
struct statnode {
node real_node;
size_t uses = 0;
};
node* allocate() {
statnode* n = ...;
return (node*) n;
}
void deallocate(node* n) {
this->scratch_.push_back((statnode*) n);
}
Pointer representation and address representation
- On our machine, the representation of a pointer equals the representation of the address that holds the corresponding object
- Can cast pointers to and from address type
- The type of an address is
uintptr_t
, which is usually a name forunsigned long
- The type of an address is
- If two objects have the same address, can cast one type of pointer to the other
Extra space first?
struct statnode {
size_t uses = 0;
node real_node;
};
Address computation
// `statnode` → `node`
statnode* sn = ...;
node* n = (node*) ((uintptr_t) sn + offsetof(statnode, real_node));
// `node` → `statnode`
node* n = ...;
statnode* n = (statnode*) ((uintptr_t) n - offsetof(statnode, real_node));
Pointer arithmetic
- Pointers and arrays are conceptually linked
- We can do arithmetic on pointers into an array
- The results are the same as doing arithmetic on array indexes
Pointer comparisons
T a[N];
int i, j;
T* p1 = &a[i];
T* p2 = &a[j];
p1 == p2 ⇔ i == j
p1 != p2 ⇔ i != j
p1 < p2 ⇔ i < j
p1 > p2 ⇔ i > j
p1 <= p2 ⇔ i <= j
p1 >= p2 ⇔ i >= j
Pointer addition and subtraction
T a[N];
int i, j, k;
T* p1 = &a[i];
T* p2 = &a[j];
p1 + k == &a[i] + k == &a[i + k]
p1 - k == &a[i] - k == &a[i - k]
p1 - p2 == &a[i] - &a[j] == i - j // type of `p1 - p2` is `ptrdiff_t` == `long`
p1 += k == p1 = p1 + k == p1 = &a[i + k]
++p1 == p1 = p1 + 1 == p1 = &a[i + 1]
Valid indexes
- For an array of size
N
:- Can form pointers
&a[0]
…&a[N]
- Can dereference pointers
&a[0]
…&a[N-1]
- Can form pointers
<
,>
,<=
,>=
,-
are valid with pointers into the same array- Any single (non-array) object can be treated as an array of size 1
Pointer arithmetic vs. address arithmetic
T a[N];
int i, j;
T* p1 = &a[i];
T* p2 = &a[j];
uintptr_t diff1 = (uintptr_t) p1 - (uintptr_t) p2;
uintptr_t diff2 = (uintptr_t) (p1 - p2);
Which is bigger, diff1
or diff2
?
Pointer arithmetic vs. address arithmetic (concretely)
- Make it concrete:
T
→int
,i
→1
,j
→0
int a[2] = {61, 62};
int* p1 = &a[1];
int* p2 = &a[0];
uintptr_t diff1 = (uintptr_t) p1 - (uintptr_t) p2;
uintptr_t diff2 = (uintptr_t) (p1 - p2);
printf("diff1=%zu, diff2=%zu, p1=%p, p2=%p\n", diff1, diff2, p1, p2);
hexdump_object(a);
$ ./ptrarith
diff1=4, diff2=1, p1=0x7ffee434dac4, p2=0x7ffee434dac0
7ffee434dac0 3d 00 00 00 3e 00 00 00 |=...>...|
- The difference between the address values of
p1
andp2
is 4- Because the objects
a[0]
anda[1]
are 4 bytes big, objects cannot overlap, and arrays are laid out sequentially
- Because the objects
diff1
computes this difference in address values- By converting each pointer to an address before subtracting
- Equals the size of the element times the difference in indexes
diff2
computes the difference in array indexes- Because it subtracts the pointers, and pointer subtraction is defined to equal index subtraction
- So
diff1 == diff2 * sizeof(T)
- And
diff1 >= diff2
- And
Cleaning up address computations
struct statnode {
size_t uses = 0;
node real_node;
};
// `statnode` → `node`
statnode* sn = ...;
node* n = &sn->real_node;
// `node` → `statnode`
static_assert(offsetof(statnode, real_node) == sizeof(size_t));
node* n = ...;
statnode* n = (statnode*) ((size_t*) n - 1);
Exercise: Figure out why this works!
Integer representation
#include <cstdio>
#include "hexdump.hh"
int main() {
int i = 1;
unsigned u = 1;
hexdump_object(i);
hexdump_object(u);
}
- What about numbers greater than 255?
- What about negative numbers?
Multi-byte integers
- Like Arabic numerals (and many other notations going back to Babylonian cuneiform), computers use place-value notation to represent numbers larger than 255
- Numbers are written in base 256 (because bytes have 8 bits)
- Consider 258 = 256 + 2 = 1×2561 + 2×2560
int x = 258;
hexdump_object(x);
// => 7ffeea26fac8 02 01 00 00 |....|
- Unlike Arabic numerals, x86-64 represents numbers with the ones place “on the left” (at lower addresses)
- This is called little endian
- Other computers make different choices; for example, the layout used in Internet protocols puts the highest-value place at lower addresses
- This is called big endian
Overflow
- Overflow happens when the result of a computation is too large in magnitude to be represented using the chosen datatype
unsigned char n = 255; // range representable in `n` is [0,255]
n = n + 1; // `n` should be 256, but can’t be represented
unsigned u = 1'000'000; // range representable in `u` is [0,4'294'967'295]
u = u * 10'000'000; // `n` should be 10'000'000'000'000, but can’t be represented
Unsigned integer overflow
- In C and C++, when a computation using unsigned integers overflows, the result is taken modulo 2B, where B is the number of bits in the type
- Thus, throw away any upper carry bits
- For integer type
T
, B = 8×sizeof(T)
unsigned char n = 255; // range representable in `n` is [0,255]
n = n + 1; // `n` should be 256, but can’t be represented
unsigned u = 1'000'000; // range representable in `u` is [0,4'294'967'295]
u = u * 10'000'000; // `n` should be 10'000'000'000'000, but can’t be represented
printf("n=%u, u=%u\n", (unsigned) n, u);
// => n = 0, u = 1316134912
// (10'000'000'000'000 == (2**32 * 2328) + 1'316'134'912)
Two’s complement representation
- Principle: Addition and subtraction of signed integers shall use the same representation as addition and subtraction of unsigned integers
- Result:
-x
is represented in the same way as the unsigned number2B - x
, whereB
is the number of bits in the integer - Why?
Two’s complement and overflow
- The negation operator is the additive inverse
- x + (-x) = 0 for all x
- Let
x
be unsigned with B bits- Computer
x
represents mathematical x - Define computer
-x
as the representation of mathematical 2^B - x - Computer
x + -x
then represents x + (2^B - x) - x + (2^B - x) = 2^B \equiv 0 \pmod{2^B}
- So computer
x + -x
equals computer0
and we have a consistent definition of negation!
- Computer
Example
unsigned char b = 1;
unsigned char negb = -1;
assert(negb == 255);
b == 0b 0 0 0 0 0 0 0 1
+ negb == + 0b 1 1 1 1 1 1 1 1
------ -----------------------------
0 0b 1̸ 0 0 0 0 0 0 0 0
Undefined behavior
- Signed integer arithmetic is not allowed to overflow
- In unsigned arithmetic,
4294967295U + 1U == 0U
- In signed arithmetic,
2147483647 + 1 == 💩🚽
- In unsigned arithmetic,
- Consider
./ubsignedinc 2147483647
vs../ubsignedinc-clang-noopt 2147483647