Data representation 4: Pointer and integer arithmetic

Overview

We explore library specifications and pointer, address, and integer representations, and introduce integer undefined behavior.

Arena allocator review

An arena allocator is an object whose purpose is to handle the allocation and deallocation of other objects
Arena allocators can improve performance relative to general-purpose allocators
- Example: many short-term allocations of small objects

Example

datarep4/mb1.cc

struct node_arena {
    std::vector<node*> scratch_;

    node* allocate() {
        node* n;
        if (this->scratch_.empty()) {
            n = new node;
        } else {
            n = this->scratch_.back();
            this->scratch_.pop_back();
        }
        return n;
    }

    void deallocate(node* n) {
        this->scratch_.push_back(n);
    }
};

1.92x faster on Mac OS X (1.13s → 0.59s), 1.25x on Docker (0.51s → 0.41s)

Libraries and abstraction boundaries

A library implements a contract called a specification
A specification imposes requirements on both the users of the library and the implementers of the library
If the user obeys their requirements, then the library must obey its requirements
It can be written in a formal language or in precise human language
Good software engineering practice
An allocator is a simple library; how would you write its specification?

struct node_arena {
    node* allocate();
    void deallocate(node* n);
    ~node_arena();
};

Allocator specification

struct node_arena {
    // Return a pointer to a freshly allocated `node`.
    node* allocate();

    // Deallocate `n`, which must refer to a live `node` that was
    // returned by a previous call to `allocate()`.
    void deallocate(node* n);

    // Free any storage used for allocated `node`s. All `node`s allocated
    // by this arena must be passed to `deallocate` before the arena
    // is destroyed.
    ~node_arena();
};

Specifications are partial

struct node_arena {
    node* allocate() {
        return new node;
    }
    void deallocate(node* n) {
        delete n;
    }
}

struct node_arena {
    std::vector<node*> scratch_;
    node* allocate() {
        node* n;
        if (scratch_.empty()) {
            n = new node;
        } else {
            n = scratch_.back();
            scratch_.pop_back();
        }
        return n;
    }
    void deallocate(node* n) {
        scratch_.push_back(n);
    }
    ~node_area() { ... }
}

struct node_arena {
    node* allocate() {
        return new node[10];
    }
    void deallocate(node* n) {
        delete[] n;
    }
}

Per-allocation statistics?

Goal: Count how often each node is reused

Maybe some nodes are used more often than others

void print_uses() {
    for (auto n : ???) {
        printf("%zu uses @%p\n", ???, n);
    }
}

How to store per-allocation information

Store information in an external data structure
- Example: std::map<node*, size_t> uses_
- Advantage: Simple to think about
- Disadvantage: More expensive
Store information internal to each allocation
- Allocate more space than user requested
- Use the extra space for internal purposes
- Advantage: Speed
- Disadvantage: Complexity

Internal metadata

datarep4/mb2s.cc

struct statnode {
    node real_node;
    size_t uses = 0;
};

std::vector<statnode*> scratch_;

node* allocate() {
    statnode* n;
    if (this->scratch_.empty()) {
        n = new statnode;
    } else {
        n = this->scratch_.back();
        this->scratch_.pop_back();
    }
    ++n->uses;
    return (node*) n;
}

void deallocate(node* n) {
    this->scratch_.push_back((statnode*) n);
}

void print_uses() {
    for (auto n : this->scratch_) {
        printf("%zu uses @%p\n", n->uses, n);
    }
}

Why does this work?

struct statnode {
    node real_node;
    size_t uses = 0;
};

node* allocate() {
    statnode* n = ...;
    return (node*) n;
}

void deallocate(node* n) {
    this->scratch_.push_back((statnode*) n);
}

Pointer representation and address representation

On our machine, the representation of a pointer equals the representation of the address that holds the corresponding object
Can cast pointers to and from address type
- The type of an address is uintptr_t, which is usually a name for unsigned long
If two objects have the same address, can cast one type of pointer to the other

Extra space first?

struct statnode {
    size_t uses = 0;
    node real_node;
};

Address computation

// `statnode` → `node`
statnode* sn = ...;
node* n = (node*) ((uintptr_t) sn + offsetof(statnode, real_node));

// `node` → `statnode`
node* n = ...;
statnode* n = (statnode*) ((uintptr_t) n - offsetof(statnode, real_node));

Pointer arithmetic

Pointers and arrays are conceptually linked
We can do arithmetic on pointers into an array
The results are the same as doing arithmetic on array indexes

Pointer comparisons

T a[N];
int i, j;

T* p1 = &a[i];
T* p2 = &a[j];

p1 == p2        ⇔        i == j
p1 != p2        ⇔        i != j
p1 < p2         ⇔        i < j
p1 > p2         ⇔        i > j
p1 <= p2        ⇔        i <= j
p1 >= p2        ⇔        i >= j

Pointer addition and subtraction

T a[N];
int i, j, k;

T* p1 = &a[i];
T* p2 = &a[j];

p1 + k     ==     &a[i] + k     ==    &a[i + k]
p1 - k     ==     &a[i] - k     ==    &a[i - k]
p1 - p2    ==   &a[i] - &a[j]   ==    i - j        // type of `p1 - p2` is `ptrdiff_t` == `long`

p1 += k    ==    p1 = p1 + k    ==    p1 = &a[i + k]
++p1       ==    p1 = p1 + 1    ==    p1 = &a[i + 1]

Valid indexes

For an array of size N:
- Can form pointers &a[0]…&a[N]
- Can dereference pointers &a[0]…&a[N-1]
<, >, <=, >=, - are valid with pointers into the same array
Any single (non-array) object can be treated as an array of size 1

Pointer arithmetic vs. address arithmetic

T a[N];
int i, j;

T* p1 = &a[i];
T* p2 = &a[j];

uintptr_t diff1 = (uintptr_t) p1 - (uintptr_t) p2;
uintptr_t diff2 = (uintptr_t) (p1 - p2);

Which is bigger, diff1 or diff2?

Pointer arithmetic vs. address arithmetic (concretely)

Make it concrete: T→int, i→1, j→0

datarep4/ptrarith.cc

int a[2] = {61, 62};

int* p1 = &a[1];
int* p2 = &a[0];

uintptr_t diff1 = (uintptr_t) p1 - (uintptr_t) p2;
uintptr_t diff2 = (uintptr_t) (p1 - p2);

printf("diff1=%zu, diff2=%zu, p1=%p, p2=%p\n", diff1, diff2, p1, p2);
hexdump_object(a);

$ ./ptrarith
diff1=4, diff2=1, p1=0x7ffee434dac4, p2=0x7ffee434dac0
7ffee434dac0  3d 00 00 00 3e 00 00 00                           |=...>...|

The difference between the address values of p1 and p2 is 4
- Because the objects a[0] and a[1] are 4 bytes big, objects cannot overlap, and arrays are laid out sequentially
diff1 computes this difference in address values
- By converting each pointer to an address before subtracting
- Equals the size of the element times the difference in indexes
diff2 computes the difference in array indexes
- Because it subtracts the pointers, and pointer subtraction is defined to equal index subtraction
So diff1 == diff2 * sizeof(T)
- And diff1 >= diff2

Cleaning up address computations

struct statnode {
    size_t uses = 0;
    node real_node;
};

// `statnode` → `node`
statnode* sn = ...;
node* n = &sn->real_node;

// `node` → `statnode`
static_assert(offsetof(statnode, real_node) == sizeof(size_t));
node* n = ...;
statnode* n = (statnode*) ((size_t*) n - 1);

Exercise: Figure out why this works!

Integer representation

datarep4/intrep.cc

#include <cstdio>
#include "hexdump.hh"

int main() {
    int i = 1;
    unsigned u = 1;

    hexdump_object(i);
    hexdump_object(u);
}

What about numbers greater than 255?
What about negative numbers?

Multi-byte integers

Like Arabic numerals (and many other notations going back to Babylonian cuneiform), computers use place-value notation to represent numbers larger than 255
Numbers are written in base 256 (because bytes have 8 bits)
Consider 258 = 256 + 2 = 1×256¹ + 2×256⁰

int x = 258;
hexdump_object(x);

// => 7ffeea26fac8  02 01 00 00                                       |....|

Unlike Arabic numerals, x86-64 represents numbers with the ones place “on the left” (at lower addresses)
- This is called little endian
Other computers make different choices; for example, the layout used in Internet protocols puts the highest-value place at lower addresses
- This is called big endian

Overflow

Overflow happens when the result of a computation is too large in magnitude to be represented using the chosen datatype

unsigned char n = 255;   // range representable in `n` is [0,255]
n = n + 1;               // `n` should be 256, but can’t be represented

unsigned u = 1'000'000;  // range representable in `u` is [0,4'294'967'295]
u = u * 10'000'000;      // `n` should be 10'000'000'000'000, but can’t be represented

Unsigned integer overflow

In C and C++, when a computation using unsigned integers overflows, the result is taken modulo 2^B, where B is the number of bits in the type
- Thus, throw away any upper carry bits
- For integer type T, B = 8×sizeof(T)

unsigned char n = 255;   // range representable in `n` is [0,255]
n = n + 1;               // `n` should be 256, but can’t be represented

unsigned u = 1'000'000;  // range representable in `u` is [0,4'294'967'295]
u = u * 10'000'000;      // `n` should be 10'000'000'000'000, but can’t be represented

printf("n=%u, u=%u\n", (unsigned) n, u);
// => n = 0, u = 1316134912
// (10'000'000'000'000 == (2**32 * 2328) + 1'316'134'912)

Two’s complement representation

Principle: Addition and subtraction of signed integers shall use the same representation as addition and subtraction of unsigned integers
Result: -x is represented in the same way as the unsigned number 2^B - x, where B is the number of bits in the integer
Why?

Two’s complement and overflow

The negation operator is the additive inverse
- x + (-x) = 0 for all x
Let x be unsigned with B bits
- Computer x represents mathematical x
- Define computer -x as the representation of mathematical 2^B - x
- Computer x + -x then represents x + (2^B - x)
- x + (2^B - x) = 2^B \equiv 0 \pmod{2^B}
- So computer x + -x equals computer 0 and we have a consistent definition of negation!

Example

unsigned char b = 1;
unsigned char negb = -1;
assert(negb == 255);

        b    ==     0b  0  0  0  0  0  0  0  1
   + negb    ==  +  0b  1  1  1  1  1  1  1  1
   ------        -----------------------------
        0        0b  1̸  0  0  0  0  0  0  0  0

Undefined behavior

Signed integer arithmetic is not allowed to overflow
- In unsigned arithmetic, 4294967295U + 1U == 0U
- In signed arithmetic, 2147483647 + 1 == 💩🚽
Consider ./ubsignedinc 2147483647 vs. ./ubsignedinc-clang-noopt 2147483647