## Overview

Last time, we examined the way C++ compilers assign sizes to objects, and the
way they lay out compound objects in memory. This time we explore struct
alignment in more depth and introduce pointer arithmetic and **segments**
(general areas of memory).

## Objects in memory

- Every object occupies a
**contiguous**range of memory - Objects that might exist at the same time occupy
**disjoint**ranges of memory- Live objects never overlap

## Alignments and compound types

- Hardware (and the compiler) impose
**alignment restrictions**on primitive types- Whenever an object of type
`T`

is stored in memory, its address must be a multiple of`alignof(T)`

- Standard says alignments must be powers of two
- Therefore every alignment is an integral multiple of all smaller alignments

- Whenever an object of type
- The members of a compound object must obey alignment restrictions
- This imposes alignment requirements on compound types
- Array:
`alignof(T[N])`

=`alignof(T)`

- Struct:
`alignof(struct { T`

= max_{1}m_{1}; …; T_{N}m_{N}; })_{I}`alignof(T`

_{I})

- Array:
- For structs, alignment can require adding
**padding**between some members- You can change a struct’s size by reordering its members!

## Struct alignment practice

```
struct s {
char a;
T1 b;
T2 c;
T3 d;
};
```

`T1`

,`T2`

, and`T3`

are some permutation of`unsigned char`

,`int`

, and`short`

.- Can you make
`s`

have size 8? - Can you make
`s`

have alignment 8?

## Struct rule, mathematically

Given

`struct { T`

:_{1}m_{1}; T_{2}m_{2}; … T_{N}m_{N}; } s`alignof(s)`

= max_{I}`alignof(T`

_{I})A helper function,

`offsetof`

, computes the offset of each member relative to the address of the struct. The members are laid out in order, starting at the beginning of the struct, with minimal padding added between members to respect member alignments.`offsetof(s, m`

= 0_{1})For`I>1`

,`offsetof(s, m`

=_{I})

×⎡

⎢

⎢`offsetof(s, m`

+_{I-1})`sizeof(T`

_{I-1})`alignof(T`

_{I})⎤

⎥

⎥`alignof(T`

_{I})(The expression \lceil X / Y \rceil \times Y rounds up X to the nearest multiple of Y.) Note

`offsetof(s, m`

is always a multiple of_{I})`alignof(T`

, as we require._{I})Finally, we can compute the size of the whole structure:

`sizeof(s)`

=⎡

⎢

⎢`offsetof(s, m`

+_{N})`sizeof(T`

_{N})`alignof(s)`

⎤

⎥

⎥`alignof(s)`

## Alignment and `malloc`

- Malloc rule: Any non-null pointer returned by
`malloc`

and friends has the maximum alignment of any type (`alignof(std::max_align_t)`

)- On x86-64, this is 16
- C++
`new`

need not follow this rule;`new T`

returns a pointer aligned for`T`

- Standard vs. ABI
- All C++ implementations on all machines lay out struct members sequentially (standard requirement)
- Padding minimization is not required by the standard, but is required on x86-64 Linux (application-binary interface)

## Pointer representation and address representation

- On our machine, the representation of a pointer equals the representation of
the
**address**that holds the corresponding object - Can cast pointers to and from address type
- The type of an address is
`uintptr_t`

- The type of an address is
- If two objects have the same address, can cast one type of pointer to the other
- This can happen with struct members and other aliases

## Pointer arithmetic

- Pointers and arrays are conceptually linked
- We can do
**arithmetic**on pointers into an array - The results are the same as doing arithmetic on
**array indexes** - This idea generalized gave rise to C++ iterators!

## Pointer comparisons

```
T a[N];
int i = ..., j = ...;
T* p1 = &a[i];
T* p2 = &a[j];
p1 == p2 ⇔ i == j
p1 != p2 ⇔ i != j
p1 < p2 ⇔ i < j
p1 > p2 ⇔ i > j
p1 <= p2 ⇔ i <= j
p1 >= p2 ⇔ i >= j
```

## Pointer addition and subtraction

```
T a[N];
int i = ..., j = ..., k = ...;
T* p1 = &a[i];
T* p2 = &a[j];
p1 + k == &a[i] + k == &a[i + k]
p1 - k == &a[i] - k == &a[i - k]
p1 - p2 == &a[i] - &a[j] == i - j // type of `p1 - p2` is `ptrdiff_t` == `long`
p1 += k == p1 = p1 + k == p1 = &a[i + k]
++p1 == p1 = p1 + 1 == p1 = &a[i + 1]
```

## Valid indexes

- For an array of size
`N`

:- Can
**form**pointers`&a[0]`

…`&a[N]`

- Can
**dereference**pointers`&a[0]`

…`&a[N-1]`

- Can
`<`

,`>`

,`<=`

,`>=`

,`-`

are valid with pointers into the same array- Any single (non-array) object can be treated as an array of size 1

## Pointer arithmetic vs. address arithmetic

- Given
`int a[2]; int* p1 = &a[0]; int* p2 = &a[1];`

- The difference between the address values of
`p1`

and`p2`

is 4- Because the objects
`a[0]`

and`a[1]`

are 4 bytes big, objects cannot overlap, and arrays are laid out sequentially

- Because the objects
- If we convert each pointer to an address and subtract
- Equals the size of the element times the difference in indexes

- If we subtract the pointers
- Answer defined to equal difference in
**indexes**

- Answer defined to equal difference in
- So
`addrdiff == ptrdiff * sizeof(T)`

- And
`addrdiff >= ptrdiff`

- And

## Union types

```
int main() {
union {
int a;
int b;
char c;
char d;
} u;
u.a = 61;
hexdump_object(u);
}
```

**Union**types define objects that might have one of several different underlying representations- One active member at a time, defined by most recent assignment
- Useful for special cases

- Given
`union { T`

:_{1}m_{1}; T_{2}m_{2}; … T_{N}m_{N}; } u

`u.m`_{I}

) = addressof(`u`

)`sizeof(u)`

= max_{I}

`sizeof(T`_{I})

`alignof(u)`

= max_{I}

`alignof(T`_{I})

## Segments

- We’ve now seen many objects with many different addresses.
- Can we classify those into groups? Are there any patterns?
- Yes! Groups of addresses are called
**segments**.

- Yes! Groups of addresses are called

## Segment explorer

```
#include <cstdio>
#include "hexdump.hh"
int i1 = 61;
const int i2 = 62;
int main() {
int i3 = 63;
int* i4 = new int{64};
hexdump_named_object(i1);
hexdump_named_object(i2);
hexdump_named_object(i3);
hexdump_named_object(i4);
hexdump_named_object(*i4);
}
```

## Segment names

`i1`

, a modifiable global object, is in the**data**segment`i2`

, a non-modifiable global object, is in the**code**/**text**/**readonly**segment`i3`

and`i4`

, local variable objects, are in the**stack**segment`*i4`

, a dynamically-allocated object, is in the**heap**segment