Overview
Last time, we examined memory broadly, dividing it into regions called segments that hold different kinds of objects. This time, we consider memory in a more fine-grained way, investigating type sizes, alignments, and layout rules.
Primitive types
C++’s primitive values include integers, floating point numbers, and pointers.
This program, sizes.cc, prints out some of their values. What do you see?
#include <cstdio>
#include "hexdump.hh"
int main() {
    char c1 = 61;
    int i1 = 61;
    float f1 = 61;
    int* p1 = &i1;
    printf("c1: %d\n", (int) c1);
    printf("i1: %d\n", i1);
    printf("f1: %g\n", f1);
    printf("p1: %p\n", (void*) p1);   // %p expects a void*
    hexdump_object(c1);
    hexdump_object(i1);
    hexdump_object(f1);
    hexdump_object(p1);
}
Primitive type observations
- `char` is a kind of integer
  - `char` is also special in C and C++: any object in memory can be observed as an array of `char`[^1], and many objects can be manipulated as if they are arrays of `char`
  - We call a `char` a byte
  - `hexdump` prints the contents of memory by treating it as an array of `char`s
  - In modern implementations, a `char` holds exactly 8 bits
- `int` is a kind of integer
  - When `char` and `int` variables hold the same values, their representations look similar
  - `int` takes more memory to represent
- `float` is a kind of floating-point number
  - Same-valued `float` and `int` have very different representations
- `int*` is a kind of pointer
  - Takes yet more memory to represent
  - Value looks suspiciously like an address
Sizes and sizeof
- Every C++ object has a size
- Every object with the same type has the same size
- The size of an object is the number of bytes required to store the object
  - For instance, memory returned by `malloc(SIZE)` can hold an object of size `SIZE`
- The C++ `sizeof` operator returns the size of an object
Sizes of primitive types
| Type | Size |
|---|---|
| `char` | 1 |
| `short` | 2 |
| `int` | 4 |
| `long` | 8 |
| `long long` | 8 |
| `float` | 4 |
| `double` | 8 |
| `long double` | 16 |
| `T*` | 8 |
Abstract machine and hardware machine
- Programs are written with reference to a language standard
  - The language standard is meant to define exactly how every program behaves when executed
- In some programming languages, such as Python and Java, the language standard is opaque
  - The standard completely defines the meaning of every program
  - The same program should behave identically on any hardware
- The C and C++ standards are translucent
  - The standard partially defines the meaning of a program
  - Some aspects of a program can behave differently on different hardware
- Example: Sizes of primitive types
  - Standard imposes some requirements, compiler can make choices accordingly
Standard sizes of primitive types
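| Type | x86-64 Linux size | Standard size |
|---|---|---|
| `char` | 1 | 1 |
| `short` | 2 | ≥1 |
| `int` | 4 | ≥`sizeof(short)` |
| `long` | 8 | ≥`sizeof(int)` |
| `long long` | 8 | ≥`sizeof(long)` |
| `float` | 4 | ≥4 (probably) |
| `double` | 8 | ≥`sizeof(float)` |
| `long double` | 16 | ≥`sizeof(double)` |
| `T*` | 8 | N/A |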
Variant types
- Integer types come in signed and unsigned varieties
  - Signed types have values in the range $[-2^{B-1},\ 2^{B-1}-1]$, where $B$ = `sizeof(T)*8`
  - Unsigned types have values in the range $[0,\ 2^{B}-1]$
  - Positive signed numbers have the same representation as the corresponding unsigned numbers
  - `signed T` and `unsigned T` always have the same size
  - Special case: `char` might be either signed or unsigned (on x86-64 it is signed); if you definitely want a signed version, say `signed char`
- Types can be qualified as `const` or `volatile`
  - Same data representation
Objects in memory
- Every object occupies a contiguous range of memory
- Objects that might exist at the same time occupy disjoint ranges of memory
- Live objects never overlap
Compound types (collections)
- C++ offers several ways to construct compound objects from simpler ones
- Compound objects are objects and have sizes
- Compound objects are laid out in memory according to standard rules
Compound type example
#include <cstdio>
#include "hexdump.hh"
int main() {
    int a[2] = {61, 62};
    union {
        int a;
        int b;
        char c;
        char d;
    } u;
    u.a = 61;
    struct {
        int a;
        int b;
        char c;
        char d;
    } s = {61, 62, 63, 64};
    hexdump_object(a);
    hexdump_object(u);
    hexdump_object(s);
}
Array rule
- Array members are laid out sequentially in memory with no gaps
- Given `T a[N]`, and $0 \le I < N$:
  - $\text{addressof}(a[I]) = \text{addressof}(a) + I \cdot \text{sizeof}(T)$
  - $\text{sizeof}(a) = N \cdot \text{sizeof}(T)$
Union rule
- Union types define objects that might have one of several different underlying representations
  - One active member at a time, defined by most recent assignment
  - Useful for special cases
- Given `union { T1 m1; T2 m2; … TN mN; } u`:
  - $\text{addressof}(u.m_I) = \text{addressof}(u)$
  - $\text{sizeof}(u) = \max_I \text{sizeof}(T_I)$
Struct rule (?)
- Struct members are laid out sequentially in memory with no gaps
- Given `struct { T1 m1; T2 m2; … TN mN; } s`:
  - $\text{addressof}(s.m_1) = \text{addressof}(s)$
  - For $I > 1$, $\text{addressof}(s.m_I) = \text{addressof}(s.m_{I-1}) + \text{sizeof}(T_{I-1})$ (?)
  - $\text{sizeof}(s) = \sum_I \text{sizeof}(T_I)$ (?)
Alignment
- Hardware and compilers can impose restrictions on the addresses at which certain types of object can appear
  - The address of any `int` is a multiple of 4 (on x86-64)
  - The address of any pointer is a multiple of 8
- This is called alignment
- The C++ `alignof` operator returns the alignment of a type (GCC and Clang also accept an object)
  - All objects of the same type have the same alignment
  - `alignof(std::max_align_t)` is the maximum alignment for any primitive type (16 on x86-64)
Alignments of primitive types
| Type | Alignment |
|---|---|
| `char` | 1 |
| `short` | 2 |
| `int` | 4 |
| `long` | 8 |
| `long long` | 8 |
| `float` | 4 |
| `double` | 8 |
| `long double` | 16 |
| `T*` | 8 |
- Except for `alignof(char)`, these values can change
  - On some x86-64 operating systems and compilers, `alignof(long)` = 4!
  - On x86-32 Linux, `alignof(double)` = 4
Alignments of compound types
- The members of a compound object must obey alignment restrictions
- This imposes alignment requirements on the compound type
Struct rule
- Struct members are laid out sequentially in memory with no gaps except those required to obey alignment restrictions
  - This requires rounding up offsets to multiples of the alignment
[^1]: See, for example, the C++ standard 6.8.1.2 [basic.types.general]: “the underlying bytes making up [any simple] object can be copied into an array of `char`, `unsigned char`, or `std::byte` (17.2.1). If the content of that array is copied back into the object, the object shall subsequently hold its original value.”