Overview
Last time, we examined memory broadly, dividing it into regions called segments that hold different kinds of objects. This time, we consider memory in a more fine-grained way, investigating type sizes, alignments, and layout rules.
Primitive types
C++’s primitive values include integers, floating point numbers, and pointers.
This program, sizes.cc, prints out some of their values. What do you see?
#include <cstdio>
#include "hexdump.hh"

int main() {
    char c1 = 61;
    int i1 = 61;
    float f1 = 61;
    int* p1 = &i1;

    // Print each value, then the bytes that represent it in memory
    printf("c1: %d\n", (int) c1);
    printf("i1: %d\n", i1);
    printf("f1: %g\n", f1);
    printf("p1: %p\n", (void*) p1);   // %p expects a void*

    hexdump_object(c1);
    hexdump_object(i1);
    hexdump_object(f1);
    hexdump_object(p1);
}
Primitive type observations
- char is a kind of integer
  - char is also special in C and C++: any object in memory can be observed as an array of char [1], and many objects can be manipulated as if they are arrays of char
  - We call a char a byte
  - hexdump prints the contents of memory by treating it as an array of chars
  - In modern implementations, a char holds exactly 8 bits
- int is a kind of integer
  - When char and int variables hold the same values, their representations look similar
  - int takes more memory to represent
- float is a kind of floating-point number
  - Same-valued float and int have very different representations
- int* is a kind of pointer
  - Takes yet more memory to represent
  - Value looks suspiciously like an address
Sizes and sizeof
- Every C++ object has a size
  - Every object with the same type has the same size
  - The size of an object is the number of bytes required to store the object
    - For instance, memory returned by malloc(SIZE) can hold an object of size SIZE
- The C++ sizeof operator returns the size of an object
Sizes of primitive types
| Type | Size |
|---|---|
| char | 1 |
| short | 2 |
| int | 4 |
| long | 8 |
| long long | 8 |
| float | 4 |
| double | 8 |
| long double | 16 |
| T* | 8 |
Abstract machine and hardware machine
- Programs are written with reference to a language standard
  - The language standard is meant to define exactly how every program behaves when executed
- In some programming languages, such as Python and Java, the language standard is opaque
  - The standard completely defines the meaning of every program
  - The same program should behave identically on any hardware
- The C and C++ standards are translucent
  - The standard partially defines the meaning of a program
  - Some aspects of a program can behave differently on different hardware
  - Example: Sizes of primitive types
    - Standard imposes some requirements, compiler can make choices accordingly
Standard sizes of primitive types
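| Type | x86-64 Linux size | Standard size |
|---|---|---|
| char | 1 | 1 |
| short | 2 | ≥1 |
| int | 4 | ≥sizeof(short) |
| long | 8 | ≥sizeof(int) |
| long long | 8 | ≥sizeof(long) |
| float | 4 | ≥4 (probably) |
| double | 8 | ≥sizeof(float) |
| long double | 16 | ≥sizeof(double) |
| T* | 8 | N/A |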
Variant types
- Integer types come in signed and unsigned varieties
  - Signed types have values in the range $[-2^{B-1},\ 2^{B-1}-1]$, where B = sizeof(T)*8
  - Unsigned types have values in the range $[0,\ 2^{B}-1]$
  - Nonnegative signed numbers have the same representation as the corresponding unsigned numbers
  - signed T and unsigned T always have the same size
  - Special case: char might be either signed or unsigned (on x86-64 it is signed); if you definitely want a signed version, say signed char
- Types can be qualified as const or volatile
  - Same data representation
Objects in memory
- Every object occupies a contiguous range of memory
- Objects that might exist at the same time occupy disjoint ranges of memory
- Live objects never overlap
Compound types (collections)
- C++ offers several ways to construct compound objects from simpler ones
- Compound objects are objects and have sizes
- Compound objects are laid out in memory according to standard rules
Compound type example
#include <cstdio>
#include "hexdump.hh"

int main() {
    int a[2] = {61, 62};

    union {
        int a;
        int b;
        char c;
        char d;
    } u;
    u.a = 61;

    struct {
        int a;
        int b;
        char c;
        char d;
    } s = {61, 62, 63, 64};

    hexdump_object(a);
    hexdump_object(u);
    hexdump_object(s);
}
Array rule
- Array members are laid out sequentially in memory with no gaps
- Given T a[N], and 0 ≤ I < N:
  - addressof(a[I]) = addressof(a) + I*sizeof(T)
  - sizeof(a) = N*sizeof(T)
Union rule
- Union types define objects that might have one of several different underlying representations
  - One active member at a time, defined by most recent assignment
  - Useful for special cases
- Given union { T1 m1; T2 m2; … TN mN; } u:
  - addressof(u.mI) = addressof(u)
  - sizeof(u) = max_I sizeof(TI)
Struct rule (?)
- Struct members are laid out sequentially in memory with no gaps
- Given struct { T1 m1; T2 m2; … TN mN; } s:
  - addressof(s.m1) = addressof(s)
  - For I>1, addressof(s.mI) = addressof(s.mI-1) + sizeof(TI-1) (?)
  - sizeof(s) = ∑_I sizeof(TI) (?)
Alignment
- Hardware and compilers can impose restrictions on the addresses at which certain types of object can appear
  - The address of any int is a multiple of 4 (on x86-64)
  - The address of any pointer is a multiple of 8
- This is called alignment
- The C++ alignof operator returns the alignment of a type or object
  - All objects of the same type have the same alignment
  - alignof(std::max_align_t) is the strictest alignment required by any primitive type (16 on x86-64)
Alignments of primitive types
| Type | Alignment |
|---|---|
| char | 1 |
| short | 2 |
| int | 4 |
| long | 8 |
| long long | 8 |
| float | 4 |
| double | 8 |
| long double | 16 |
| T* | 8 |
- Except for alignof(char), these values can change
  - On some x86-64 operating systems and compilers, alignof(long) = 4!
  - On x86-32 Linux, alignof(double) = 4
Alignments of compound types
- The members of a compound object must obey alignment restrictions
- This imposes alignment requirements on the compound type
Struct rule
- Struct members are laid out sequentially in memory with no gaps, obeying alignment restrictions
  - This requires rounding up offsets to multiples of the alignment
[1] See, for example, the C++ standard 6.8.1.2 [basic.types.general]: “the underlying bytes making up [any simple] object can be copied into an array of char, unsigned char, or std::byte (17.2.1). If the content of that array is copied back into the object, the object shall subsequently hold its original value.”