Registers
Registers are the fastest kind of memory available in the machine. x86-64 has 14 general-purpose registers and several special-purpose registers. This table gives all the basic registers, with the special-purpose registers grouped at the end. You’ll notice different naming conventions, a side effect of the long history of the x86 architecture (the 8086 was first released in 1978).
Full register name | 32-bit | 16-bit | 8-bit low | 8-bit high | Use in calling convention | Callee-saved? |
---|---|---|---|---|---|---|
General-purpose registers: | ||||||
%rax | %eax | %ax | %al | %ah | Return value (accumulator) | No |
%rbx | %ebx | %bx | %bl | %bh | – | Yes |
%rcx | %ecx | %cx | %cl | %ch | 4th function argument | No |
%rdx | %edx | %dx | %dl | %dh | 3rd function argument | No |
%rsi | %esi | %si | %sil | – | 2nd function argument | No |
%rdi | %edi | %di | %dil | – | 1st function argument | No |
%r8 | %r8d | %r8w | %r8b | – | 5th function argument | No |
%r9 | %r9d | %r9w | %r9b | – | 6th function argument | No |
%r10 | %r10d | %r10w | %r10b | – | – | No |
%r11 | %r11d | %r11w | %r11b | – | – | No |
%r12 | %r12d | %r12w | %r12b | – | – | Yes |
%r13 | %r13d | %r13w | %r13b | – | – | Yes |
%r14 | %r14d | %r14w | %r14b | – | – | Yes |
%r15 | %r15d | %r15w | %r15b | – | – | Yes |
Special-purpose registers: | ||||||
%rsp | %esp | %sp | %spl | – | Stack pointer | Yes |
%rbp | %ebp | %bp | %bpl | – | Base pointer | Yes |
%rip | %eip | %ip | – | – | Instruction pointer | * |
%rflags | %eflags | %flags | – | – | Flags and condition codes | No |
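To make the argument columns concrete, here is a small C++ sketch. The function name sum6 is hypothetical, and the register comments assume the calling convention listed in the table; a compiler that inlines the call may never place the arguments in these registers at all.

```cpp
// Hypothetical six-argument function. Under the calling convention in the
// table, a non-inlined call would pass the arguments in these registers.
long sum6(long a,    // %rdi (1st argument)
          long b,    // %rsi (2nd argument)
          long c,    // %rdx (3rd argument)
          long d,    // %rcx (4th argument)
          long e,    // %r8  (5th argument)
          long f) {  // %r9  (6th argument)
    return a + b + c + d + e + f;  // the result is returned in %rax
}
```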
Note that unlike primary memory (which is what we think of when we discuss memory in a C/C++ program), registers have no addresses! There is no address value that, if cast to a pointer and dereferenced, would return the contents of the %rax register. Registers live in a separate world from the memory whose contents are partially prescribed by the C abstract machine.
The %rbp register has a special purpose: it points to the bottom of the current function’s stack frame, and local variables are often accessed relative to its value. However, when optimization is on, the compiler may determine that all local variables can be stored in registers. This frees up %rbp for use as another general-purpose register.
The relationship between different register bit widths is a little weird.
- Loading a value into a 32-bit register name sets the upper 32 bits of the register to zero. Thus, after movl $-1, %eax, the %rax register has value 0x00000000FFFFFFFF.
- Loading a value into a 16- or 8-bit register name leaves all other bits unchanged.
There are special instructions for loading signed and unsigned 8-, 16-, and 32-bit quantities into registers, recognizable by instruction suffixes. For instance, movzbl moves an 8-bit quantity (a byte) into a 32-bit register (a longword) with zero extension; movslq moves a 32-bit quantity (longword) into a 64-bit register (quadword) with sign extension. There’s no need for movzlq (why?).
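In C++ source, these extensions correspond to ordinary integer conversions. The sketch below shows the two conversions named above; the function names are hypothetical, and which instruction a compiler actually picks depends on context and optimization level.

```cpp
#include <cstdint>

// Zero extension, as movzbl does: an unsigned 8-bit value widened to 32 bits.
std::uint32_t zero_extend_byte(std::uint8_t b) {
    return b;  // the upper 24 bits of the result are 0
}

// Sign extension, as movslq does: a signed 32-bit value widened to 64 bits.
std::int64_t sign_extend_long(std::int32_t l) {
    return l;  // the upper 32 bits of the result copy the sign bit
}
```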
Instruction format
The basic kinds of assembly instructions are:
- Computation. These instructions perform computation on values, typically values stored in registers. Most have zero or one source operands and one source/destination operand, with the source operand coming first. For example, the instruction addq %rax, %rbx performs the computation %rbx := %rbx + %rax.
- Data movement. These instructions move data between registers and memory. Almost all have one source operand and one destination operand; the source operand comes first.
- Control flow. Normally the CPU executes instructions in sequence. Control flow instructions change the instruction pointer in other ways. There are unconditional branches (the instruction pointer is set to a new value), conditional branches (the instruction pointer is set to a new value if a condition is true), and function call and return instructions.
(We use the “AT&T syntax” for x86-64 assembly. For the “Intel syntax,” which you can find in online documentation from Intel, see the Aside in CS:APP3e §3.3, p177, or Wikipedia, or other online resources. AT&T syntax is distinguished by several features, but especially by the use of percent signs for registers. Sadly, the Intel syntax puts destination registers before source registers.)
Some instructions appear to combine computation and data movement. For example, given the C code int* ip; ... ++(*ip); the compiler might generate incl (%rax) rather than movl (%rax), %ebx; incl %ebx; movl %ebx, (%rax). However, the processor actually divides these complex instructions into tiny, simpler, invisible instructions called microcode, because the simpler instructions can be made to execute faster. The complex incl instruction actually runs in three phases: data movement, then computation, then data movement. This matters when we introduce parallelism.
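The three phases can be written out at the source level. In this sketch, increment is the one-statement form from the text and increment_in_phases (a made-up name) spells out the load, the computation, and the store separately; both have the same effect on *ip.

```cpp
// One-statement form; a compiler might emit the single instruction incl (%rax).
void increment(int* ip) {
    ++(*ip);
}

// The same work spelled out as the three phases the text describes.
void increment_in_phases(int* ip) {
    int tmp = *ip;  // data movement: load the value from memory
    tmp = tmp + 1;  // computation
    *ip = tmp;      // data movement: store the result back to memory
}
```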
Directives
Assembly generated by a compiler contains instructions as well as labels and directives. Labels look like labelname: or labelnumber:; directives look like .directivename arguments. Labels are markers in the generated assembly, used to compute addresses. We usually see them used in control flow instructions, as in jmp L3 (“jump to L3”). Directives are instructions to the assembler; for instance, the .globl L directive says “label L is globally visible in the executable”, .align sets the alignment of the following data, .long puts a number in the output, and .text and .data define the current segment.
We also frequently look at assembly that is disassembled from executable instructions by GDB, objdump -d, or objdump -S. This output looks different from compiler-generated assembly: in disassembled instructions, there are no intermediate labels or directives. This is because the labels and directives disappear during the process of generating executable instructions.
For instance, here is some compiler-generated assembly:
.globl _Z1fiii
.type _Z1fiii, @function
_Z1fiii:
.LFB0:
cmpl %edx, %esi
je .L3
movl %esi, %eax
ret
.L3:
movl %edi, %eax
ret
.LFE0:
.size _Z1fiii, .-_Z1fiii
And a disassembly of the same function, from an object file:
0000000000000000 <_Z1fiii>:
0: 39 d6 cmp %edx,%esi
2: 74 03 je 7 <_Z1fiii+0x7>
4: 89 f0 mov %esi,%eax
6: c3 retq
7: 89 f8 mov %edi,%eax
9: c3 retq
Everything but the instructions is removed, and the helpful .L3 label has been replaced with an actual address. The function appears to be located at address 0. This is just a placeholder; the final address is assigned by the linking process, when a final executable is created.
Finally, here is some disassembly from an executable:
0000000000400517 <_Z1fiii>:
400517: 39 d6 cmp %edx,%esi
400519: 74 03 je 40051e <_Z1fiii+0x7>
40051b: 89 f0 mov %esi,%eax
40051d: c3 retq
40051e: 89 f8 mov %edi,%eax
400520: c3 retq
The instructions are the same, but the addresses are different. (Other compiler flags would generate different addresses.)
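For orientation, here is one C++ function that could plausibly compile to the listings above. The original source is not shown in these notes, so the body and the parameter names are assumptions; the signature f(int, int, int) is what the mangled name _Z1fiii decodes to.

```cpp
// Hypothetical source. Arguments arrive in %edi (a), %esi (b), and %edx (c).
int f(int a, int b, int c) {
    if (b == c) {  // cmpl %edx, %esi ; je .L3
        return a;  // .L3: movl %edi, %eax ; ret
    }
    return b;      // movl %esi, %eax ; ret
}
```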
Address modes
Most instruction operands use the following syntax for values. (See also CS:APP3e Figure 3.3 in §3.4.1, p181.)
Type | Example syntax | Value used |
---|---|---|
Register | %rbp | Contents of %rbp |
Immediate | $0x4 | 0x4 |
Memory | 0x4 | Value stored at address 0x4 |
Memory | symbol_name | Value stored in global symbol_name |
Memory | symbol_name(%rip) | %rip-relative addressing for global (see below) |
Memory | (%rax) | Value stored at address in %rax |
Memory | 0x4(%rax) | Value stored at address %rax + 4 |
Memory | (%rax,%rbx) | Value stored at address %rax + %rbx |
Memory | (%rax,%rbx,4) | Value stored at address %rax + %rbx*4 |
Memory | 0x18(%rax,%rbx,4) | Value stored at address %rax + 0x18 + %rbx*4 |
The full form of a memory operand is offset(base,index,scale), which refers to the address offset + base + index*scale. In 0x18(%rax,%rbx,4), %rax is the base, 0x18 the offset, %rbx the index, and 4 the scale. The offset (if used) must be a constant and the base and index (if used) must be registers; the scale must be either 1, 2, 4, or 8. The default offset, base, and index are 0, and the default scale is 1.
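The address arithmetic is simple enough to write as a function. This sketch (the name effective_address is hypothetical) takes the four parts of an operand as plain integers and returns the address the operand names; callers pass 0 or 1 for omitted parts, following the defaults above.

```cpp
#include <cstdint>

// Address named by the operand offset(base,index,scale).
// An omitted offset, base, or index counts as 0; an omitted scale counts as 1.
// The offset is signed, so it may be negative.
std::uint64_t effective_address(std::int64_t offset, std::uint64_t base,
                                std::uint64_t index, std::uint64_t scale) {
    return base + index * scale + static_cast<std::uint64_t>(offset);
}
```

For 0x18(%rax,%rbx,4) with %rax holding 0x1000 and %rbx holding 3, the call effective_address(0x18, 0x1000, 3, 4) gives 0x1024.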
symbol_names are found only in compiler-generated assembly; disassembly uses raw addresses (0x601030) or %rip-relative offsets (0x200bf2(%rip)).
Jumps and function call instructions use different syntax 🤷🏽♀️: *, rather than (), represents indirection.
Type | Example syntax | Address used |
---|---|---|
Register | *%rax | Contents of %rax |
Immediate | .L3 | Address of .L3 (compiler-generated assembly) |
Immediate | 400410 or 0x400410 | Given address |
Memory | *0x200b96(%rip) | Value stored at address %rip + 0x200b96 |
Memory | *(%r12,%rbp,8) | Other address modes accepted |
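Indirect targets typically show up when C++ code calls through a function pointer. In the sketch below, twice and apply are hypothetical names; because the target of fn(x) is known only at run time, a compiler that does not inline the call would need an indirect call or jump through a register, such as *%rax.

```cpp
// Hypothetical helper that will be passed by pointer.
int twice(int x) {
    return 2 * x;
}

// The call target is a run-time value, so a non-inlined version of this
// function would use an indirect call or jump rather than a fixed address.
int apply(int (*fn)(int), int x) {
    return fn(x);
}
```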
Address computations
The offset(base,index,scale) form compactly expresses many array-style address computations. It’s typically used with a mov-type instruction to dereference memory. However, the compiler can use that form to compute addresses, thanks to the lea (Load Effective Address) instruction.
For instance, in movl 0x18(%rax,%rbx,4), %ecx, the address %rax + 0x18 + %rbx*4 is computed, then immediately dereferenced: the 4-byte value located there is loaded into %ecx. In leaq 0x18(%rax,%rbx,4), %rcx, the same address is computed, but it is not dereferenced. Instead, the computed address is moved into register %rcx.
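At the source level, the difference between the two instructions is the difference between reading an array element and taking its address. The sketch below assumes 4-byte ints, so the 0x18 offset (24 bytes) corresponds to 6 elements; the function names are made up for illustration.

```cpp
// With a in %rax and i in %rbx, 0x18(%rax,%rbx,4) names the int a[i + 6],
// because 0x18 is 24 bytes and each int is assumed to occupy 4 bytes.

// Like movl: compute the address, then load the 4-byte value stored there.
int load_element(const int* a, long i) {
    return a[i + 6];
}

// Like leaq: compute the same address, but do not dereference it.
const int* address_of_element(const int* a, long i) {
    return &a[i + 6];
}
```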
Thanks to lea, the compiler will also prefer the offset(base,index,scale) form over add and mov for certain computations on integers. For example, this instruction:
leaq (%rax,%rbx,2), %rcx
performs the function %rcx := %rax + 2 * %rbx, but in one instruction, rather than three (movq %rax, %rcx; addq %rbx, %rcx; addq %rbx, %rcx).
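A source-level function of the same shape is shown below. The name add_twice is hypothetical; with its arguments in %rdi and %rsi, a compiler might emit a single leaq (%rdi,%rsi,2), %rax for the body, though that choice is up to the compiler.

```cpp
// Integer arithmetic of the shape base + index*scale, with no memory access.
long add_twice(long a, long b) {
    return a + 2 * b;
}
```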
%rip-relative addressing
x86-64 code often refers to globals using %rip-relative addressing: a global variable named a is referenced as a(%rip) rather than a.
This style of reference supports position-independent code (PIC), a security feature. It specifically supports position-independent executables (PIEs), which are programs that work independently of where their code is loaded into memory.
To run a conventional program, the operating system loads the program’s instructions into memory at a fixed address that’s the same every time, then starts executing the program at its first instruction. This works great, and runs the program in a predictable execution environment (the addresses of functions and global variables are the same every time). Unfortunately, the very predictability of this environment makes the program easier to attack.
In a position-independent executable, the operating system loads the program at varying locations: every time it runs, the program’s functions and global variables have different addresses. This makes the program harder to attack (though not impossible).
Program startup performance matters, so the operating system doesn’t recompile the program with different addresses each time. Instead, the compiler does most of the work in advance by using relative addressing.
When the operating system loads a PIE, it picks a starting point and loads all instructions and globals relative to that starting point. The PIE’s instructions never refer to global variables using direct addressing: you’ll never see movl global_int, %eax. Globals are referenced relatively instead, using deltas relative to the next %rip: movl global_int(%rip), %eax. These relative addresses work great independent of starting point! For instance, consider an instruction located at (starting-point + 0x80) that loads a variable g located at (starting-point + 0x1000) into %rax. In a non-PIE with starting point 0x400000, the instruction might be written movq 0x401000, %rax (in compiler output, movq g, %rax); but this relies on g having a fixed address. In a PIE, the instruction might be written movq 0xf80(%rip), %rax (in compiler output, movq g(%rip), %rax), which works out beautifully no matter the starting point.
If the starting point is… | The instruction is at… | g is at… | With delta… |
---|---|---|---|
0x400000 | 0x400080 | 0x401000 | 0xF80 |
0x404000 | 0x404080 | 0x405000 | 0xF80 |
0x4003F0 | 0x400470 | 0x4013F0 | 0xF80 |
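The table's arithmetic can be checked with a few lines of C++. The sketch below uses made-up names (kInstructionOffset, kGlobalOffset, delta_to_g) and, like the table, measures the delta from the instruction's own address for simplicity.

```cpp
#include <cstdint>

// Offsets from the example: the instruction sits 0x80 past the starting
// point, and g sits 0x1000 past it.
constexpr std::uint64_t kInstructionOffset = 0x80;
constexpr std::uint64_t kGlobalOffset = 0x1000;

// Delta from the instruction to g for a given starting point.
std::uint64_t delta_to_g(std::uint64_t starting_point) {
    std::uint64_t instruction = starting_point + kInstructionOffset;
    std::uint64_t g = starting_point + kGlobalOffset;
    return g - instruction;  // the starting point cancels out
}
```

Because the starting point is added to both addresses, it drops out of the subtraction, which is why every row of the table has the same delta.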