Assembly 1
Registers
x86-64 has 14 general-purpose registers and several special-purpose registers. This table gives all the basic registers; the special-purpose ones are `%rbp`, `%rsp`, `%rip`, and `%rflags`. You’ll notice different naming conventions, a side effect of the long history of the x86 architecture (the 8086 was first released in 1978).
Register | 32-bit | 16-bit | 8-bit low | 8-bit high | Calling convention/Name | Callee-saved? |
---|---|---|---|---|---|---|
`%rax` | `%eax` | `%ax` | `%al` | `%ah` | Return value | No |
`%rbx` | `%ebx` | `%bx` | `%bl` | `%bh` | – | Yes |
`%rcx` | `%ecx` | `%cx` | `%cl` | `%ch` | 4th parameter | No |
`%rdx` | `%edx` | `%dx` | `%dl` | `%dh` | 3rd parameter | No |
`%rsi` | `%esi` | `%si` | `%sil` | – | 2nd parameter | No |
`%rdi` | `%edi` | `%di` | `%dil` | – | 1st parameter | No |
`%rbp` | `%ebp` | `%bp` | `%bpl` | – | Base pointer | Yes |
`%rsp` | `%esp` | `%sp` | `%spl` | – | Stack pointer | Yes |
`%r8` | `%r8d` | `%r8w` | `%r8b` | – | 5th parameter | No |
`%r9` | `%r9d` | `%r9w` | `%r9b` | – | 6th parameter | No |
`%r10` | `%r10d` | `%r10w` | `%r10b` | – | – | No |
`%r11` | `%r11d` | `%r11w` | `%r11b` | – | – | No |
`%r12` | `%r12d` | `%r12w` | `%r12b` | – | – | Yes |
`%r13` | `%r13d` | `%r13w` | `%r13b` | – | – | Yes |
`%r14` | `%r14d` | `%r14w` | `%r14b` | – | – | Yes |
`%r15` | `%r15d` | `%r15w` | `%r15b` | – | – | Yes |
`%rip` | `%eip` | `%ip` | – | – | Instruction pointer | \* |
`%rflags` | `%eflags` | `%flags` | – | – | Flags and condition codes | No |
The %rbp register has a special purpose: it points to the bottom of the current function’s stack frame, and local variables are often accessed relative to its value. However, when optimization is on, the compiler may determine that all local variables can be stored in registers. This frees up %rbp for use as another general-purpose register.
The relationship between different register bit widths is a little weird.
- Loading a value into a 32-bit register name sets the upper 32 bits of the register to zero. Thus, after `movl $-1, %eax`, the `%rax` register has value 0x00000000FFFFFFFF.
- Loading a value into a 16- or 8-bit register name leaves all other bits unchanged.
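The two rules above can be modeled in C by treating a register as a 64-bit integer. This is only a sketch of the semantics, not how hardware implements them; the function names `write32`, `write16`, and `write8_low` are made up for illustration.

```c
#include <stdint.h>

/* Illustrative model: each function returns the new 64-bit register
   contents after a write through a narrower register name. */

/* Writing a 32-bit name (e.g. %eax) zeroes the upper 32 bits. */
uint64_t write32(uint64_t reg, uint32_t value) {
    (void) reg;                 /* the old contents do not survive */
    return (uint64_t) value;
}

/* Writing a 16-bit name (e.g. %ax) replaces only bits 0-15. */
uint64_t write16(uint64_t reg, uint16_t value) {
    return (reg & ~(uint64_t) 0xFFFF) | value;
}

/* Writing an 8-bit low name (e.g. %al) replaces only bits 0-7. */
uint64_t write8_low(uint64_t reg, uint8_t value) {
    return (reg & ~(uint64_t) 0xFF) | value;
}
```

For example, starting from 0x1122334455667788, a 32-bit write of all ones leaves 0x00000000FFFFFFFF, while a 16-bit write of 0xABCD leaves 0x112233445566ABCD.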
There are special instructions for loading signed and unsigned 8-, 16-, and 32-bit quantities into registers, recognizable by instruction suffixes. For instance, `movzbl` moves an 8-bit quantity (a byte) into a 32-bit register (a longword) with zero extension; `movslq` moves a 32-bit quantity (longword) into a 64-bit register (quadword) with sign extension. There’s no need for `movzlq` (why?).
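In C, these extensions correspond to ordinary integer conversions. The sketch below shows conversions for which a compiler would typically (though not necessarily) emit the instructions named in the comments; the function names are illustrative.

```c
#include <stdint.h>

/* Unsigned 8-bit to 32-bit: zero extension (movzbl-style). */
uint32_t zero_extend_byte_to_long(uint8_t b) {
    return (uint32_t) b;
}

/* Signed 8-bit to 32-bit: sign extension (movsbl-style). */
int32_t sign_extend_byte_to_long(int8_t b) {
    return (int32_t) b;
}

/* Signed 32-bit to 64-bit: sign extension (movslq-style). */
int64_t sign_extend_long_to_quad(int32_t l) {
    return (int64_t) l;
}
```

Zero extension fills the new upper bits with zeros, so the byte 0xFF becomes 255; sign extension copies the sign bit, so a negative value stays the same negative value at the wider width.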
Instruction format
The basic kinds of assembly instructions are:
- Computation. These instructions perform computation on values stored in registers. Most have zero or one source registers and one source/destination register. The source register, if any, comes first. For example, the instruction `addq %rax, %rbx` performs the computation `%rbx := %rbx + %rax`.
- Data movement. These instructions move data between registers and memory. Almost all have one source operand and one destination operand; the source comes first.
- Control flow. Normally the CPU executes instructions in sequence. Control flow instructions change the instruction pointer in other ways. There are unconditional branches (the instruction pointer is set to a new value), conditional branches (the instruction pointer is set to a new value if a condition is true), and function call and return instructions.
(We use the “AT&T syntax” for x86-64 assembly. For the “Intel syntax,” which you can find in online documentation from Intel, see the Aside in CS:APP3e §3.3, p177, or Wikipedia, or other online resources. AT&T syntax is distinguished by several features, but especially by the use of percent signs for registers. Sadly, the Intel syntax puts destination registers before source registers.)
Some instructions appear to combine computation and data movement. For example, given the C code `int* ip; ... ++(*ip);` the compiler might generate `incl (%rax)` rather than `movl (%rax), %ebx; incl %ebx; movl %ebx, (%rax)`. However, such mixed instructions are actually executed as distinct phases: data movement, then computation, then data movement. This matters when we introduce parallelism.
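Written out in C, the three phases look like the sketch below. The function name `increment_in_phases` and the temporary `tmp` are illustrative; `tmp` plays the role of the scratch register.

```c
/* ++(*ip) spelled out as load, compute, store. */
void increment_in_phases(int *ip) {
    int tmp = *ip;   /* data movement: memory to register */
    tmp = tmp + 1;   /* computation in the register */
    *ip = tmp;       /* data movement: register back to memory */
}
```

Nothing prevents another thread from touching `*ip` between the load and the store, which is why the phase boundaries matter once code runs in parallel.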
Directives
Assembly code interleaves instructions with labels and directives. Labels look like `labelname:` or `labelnumber:`; directives look like `.directivename arguments`. Labels are markers in the generated assembly, used to compute addresses; we usually see them used in control flow instructions. Directives are instructions to the assembler; for instance, the `.globl L` directive says “label `L` is globally visible in the executable”, `.align` sets the alignment of the following data, `.long` puts a number in the output, and `.text` and `.data` define the current segment.
Stack
The architecture has special support for the stack segment thanks to the `push`, `pop`, `call`, and `ret` instructions.
A `push` instruction pushes a value onto the stack. This both modifies the stack pointer (making it smaller) and modifies the stack segment (by moving data there). For instance, the instruction `pushq X` means:

    subq $8, %rsp
    movq X, (%rsp)

And `popq X` undoes the effect of `pushq X`. It means:

    movq (%rsp), X
    addq $8, %rsp

`X` can be a register or a memory reference. Since pushing things onto the stack causes the stack pointer’s address value to shrink, we often say that the “stack grows down.”
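To see the stack pointer shrink and grow, here is a toy model in C. It is a sketch under simplifying assumptions: the `toy_machine` type and its functions are invented for illustration, the stack pointer is kept as an offset into a small array rather than a real address, and there is no overflow checking.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Toy machine: a small stack segment plus a stack pointer, stored as
   an offset into the segment. An empty stack has rsp == STACK_BYTES. */
typedef struct {
    uint8_t stack[STACK_BYTES];
    uint64_t rsp;
} toy_machine;

void toy_init(toy_machine *m) {
    memset(m->stack, 0, sizeof m->stack);
    m->rsp = STACK_BYTES;
}

/* pushq X:  subq $8, %rsp;  movq X, (%rsp) */
void toy_pushq(toy_machine *m, uint64_t x) {
    m->rsp -= 8;                        /* stack pointer gets smaller */
    memcpy(&m->stack[m->rsp], &x, 8);   /* store the value at the new top */
}

/* popq X:  movq (%rsp), X;  addq $8, %rsp */
uint64_t toy_popq(toy_machine *m) {
    uint64_t x;
    memcpy(&x, &m->stack[m->rsp], 8);   /* load the value at the top */
    m->rsp += 8;                        /* stack pointer gets larger */
    return x;
}
```

Pushing 1 and then 2 lowers `rsp` by 16; popping twice returns 2 and then 1 and restores `rsp` to its starting value.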
`callq ADDR` has the effect of these pseudo-instructions: `pushq [NEXT INSTRUCTION POINTER]; jmp ADDR`.

`retq` has the effect of this pseudo-instruction: `popq %rip`.
Address modes
Instruction arguments use the following syntax. (See also CS:APP3e Figure 3.3 in §3.4.1, p181.)
Type | Example syntax | Value used |
---|---|---|
Register | `%rbp` | Contents of `%rbp` |
Immediate | `$0x4` | 0x4 |
 | `$symbol_name` | Address of global `symbol_name` |
Memory | `0x4` | Value at address 0x4 |
 | `symbol_name` | Value at address of global `symbol_name` |
 | `(%rax)` | Value located at address in `%rax` |
 | `0x4(%rax)` | Value located at address `%rax + 4` |
 | `(%rax,%rbx)` | Value located at address `%rax + %rbx` |
 | `(%rax,%rbx,4)` | Value located at address `%rax + %rbx*4` |
 | `0x18(%rax,%rbx,4)` | Value located at address `%rax + 0x18 + %rbx*4` |
In the full form `0x18(%rax,%rbx,4)`, `%rax` is the base, `0x18` the offset, `%rbx` the index, and `4` the scale. The scale must be 1, 2, 4, or 8.
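The address computation for the full form can be written as a one-line C function. This is a sketch with an illustrative name, `effective_address`; it does not check that the scale is one of the permitted values.

```c
#include <stdint.h>

/* Address named by OFFSET(BASE,INDEX,SCALE): base + offset + index*scale.
   Arithmetic wraps modulo 2^64, like address arithmetic on x86-64. */
uint64_t effective_address(int64_t offset, uint64_t base,
                           uint64_t index, uint64_t scale) {
    return base + (uint64_t) offset + index * scale;
}
```

With `%rax` holding 0x1000 and `%rbx` holding 3, the operand `0x18(%rax,%rbx,4)` names address 0x1000 + 0x18 + 3\*4 = 0x1024. The simpler forms in the table are the same formula with the missing parts set to zero (or the scale set to 1).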
x86-64 code often refers to globals using `%rip`-relative addressing; you might see `a(%rip)` instead of just plain `a`. This style of global reference supports features like position-independent code (e.g., shared libraries) and address space layout randomization (improves security).
Conditional branches
Arithmetic instructions change part of the `%rflags` register as a side effect of their operation. The most often used flags are listed below, where $W$ is the operand width in bits:
- ZF (zero flag): set iff the result was zero.
- SF (sign flag): set iff the most significant bit (the sign bit) of the result was one (i.e., the result was negative if considered as a signed integer).
- CF (carry flag): set iff the result overflowed when considered as unsigned (i.e., the true result was greater than $2^W - 1$ or, for a subtraction, less than 0).
- OF (overflow flag): set iff the result overflowed when considered as signed (i.e., the true result was greater than $2^{W-1} - 1$ or less than $-2^{W-1}$).

Although some instructions let you load specific flags into registers (e.g., `setz`; see CS:APP3e §3.6.2, p203), code more often accesses them via conditional jump or conditional move instructions.
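The flag definitions can be made concrete by computing them for a 32-bit addition in C. This is a sketch of the definitions, not of how the hardware derives the flags; the `flags` struct and `flags_after_addl` are illustrative names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool zf, sf, cf, of;
} flags;

/* Flags that a 32-bit add of a and b would produce. */
flags flags_after_addl(uint32_t a, uint32_t b) {
    uint32_t r = a + b;          /* wraps modulo 2^32 */
    flags f;
    f.zf = (r == 0);             /* result is zero */
    f.sf = (r >> 31) != 0;       /* sign bit of the result */
    f.cf = r < a;                /* unsigned overflow: the sum wrapped */
    /* Signed overflow: a and b have the same sign bit, but the
       result's sign bit differs from theirs. */
    f.of = ((~(a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}
```

Adding 1 to 0xFFFFFFFF sets ZF and CF but not OF (as signed values, -1 + 1 = 0 fits). Adding 1 to 0x7FFFFFFF sets SF and OF but not CF (the unsigned sum fits, the signed one does not).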
Instruction | Mnemonic | C example | Flags |
---|---|---|---|
`jmp` | Jump | `break;` | (Unconditional) |
`je` (`jz`) | Jump if equal (zero) | `if (x == y)` | ZF |
`jne` (`jnz`) | Jump if not equal (nonzero) | `if (x != y)` | !ZF |
`jg` (`jnle`) | Jump if greater | `if (x > y)`, signed | !ZF && !(SF ^ OF) |
`jge` (`jnl`) | Jump if greater or equal | `if (x >= y)`, signed | !(SF ^ OF) |
`jl` (`jnge`) | Jump if less | `if (x < y)`, signed | SF ^ OF |
`jle` (`jng`) | Jump if less or equal | `if (x <= y)`, signed | (SF ^ OF) \|\| ZF |
`ja` (`jnbe`) | Jump if above | `if (x > y)`, unsigned | !CF && !ZF |
`jae` (`jnb`) | Jump if above or equal | `if (x >= y)`, unsigned | !CF |
`jb` (`jnae`) | Jump if below | `if (x < y)`, unsigned | CF |
`jbe` (`jna`) | Jump if below or equal | `if (x <= y)`, unsigned | CF \|\| ZF |
`js` | Jump if sign bit | `if (x < 0)`, signed | SF |
`jns` | Jump if not sign bit | `if (x >= 0)`, signed | !SF |
The `test` and `cmp` instructions are frequently seen before a conditional branch. These operations perform arithmetic but throw away the result, except for condition codes. `test` performs bitwise AND; `cmp` performs subtraction. `cmp` is hard to grasp: remember that `subq %rax, %rbx` performs `%rbx := %rbx - %rax`, so the source/destination operand, which is written second in the instruction, is the left-hand side of the subtraction. So `cmpq %rax, %rbx` evaluates `%rbx - %rax`. The sequence `cmpq %rax, %rbx; jg L` will jump to label `L` if `%rbx` is greater than `%rax` (signed).
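To check the operand order, it can help to model `cmpq` followed by a conditional jump in C. The sketch below computes the flags of `dst - src` and then applies the `jg` and `ja` conditions from the table; `cmp_flags`, `flags_after_cmpq`, `jg_taken`, and `ja_taken` are illustrative names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool zf, sf, cf, of;
} cmp_flags;

/* cmpq SRC, DST computes DST - SRC and keeps only the flags. */
cmp_flags flags_after_cmpq(uint64_t src, uint64_t dst) {
    uint64_t r = dst - src;      /* wraps modulo 2^64 */
    cmp_flags f;
    f.zf = (r == 0);
    f.sf = (r >> 63) != 0;
    f.cf = dst < src;            /* unsigned borrow */
    /* Signed overflow: operands differ in sign, and the result's
       sign differs from dst's. */
    f.of = (((dst ^ src) & (dst ^ r)) >> 63) != 0;
    return f;
}

/* jg: !ZF && !(SF ^ OF) */
bool jg_taken(cmp_flags f) {
    return !f.zf && !(f.sf ^ f.of);
}

/* ja: !CF && !ZF */
bool ja_taken(cmp_flags f) {
    return !f.cf && !f.zf;
}
```

With `%rax` = 3 and `%rbx` = 5, `cmpq %rax, %rbx; jg L` corresponds to `jg_taken(flags_after_cmpq(3, 5))`, which is true. When `%rbx` holds all ones and `%rax` holds 1, `jg` is not taken (as signed values, -1 is not greater than 1) but `ja` is (the unsigned value is huge), which is the signed/unsigned distinction from the table.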