Overview
We finish our discussion of assembly address modes and arithmetic instructions, then discuss calling conventions and control flow in machine code.
Sidebar: Type-safe linkage and mangled names
- The name of a C++ function encodes the types of its arguments
- This makes C++ compilations safer and supports overloading (functions with different behavior based on argument types)
- Example: f(int) ⟶ _Z1fi
  - _Z: This is a mangled name
  - 1: Function name is 1 character long
  - f: Actual function name
  - i: First argument is int
- To demangle, try c++filt MANGLEDNAME
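Demangling can also be done from inside a program. The following is a minimal sketch, assuming a GCC or Clang toolchain that follows the Itanium C++ ABI and provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is hypothetical and not part of the lecture code.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: return the demangled form of `mangled`,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);   // the caller owns the malloc'd buffer
    return result;
}
```

For example, demangle("_Z1fi") should yield f(int), matching the breakdown above.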
Data operands and address modes
- $X: an immediate value (a constant)
- %X: a register value
- a(%rip): a global symbol
- (%X): an indirect reference (dereferencing a “pointer”)
- 8(%X): an offset indirect reference (dereferencing a structure or array)
  - N(R) means dereference memory at address R+N
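To connect these operand forms to source code, here is a rough sketch of C++ functions whose bodies would typically compile to one operand of each kind. The names Point, counter, and the function names are made up for illustration, and the operand shown in each comment is what an optimizing x86-64 compiler on Linux would plausibly emit (assuming an 8-byte long), not a guarantee.

```cpp
struct Point {
    long x;   // byte offset 0
    long y;   // byte offset 8 when long is 8 bytes
};

long counter = 7;   // a global symbol

long use_immediate()               { return 61; }       // $61
long use_register(long a)          { return a; }        // %rdi
long use_global()                  { return counter; }  // counter(%rip)
long use_indirect(const long* p)   { return *p; }       // (%rdi)
long use_offset(const Point* p)    { return p->y; }     // 8(%rdi)
```

Compiling a file like this with optimization and reading the generated .s file is one way to check which operands a particular compiler actually chooses.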
Arithmetic instructions
- General format: OP SRC, DST
  - This means DST := DST OP SRC
- addl %eax, %ebx means %ebx := %ebx + %eax
- subq %rdi, %r9 means %r9 := %r9 - %rdi
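The two-operand form can be modeled in C++ as a function that reads a source and updates a destination in place. This is only a sketch: the function names mirror the instruction mnemonics for readability, fixed-width unsigned types stand in for registers, and condition flags are ignored.

```cpp
#include <cstdint>

// Models `addl SRC, DST`: DST := DST + SRC on 32-bit values.
// Unsigned arithmetic wraps modulo 2^32, as the register does.
void addl(std::uint32_t src, std::uint32_t& dst) {
    dst += src;
}

// Models `subq SRC, DST`: DST := DST - SRC on 64-bit values.
void subq(std::uint64_t src, std::uint64_t& dst) {
    dst -= src;
}
```

Note that the destination is both an input and the output, which is why the source operand comes first and the destination last.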
f08.s
Arithmetic (computation) instructions
- xorl %eax, %eax means %eax := %eax ^ %eax
  - Which means…
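One way to check what this idiom computes is to model it directly. In the sketch below (the function name xor_self is hypothetical), any value XORed with itself is 0, which is why compilers commonly use this instruction to zero a register.

```cpp
#include <cstdint>

// Models `xorl %eax, %eax`: every bit is XORed with itself,
// so the result is 0 regardless of the starting value.
std::uint32_t xor_self(std::uint32_t eax) {
    return eax ^ eax;
}
```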
f09.s, f10.s
f11.s
Moving into register slices
- mov[SIGN][SRCSIZE][DSTSIZE]
  - SIGN is z (extend with zeros) or s (extend with sign bit)
  - SRCSIZE/DSTSIZE is b (byte), w (short), l (int), or q (long)
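The difference between the z and s variants shows up when the source byte has its top bit set. The following sketch models movzbl and movsbl with C++ conversions; the function names are borrowed from the mnemonics for illustration, and the code assumes the usual two's-complement conversion behavior (guaranteed since C++20).

```cpp
#include <cstdint>

// Models `movzbl`: copy a byte into a 32-bit value, filling the
// upper 24 bits with zeros.
std::uint32_t movzbl(std::uint8_t src) {
    return src;
}

// Models `movsbl`: copy a byte into a 32-bit value, filling the
// upper 24 bits with copies of the byte's sign bit (bit 7).
std::uint32_t movsbl(std::uint8_t src) {
    std::int8_t as_signed = static_cast<std::int8_t>(src);
    std::int32_t widened = as_signed;   // sign-extending conversion
    return static_cast<std::uint32_t>(widened);
}
```

With the byte 0xFF, the first function gives 0x000000FF and the second gives 0xFFFFFFFF; with 0x7F both give 0x0000007F.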
f12.s
f13.s, f14.s
f15.s, f16.s, f17.s
More data formats
- (%X,%Y,Z): an array indirect reference
  - Dereference memory at %X + %Y * Z
- Full format: offset(base,index,scale)
  - Dereference memory at offset + base + index * scale
  - offset must be a constant
  - scale must be 1, 2, 4, or 8
  - Default offset, base, and index are 0; default scale is 1
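The address computation in the full format can be written out as ordinary integer arithmetic. This sketch uses hypothetical helper names (effective_address, matches_array_indexing) and treats addresses as integers; it shows that indexing an array of long corresponds to using the array as base, the subscript as index, and sizeof(long) as scale.

```cpp
#include <cstdint>

// Hypothetical helper: the address named by offset(base,index,scale).
// The hardware only accepts scale values of 1, 2, 4, or 8.
std::uintptr_t effective_address(std::intptr_t offset, std::uintptr_t base,
                                 std::uintptr_t index, unsigned scale) {
    return static_cast<std::uintptr_t>(offset) + base + index * scale;
}

// Checks that &a[i] equals the address computed from (a, i, sizeof(long)).
bool matches_array_indexing(const long* a, std::uintptr_t i) {
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(a);
    std::uintptr_t expected = reinterpret_cast<std::uintptr_t>(&a[i]);
    return effective_address(0, base, i, sizeof(long)) == expected;
}
```

For instance, 8(base,index,4) with base 1000 and index 2 names address 8 + 1000 + 2 * 4 = 1016.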
f18.s
The lea instruction
- lea stands for Load Effective Address
  - It performs an address computation, but does not dereference
- Often used by compiler as a parsimonious alternative to array arithmetic
  - leal (%rdi,%rsi,8), %eax vs. movl %esi, %eax; shll $3, %eax; addl %edi, %eax
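The equivalence between the single lea and the three-instruction sequence can be checked with a small model. In this sketch the function names via_lea and via_arith are hypothetical, and 32-bit unsigned values stand in for %edi, %esi, and %eax.

```cpp
#include <cstdint>

// Models `leal (%rdi,%rsi,8), %eax`: one address-style computation,
// truncated to 32 bits, with no memory access.
std::uint32_t via_lea(std::uint32_t edi, std::uint32_t esi) {
    return edi + esi * 8;
}

// Models `movl %esi, %eax; shll $3, %eax; addl %edi, %eax`.
std::uint32_t via_arith(std::uint32_t edi, std::uint32_t esi) {
    std::uint32_t eax = esi;   // movl %esi, %eax
    eax <<= 3;                 // shll $3, %eax  (multiply by 8)
    eax += edi;                // addl %edi, %eax
    return eax;
}
```

Both functions compute edi + 8 * esi modulo 2^32, so a compiler is free to pick the shorter encoding.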
Calling convention
- Some aspects of machine code are fixed by the processor manufacturer
  - Intel decided 0xc3 is the representation of ret
- Some aspects of machine code are set by agreement among compiler and operating system developers
- Intel did not decide which register holds return values
- We call this agreement the calling convention since it governs function calls
- Think Geneva Convention, not Comic Convention
- Different conventions can exist for the same processor (e.g., Unix vs. Windows)
- Only code compiled with the same convention can safely interact
Elements of a calling convention
- Function arguments
- Function return values
- Local variable storage
- Stack alignment
- Memory and processor state when the program begins
Let’s explore: cc01.cc–cc03.cc
Arguments
- Argument registers are %rdi, %rsi, %rdx, %rcx, %r8, %r9, in that order
- Large objects are passed in up to 2 registers if they fit, on the stack otherwise
Return values
- Return register is %rax
- Large objects are returned in %rax+%rdx if they fit, otherwise first argument points to space for return value
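The argument and return rules can be explored with a few small functions, in the spirit of the cc files. This is a sketch, assuming the Unix (System V AMD64) convention and an 8-byte long; the type and function names are hypothetical, and the register assignments in the comments describe what that convention specifies rather than anything C++ itself guarantees.

```cpp
struct Pair {
    long a;
    long b;
};                 // 16 bytes: small enough for two registers

struct Big {
    long v[4];
};                 // 32 bytes: too large to return in registers

// Six integer arguments arrive in %rdi, %rsi, %rdx, %rcx, %r8, %r9;
// the result leaves in %rax.
long sum6(long a, long b, long c, long d, long e, long f) {
    return a + b + c + d + e + f;
}

// A 16-byte result is returned in %rax (a) and %rdx (b).
Pair make_pair_value(long a, long b) {
    return Pair{a, b};
}

// For a larger result the caller passes a pointer to space for the
// return value as a hidden first argument, so x arrives in %rsi.
Big make_big(long x) {
    return Big{{x, x + 1, x + 2, x + 3}};
}
```

Compiling these with optimization and inspecting the assembly is a direct way to confirm which registers a given compiler uses.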