Overview
We discuss branching and save/restore calling conventions in machine code.
Computer programming as translation
- Your idea ⟶ Programming language
- Programming language ⟶ Machine code
State in programming languages
- An unbounded number of variables of arbitrary type
- Dynamic allocation
- Good access control within a running program
- Local variables in one function cannot be accessed from others
- A local variable in a block isn’t accessible outside that block
State in CPUs
- A small number of named registers of limited type
- Computation acts on registers
- A large primary memory accessed by numeric address
- Stores data too large for registers
- Limited access control within a running program
- Instructions have undifferentiated access to memory and registers
- Explicit save & restore often required
Control flow in programming languages
- Sequential execution by default (statements run one after the next)
- Structured control flow: if, while, for
- Unstructured control flow: goto (but that’s “BAD”)
- Functions
- Control flow (call, return) plus state (parameters, return value)
- Complex control flow (e.g. exceptions)
Control flow in CPUs
- Sequential execution by default
- Only unstructured control flow
- Unconditional branch: go to a specific instruction
- Conditional branch: go to a specific instruction if a condition holds
- Functions
- CPU defines control flow (call, return)
- Calling convention defines state (argument registers, return value register)
- Communication with operating systems (next unit)
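To make "only unstructured control flow" concrete, here is a minimal sketch that writes the same loop twice in C: once with while, and once using only if ... goto and goto, which correspond to a conditional branch and an unconditional branch. The function names sum_structured and sum_branches are made up for illustration, and a real compiler may arrange the branches differently.

```c
// Sum the integers 0..n-1 using structured control flow.
unsigned sum_structured(unsigned n) {
    unsigned total = 0;
    unsigned i = 0;
    while (i < n) {
        total += i;
        ++i;
    }
    return total;
}

// The same loop using only branches, roughly the way a CPU sees it.
unsigned sum_branches(unsigned n) {
    unsigned total = 0;
    unsigned i = 0;
loop_test:
    if (i >= n) goto loop_done;   // conditional branch: leave the loop
    total += i;
    ++i;
    goto loop_test;               // unconditional branch: back to the test
loop_done:
    return total;
}
```

Both functions compute the same result for every n; only the shape of the control flow differs.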
Why learn about machine code translation?
- Understand tools
- Gain confidence for debugging
- Gain facility with translations in general
- Many translations use similar skills
Unconditional jumps in assembly
- jmp ADDR: Jump to ADDR
- jmp *%reg: Jump to address stored in %reg
- jmp *ADDR: Jump to address stored in ADDR
- In compiler-generated assembly: jmp LABEL
Conditional jumps in assembly
- je ADDR: Jump to ADDR if equal
- Continue to next instruction if not equal
- If equal to what????
Condition flags
- Arithmetic operations not only produce results, they also change a special-purpose register called the flags register, %rflags/%eflags
- Comprising many boolean flags
- Conditional jumps read flags from this register
- Data movement instructions leave flags unchanged
ZF
- ZF is the zero flag
- Set iff result of computation is zero
- jz: Jump if ZF (ZF==1)
- je: Jump if ZF
- jnz: Jump if !ZF (ZF==0)
- jne: Jump if !ZF
ZF example
- Desired: Jump to .L2 if %rax == %rcx
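One possible way to get this jump, offered as a sketch rather than as the lecture's own answer: subtract one register from the other and jump if ZF is set, since the difference is zero exactly when the two values are equal. The C below models that reasoning. The function names are hypothetical and the assembly in the comment is illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

// Model of ZF after computing rax - rcx in 64-bit arithmetic.
// Illustrative assembly for "jump to .L2 if %rax == %rcx":
//     subq %rcx, %rax     # %rax := %rax - %rcx, sets ZF (overwrites %rax)
//     je   .L2            # taken when ZF == 1
bool zf_after_sub(uint64_t rax, uint64_t rcx) {
    uint64_t result = rax - rcx;   // wraps modulo 2^64, like the hardware
    return result == 0;            // ZF is set iff the result is zero
}

// Returns true when the je in the sketch above would be taken.
bool je_taken(uint64_t rax, uint64_t rcx) {
    return zf_after_sub(rax, rcx);
}
```

The subtraction destroys the old value of %rax, which is a drawback when that value is still needed after the branch.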
Flag-only arithmetic
- Commonly find cmp and test instructions near conditional branches
- These instructions perform arithmetic, but ignore the result except for flags
- cmp SRC1, SRC2: Set flags based on SRC2 - SRC1
- test SRC1, SRC2: Set flags based on SRC2 & SRC1
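The operand order above is easy to get backwards, so here is a small C model of the flags that cmp and test produce for 32-bit operands. It is a sketch with hypothetical function names. It covers only ZF and SF, and the parameters follow the SRC1, SRC2 order used in the bullets.

```c
#include <stdbool.h>
#include <stdint.h>

// ZF after `cmpl src1, src2`: flags come from src2 - src1.
bool zf_after_cmp(uint32_t src1, uint32_t src2) {
    return (uint32_t)(src2 - src1) == 0;
}

// ZF after `testl src1, src2`: flags come from src2 & src1.
bool zf_after_test(uint32_t src1, uint32_t src2) {
    return (src2 & src1) == 0;
}

// SF after `testl src1, src2`: the top bit of src2 & src1.
bool sf_after_test(uint32_t src1, uint32_t src2) {
    return ((src2 & src1) >> 31) != 0;
}
```

A common idiom follows from x & x == x: testing a register against itself, as in testl %eax, %eax, sets ZF exactly when the register is zero and SF exactly when it is negative as a signed value.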
Other flags
- SF: Set iff result is negative (when considered as signed)
- CF: Set iff computation overflowed when considered as unsigned
- OF: Set iff computation overflowed when considered as signed
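The four flags can be computed by hand for a 32-bit subtraction, which is the computation cmp performs. The sketch below is a model, not a description of how the hardware is built. The struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

// The four condition flags discussed above (hypothetical type name).
struct cond_flags {
    bool zf, sf, cf, of;
};

// Flags after `cmpl src1, src2`, that is, after computing src2 - src1.
struct cond_flags flags_after_cmp(uint32_t src1, uint32_t src2) {
    uint32_t result = src2 - src1;      // wraps modulo 2^32
    struct cond_flags f;
    f.zf = (result == 0);               // result is zero
    f.sf = (result >> 31) != 0;         // result is negative as signed
    f.cf = (src2 < src1);               // unsigned overflow (a borrow)
    // Signed overflow: the operands have different signs and the
    // result's sign differs from the sign of src2.
    f.of = (((src2 ^ src1) & (src2 ^ result)) >> 31) != 0;
    return f;
}
```

For example, computing 0 - 1 sets SF and CF but not OF: as unsigned arithmetic it wraps around, while as signed arithmetic the answer -1 is representable.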
Comparison conditional jumps
- Usually associated with cmp
- ja (jnbe): Jump if greater, unsigned: !ZF && !CF
- jg (jnle): Jump if greater, signed: !ZF && (SF == OF)
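The two flag formulas above can be checked against ordinary comparisons. This sketch evaluates each formula after a modeled cmpl src1, src2 on 32-bit operands. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

// Would `cmpl src1, src2; ja TARGET` jump?  ja tests !ZF && !CF.
bool ja_taken(uint32_t src1, uint32_t src2) {
    uint32_t result = src2 - src1;
    bool zf = (result == 0);
    bool cf = (src2 < src1);            // unsigned borrow
    return !zf && !cf;
}

// Would `cmpl src1, src2; jg TARGET` jump?  jg tests !ZF && (SF == OF).
bool jg_taken(uint32_t src1, uint32_t src2) {
    uint32_t result = src2 - src1;
    bool zf = (result == 0);
    bool sf = (result >> 31) != 0;
    bool of = (((src2 ^ src1) & (src2 ^ result)) >> 31) != 0;
    return !zf && (sf == of);
}
```

With src2 = 0xFFFFFFFF and src1 = 1, ja is taken (4294967295 > 1 as unsigned) but jg is not (-1 > 1 is false as signed). The same bits give different answers depending on which jump reads the flags.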
How to read a comparison
- jCONDITION: Jump if previous computation CONDITION zero
- je: Jump if previous computation equaled zero
- ja: Jump if previous computation is greater than zero, unsigned
- jg: Jump if previous computation is greater than zero, signed
- Arithmetic transformations help it make sense
- cmp %edi, %esi; jg .L2
- jg means “jump if previous computation > 0, signed”
- ⟶ “jump if %esi - %edi > 0, signed”
- ⟶ “jump if %esi > %edi, signed”
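As a sketch of where such a pair of instructions might come from, consider the C function below. It assumes the System V x86-64 calling convention, where the first integer argument arrives in %edi and the second in %esi. The function name is made up, and the assembly in the comment is one plausible translation, not guaranteed compiler output.

```c
// With the assumed convention, `a` is in %edi and `b` is in %esi.
// A compiler might translate the `if` as:
//     cmpl %edi, %esi     # flags from %esi - %edi, i.e. b - a
//     jg   .L2            # jump if b - a > 0 (signed), i.e. b > a
int second_is_greater(int a, int b) {
    if (b > a) {
        return 1;          // the code at .L2
    }
    return 0;              // the fall-through path
}
```

Reading the jump back through the same arithmetic transformation recovers the source condition b > a.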
Yet moar
- jae, jge, jb, jl, jbe, jle
- js, jns, jo, jno, jc, jnc
Control flow: f19.s–f24.s
Conditional move instructions
- cmovCONDITION SRC, DST
- Examples: cmovz SRC, DST; cmovl SRC, DST
- Perform move only if condition holds
- Shorthand for jnCONDITION 1f; mov SRC, DST; 1:
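The equivalence between a conditional move and its branching expansion can be sketched in C. The first function mirrors jnCONDITION 1f; mov SRC, DST; 1: step by step, and the second is the branch-free selection that a compiler might implement with cmov. Both function names are hypothetical.

```c
// Branching form: models `jnCONDITION 1f; mov SRC, DST; 1:`.
long select_with_branch(int condition, long src, long dst) {
    if (!condition) goto skip;   // jnCONDITION 1f
    dst = src;                   // mov SRC, DST
skip:                            // 1:
    return dst;
}

// Branch-free form: models `cmovCONDITION SRC, DST`.
long select_with_cmov(int condition, long src, long dst) {
    return condition ? src : dst;
}
```

Both functions return src when the condition holds and the old dst otherwise.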
Control flow: f25.s–f26.s
Stacks and function-local storage
- Stacks grow down: callee storage (inner function) has lower addresses
- The return address, which is stored automatically by callq, marks the boundary between callee storage and caller storage
- Natural storage for local temporaries
- Finite register set; often need to save intermediate values
Saving data on the stack
- Save: subq $8, %rsp; movq REG, (%rsp)
- Or, in one instruction: pushq REG
- Restore: movq (%rsp), REG; addq $8, %rsp
- Or, in one instruction: popq REG
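The save and restore sequences can be modeled with a small array standing in for stack memory. This is a toy sketch: the struct, its size, and the function names are all made up, there is no bounds checking, and rsp is a byte offset into the array, not a real address.

```c
#include <stdint.h>

enum { STACK_BYTES = 64 };             // size of the model stack (arbitrary)

struct machine {
    uint64_t stack[STACK_BYTES / 8];   // stack memory, one slot per 8 bytes
    uint64_t rsp;                      // byte offset of the current stack top
};

// An empty stack starts with rsp at the high end, because stacks grow down.
void machine_init(struct machine* m) {
    m->rsp = STACK_BYTES;
}

// Models `subq $8, %rsp; movq REG, (%rsp)`, i.e. pushq REG.
void pushq(struct machine* m, uint64_t reg) {
    m->rsp -= 8;
    m->stack[m->rsp / 8] = reg;
}

// Models `movq (%rsp), REG; addq $8, %rsp`, i.e. popq REG.
uint64_t popq(struct machine* m) {
    uint64_t reg = m->stack[m->rsp / 8];
    m->rsp += 8;
    return reg;
}

// Save a register, reuse it for something else, then restore it.
uint64_t save_clobber_restore(uint64_t reg) {
    struct machine m;
    machine_init(&m);
    pushq(&m, reg);      // save
    reg = 0;             // the register is overwritten in the meantime
    reg = popq(&m);      // restore
    return reg;
}
```

Whatever value goes in comes back out, and rsp ends where it started, which is the property the calling convention relies on.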
Calling convention and save/restore
- A function must restore certain registers to their entry-time values before returning
- Example: %rsp, %rip, but more too
- Caller-saved registers
- If the caller (the “outer”, or calling, function) cares about the value, it must save that value and restore it after the callee returns
- Most registers
- Callee-saved registers
- If the callee (the “inner”, or called, function) uses the value, it must save that value first and restore it before returning
- %rbx, %r12–%r15, %rbp
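Here is a sketch of how the two categories interact when a value must survive a function call. The C function needs n after calling twice, so the compiler has to keep n somewhere the callee will not destroy. The function names are hypothetical. The assembly in the comment assumes the System V x86-64 convention (first argument in %rdi, return value in %rax) and shows one plausible translation, not guaranteed compiler output.

```c
// Hypothetical callee; it is free to overwrite any caller-saved register.
long twice(long x) {
    return 2 * x;
}

// `n` is needed after the call to twice(), so it must survive the call.
// A compiler might keep it in the callee-saved register %rbx:
//     pushq %rbx          # save the caller's %rbx before using it
//     movq  %rdi, %rbx    # keep n in a register twice() must preserve
//     callq twice         # n is already in %rdi as the argument
//     addq  %rbx, %rax    # %rax = twice(n) + n
//     popq  %rbx          # restore %rbx before returning
//     retq
long n_plus_twice(long n) {
    return twice(n) + n;
}
```

In this sketch n_plus_twice acts as a callee with respect to %rbx (it saves and restores it) and as a caller with respect to %rdi (it cannot assume %rdi still holds n after the call).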