Ancient CS 61 Content Warning!!!!!1!!!
This is not the current version of the class.
This site was automatically translated from a wiki. The translation may have introduced mistakes (and the content might have been wrong to begin with).

Assembly 2

Compilers and linkers

The process of changing C code into executable instructions has multiple stages, and involves many file formats (some with subtle differences) and many interesting programs working together.

Overall, the process works like this:

Input file     | Translator | Output file                                          | Purpose
.c             | Compiler   | .s                                                   | Generate optimized instructions for functions
.s             | Assembler  | .o                                                   | Turn instructions into bytes and relocation entries
.o (+ .a, .so) | Linker     | executable file (no suffix on Unix, .exe on Windows) | Combine functions, resolve addresses
executable     | Loader     | running program                                      | Load program into memory

In more detail:

Often intermediate stages in this process are hidden. For example, the compiler usually runs the assembler automatically; to generate an assembly file rather than an object file, you pass the compiler the -S option.

Examining objects

We can distinguish different kinds of assembly by looking at their formats. For example, here is a file x.c:

#include <stdio.h>
int my_global = 2;
int main(void) {
    return my_global;
}

Here’s part of the corresponding x.s file, generated by the compiler (clang -O2 -S x.c):

main:                                   # @main
        .cfi_startproc
# BB#0:
        movl    my_global(%rip), %eax
        retq

Note that this file has no addresses at all: the assembler will assign temporary addresses and generate relocation entries.

We can assemble the file into an object file (clang -O2 -c x.s) and then disassemble it to examine the corresponding instructions. Here’s what objdump -d reports:

0000000000000000 <main>:
   0:  8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <main+0x6>
   6:  c3                      retq   

Now there are addresses associated with each instruction (e.g., 0: and 6:), but those addresses are placeholders—when we actually create an executable, main will not be located at address 0. Also, note that the address of my_global has totally disappeared. The mov instruction accesses address 0x0(%rip). The 0x0 is a placeholder that will be filled in later by the linker. We can examine the corresponding relocation entries with objdump -r, which reports:

RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE 
0000000000000002 R_X86_64_PC32     my_global-0x0000000000000004

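This tells the linker to compute my_global - 0x4 - P, where P is the final address of the relocated field (offset 2 into this object’s text section), and to store the result at that offset, that is, into the last four bytes of the mov instruction. In the object file, where main sits at the placeholder address 0, P is simply 2.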

We can then run the linker (clang -O2 x.o -o x). Here’s part of what objdump -d reports for the resulting executable x:

00000000004004e0 <main>:
  4004e0:  8b 05 4a 0b 20 00       mov    0x200b4a(%rip),%eax        # 601030 <my_global>
  4004e6:  c3                      retq   

A final address has been assigned to main and another has been assigned to my_global. The instructions have been moved to the correct place, so we see their correct addresses. And the correct offset has been inserted into the instruction stream so that the mov instruction refers to the address of my_global.
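As a check on the arithmetic, the following sketch recomputes the displacement from the addresses objdump reported above. The function name pc32_displacement is hypothetical. The formula (symbol address plus addend, minus the address of the patched field) is the usual R_X86_64_PC32 calculation. This illustrates the computation only; it is not real linker code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a linker stores for an R_X86_64_PC32
   relocation. `symbol` is the final address of the symbol, `addend` is
   the constant from the relocation record, and `place` is the final
   address of the 4-byte field being patched. */
int32_t pc32_displacement(uint64_t symbol, int64_t addend, uint64_t place) {
    int64_t value = (int64_t) (symbol - place) + addend;
    /* The patched field is only 32 bits wide, so the value must fit. */
    assert(value >= INT32_MIN && value <= INT32_MAX);
    return (int32_t) value;
}
```

With main at 0x4004e0, the patched field sits at 0x4004e2 (offset 2 into main), so pc32_displacement(0x601030, -4, 0x4004e2) gives 0x200b4a, the value embedded in the linked mov instruction. The addend of -4 is there because %rip points just past the 4-byte field when the instruction executes.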

Calling convention

The calling convention is a set of rules that defines how functions interact. You can see the calling convention as a set of constraints. Different operating systems and architectures have different calling conventions, but all compilers for the same OS generally agree on a calling convention (this allows their object files to work together).

Every function must obey the calling convention. But how it obeys the calling convention can differ, depending on the function.

The basic conventions (for x86-64 Linux):

  • Return address: At function entry, the stack pointer %rsp points at the function’s return address.
  • Stack alignment: At function entry, the stack pointer must equal a multiple of 16 plus 8. That is, it must be 8 bytes off of 16-byte alignment. (Since the callq instruction modifies the stack by pushing the return address, this means that when callq is executed, %rsp must be truly 16-byte aligned.)
  • Parameters: At function entry, the function’s first 6 integer and pointer arguments are passed in registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order. 4-byte and smaller values use the lower 32 bits of the corresponding registers; the upper 32 bits can be anything and must be ignored. The 7th and further parameters are stored on the stack, starting immediately after the return address, with each parameter’s size rounded up to at least 8 bytes. Thus, in a function with 8 integer parameters, the 7th and 8th parameters are stored at 8(initial-%rsp) and 16(initial-%rsp), respectively.
  • Return value: At function exit, the function’s return value is in the %rax register. (4-byte and smaller values use the lower 32 bits, %eax; the upper 32 bits are ignored.)
  • Callee-saved registers: At function exit, the following registers must have the same values as they did at function entry: %rsp, %rbp, %rbx, %r12, %r13, %r14, %r15. The other registers can be arbitrarily modified.
  • Caller-saved registers: At function exit, caller-saved registers are not required to have the same values as they did at function entry. These registers are used to hold temporary values. If the caller wishes to preserve these values, they must be pushed onto the stack.
  • Stack usage: The function may add space to the stack for its own use; the initial %rsp marks a boundary between available space, which has smaller addresses than the initial %rsp, and space reserved for the caller, which has larger addresses than the initial %rsp. The function must not access or modify caller-reserved space (larger addresses than the initial %rsp), with two exceptions: A function may access or modify its stack parameters (as when it has more than 6 arguments), and it may access or modify objects whose addresses are publicly visible (as when its caller passes it a pointer to a local variable). The function may reserve additional space by changing the current %rsp (e.g., by executing a push or subq $56, %rsp), and it may use as scratch space the 128 bytes just below the current %rsp, that is, at smaller addresses (the “red zone”; e.g., by storing a temporary at -8(%rsp)).
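The parameter and alignment rules above can be modeled in a few lines of C. This is a minimal sketch with hypothetical helper names (int_arg_location, entry_rsp_is_valid); it assumes every argument is an integer or pointer of at most 8 bytes, and it only describes where arguments live rather than inspecting a real call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: describe where the `index`-th (1-based) integer
   or pointer argument lives at function entry on x86-64 Linux.
   Writes the description into `buf` and returns `buf`. */
const char *int_arg_location(int index, char *buf, size_t size) {
    static const char *const regs[6] = {
        "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
    };
    assert(index >= 1);
    if (index <= 6) {
        snprintf(buf, size, "%s", regs[index - 1]);
    } else {
        /* Stack arguments start just above the 8-byte return address
           and occupy one 8-byte slot each (assumed: none is larger). */
        snprintf(buf, size, "%d(initial-%%rsp)", 8 * (index - 6));
    }
    return buf;
}

/* At function entry, %rsp must be 8 bytes off 16-byte alignment. */
int entry_rsp_is_valid(unsigned long rsp) {
    return rsp % 16 == 8;
}
```

For example, int_arg_location(7, buf, sizeof buf) produces "8(initial-%rsp)", matching the rule for the 7th parameter, and entry_rsp_is_valid rejects any entry %rsp that is itself a multiple of 16.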

These conventions have some consequences. For example, if a function may modify %rbp, it will save the initial %rbp at function entry and restore it at function exit, often with instructions like:

pushq %rbp
...
popq %rbp
retq

A large function might run through the following stages.

  1. At entry, the function will pushq %rbp.
  2. Then it will push any other callee-saved registers it uses.
  3. Then it will allocate any additional required stack space with subq $N, %rsp.
  4. Inside the function, local variables are referenced with names such as 8(%rsp). The offset is positive because %rsp points at the top of the stack, so it has the smallest address. However, simple functions may use scratch space for local variables (e.g., -8(%rsp)).
  5. At exit, the function will un-allocate its stack space with addq $N, %rsp.
  6. Then it will pop any callee-saved registers pushed earlier, in reverse order.
  7. Then it will popq %rbp. At this point the initial %rsp has been restored.
  8. Then it will execute retq, which returns.

But these stages aren’t strictly required; only the conventions are required. So if a function doesn’t call another function, or doesn’t have any local variables, it may not execute subq $N, %rsp. If a function doesn’t modify %rbp, it may not push the original %rbp. And so forth.
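To see why the full sequence of stages leaves %rsp where it started, here is a toy model that tracks only the stack pointer through the prologue and epilogue. All names (toy_cpu, toy_push, toy_pop, toy_run_frame) are hypothetical, and nothing here executes real instructions; it is a sketch of the bookkeeping, assuming every push and pop moves %rsp by 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: only the value of %rsp is tracked. */
typedef struct { uint64_t rsp; } toy_cpu;

void toy_push(toy_cpu *cpu) { cpu->rsp -= 8; }  /* pushq: stack grows down */
void toy_pop(toy_cpu *cpu)  { cpu->rsp += 8; }  /* popq */

/* Run stages 1-7 with `saved` extra callee-saved registers and `n`
   bytes of local space. Returns %rsp just before retq; if `body_rsp`
   is non-NULL, also reports %rsp while the function body runs. */
uint64_t toy_run_frame(uint64_t entry_rsp, int saved, uint64_t n,
                       uint64_t *body_rsp) {
    toy_cpu cpu = { entry_rsp };
    assert(entry_rsp % 16 == 8);            /* entry alignment rule */
    toy_push(&cpu);                         /* 1. pushq %rbp */
    for (int i = 0; i < saved; i++) {
        toy_push(&cpu);                     /* 2. other callee-saved regs */
    }
    cpu.rsp -= n;                           /* 3. subq $N, %rsp */
    if (body_rsp != NULL) {
        *body_rsp = cpu.rsp;                /* 4. locals live at N(%rsp) */
    }
    cpu.rsp += n;                           /* 5. addq $N, %rsp */
    for (int i = 0; i < saved; i++) {
        toy_pop(&cpu);                      /* 6. pop in reverse order */
    }
    toy_pop(&cpu);                          /* 7. popq %rbp */
    return cpu.rsp;                         /* 8. retq sees initial %rsp */
}
```

Whatever values are chosen for saved and n, the returned %rsp equals the entry %rsp, because each stage in the epilogue undoes exactly one stage of the prologue.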

The full conventions go into far more detail, and explain how objects such as large structures are passed or returned. (Briefly, small structures, such as struct point { int x, y; }, are passed in one or more registers; large structures are passed on the stack.)
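As a small illustration of the structure case, the sketch below passes and returns struct point by value. The function point_add is a hypothetical example. The size check assumes 4-byte int, as on x86-64 Linux, and which registers actually carry the fields is decided by the compiler, not by anything visible in the C source.

```c
#include <assert.h>

/* Two 4-byte ints: 8 bytes total, small enough for one 64-bit register. */
struct point { int x, y; };

/* Assumption: int is 4 bytes, as on x86-64 Linux. */
_Static_assert(sizeof(struct point) == 8, "struct point fits in 8 bytes");

/* Pass two small structures by value and return a third by value.
   The source looks the same whether the compiler uses registers or
   the stack; only the calling convention determines which. */
struct point point_add(struct point a, struct point b) {
    struct point sum = { a.x + b.x, a.y + b.y };
    return sum;
}
```

Compiling this with the -S option mentioned earlier is one way to check how a particular compiler passes the two arguments.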
