In today’s section, we’ll walk through a selection of our assembly
exercises, which are problems taken from prior exams. Also
take the opportunity to ask questions about the problem set and class
material.
Simultaneously enrolled college students are responsible for attending
section live (in-person or via zoom). If you cannot attend any section
live in a particular week, you may watch a recording (Canvas > Zoom > Cloud
Recordings) instead. Please self-report your live attendance (or that you watched
a recording) using this attendance form.
All exercises assume a 64-bit x86-64 architecture unless otherwise stated.
The main subjects of these exercises are:
- ASM-1: Assembly properties.
- ASM-6: Data structure assembly.
- ASM-16: Calling convention.
- ASM-23: Get-set interfaces.
ASM-1. Assembly properties
Here’s some assembly produced by compiling a C program. (The instructions are
labeled with byte offsets for your convenience.)
.globl f
.align 16, 0x90
.type f,@function
f:
0: movl $1, %r8d
6: jmp b <f+0xb>
8: incl %r8d
b: movl %r8d, %ecx
e: imull %ecx, %ecx
11: movl $1, %edx
16: movl %edx, %edi
18: imull %edi, %edi
1b: movl $1, %esi
20: movl %esi, %eax
22: imull %eax, %eax
25: addl %edi, %eax
27: cmpl %ecx, %eax
29: je 40 <f+0x40>
2b: cmpl %edx, %esi
2d: leal 1(%rsi), %eax
30: movl %eax, %esi
32: jl 20 <f+0x20>
34: cmpl %r8d, %edx
37: leal 1(%rdx), %eax
3a: movl %eax, %edx
3c: jl 16 <f+0x16>
3e: jmp 8 <f+0x8>
40: pushq %rax
41: movl $.L.str, %edi
46: xorl %eax, %eax
48: callq printf
4d: movl $1, %eax
52: popq %rcx
53: retq
.type .L.str,@object
.L.str:
.asciz "%d %d\n"
.size .L.str, 7
QUESTION ASM-1A. How many arguments might this function take? List all
that apply.
- 0
- 1
- 2
- 3 or more
All are valid answers! The function does not use any parameters, but
it might take parameters it doesn’t use.
We know that the function doesn’t use its parameters because none of the
parameter registers (%rdi
, %rsi
, %rdx
, %rcx
, %r9
, %r10
) are used
before they are defined, and the function doesn’t access its caller’s stack
space either (as the function might if it took arguments that were passed on
the stack). (Although %rdi
, %rsi
, %rdx
, and %rcx
are used in the
function—for instance, at offset b
—their entry-time values are not used:
each register is assigned to some value on its first use.)
QUESTION ASM-1B. What might this function return? List all that
apply.
- 0
- 1
- −1
- Its first argument, whatever that argument is
- A square number other than 0 or 1
- None of the above
It can only return 1. The only ret
instruction in the function is its last
instruction (at offset 53
), and the immediately preceding assignment to
%rax
sets it to the constant 1 (at offset 4d
).
QUESTION ASM-1C. Which callee-saved registers does this function
save and restore? List all that apply.
%rax
%rbx
%rcx
%rdx
%rbp
%rsi
%rdi
- None of the above
The callee-saved registers are %rbx
, %rbp
, %rsp
, and %r12
-%r15
.
The code does not modify any of these registers, so it doesn’t need to “save
and restore” them either—and it does not.
QUESTION ASM-1D. This function handles signed integers. If we
changed the C source to use unsigned integers instead, which
instructions would change? List all that apply.
movl
imull
addl
cmpl
je
jl
popq
- None of the above
jl
. The jl
instruction uses signed less-than; for unsigned less-than,
jb
would be used.
We accepted imull
or not! Although x86 imull
is signed, as used in C it
behaves equivalently to the nominally-unsigned mull
, and some compilers
use imull
for both kinds of integer. From the Intel manuals:
“[These] forms [of imul
] may also be used with unsigned operands
because the lower half of the product is the same regardless if the
operands are signed or unsigned. The CF and OF flags, however, cannot
be used to determine if the upper half of the result is non-zero.”
QUESTION ASM-1E. What might this function print? List all that
apply.
0 0
1 1
3 4
4 5
6 8
- None of the above
Choice #3 (3 4
) only. The function searches for a solution to
x
2
+ y
2
== z
2
, under the
constraint that x ≤ y
. When it finds one, it prints x
and y
and
then returns. It always starts from 1 1
and increments x
and y
one
at a time, so it can only print 3 4
.
ASM-6. Data structure assembly
Here are four assembly functions, f1
through f4
.
f1:
testl %esi, %esi
jle LBB0_3
incl %esi
LBB0_2:
movq 8(%rdi), %rdi
decl %esi
cmpl $1, %esi
jg LBB0_2
LBB0_3:
movl (%rdi), %eax
retq
f2:
pushq %rbp
movq %rsp, %rbp
movslq %esi, %rax
movq (%rdi,%rax,8), %rcx
movl (%rcx,%rax,4), %eax
popq %rbp
retq
f3:
testl %esi, %esi
jle LBB2_3
incl %esi
LBB2_2:
movl %edx, %eax
andl $1, %eax
movq 8(%rdi,%rax,8), %rdi
sarl %edx
decl %esi
cmpl $1, %esi
jg LBB2_2
LBB2_3:
movl (%rdi), %eax
retq
f4:
movslq %esi, %rax
movl (%rdi,%rax,4), %eax
retq
QUESTION ASM-6A. Each function returns a value loaded from some data
structure. Which function uses which data structure?
- Array
- Array of pointers to arrays
- Linked list
- Binary tree
The f4
function uses an array. f2
uses an array of pointers to arrays.
f1
uses a linked list, and f3
uses a binary tree.
The best way to answer a question like this is probably not to figure
out from first principles what each function corresponds to, but rather to
make an educated guess and confirm it with evidence.
For example, you might guess that the shortest function corresponds to the
simplest data structure, and the longest function to the most complex data
structure. That would give you f4
=array, f3
=binary tree—which is
correct! But making a guess isn’t enough; you should test the guess.
Does f4
make sense as dealing with an array? Yes! The (%rdi,%rax,4)
address format is classic for arrays; it corresponds to a load of the 4-byte
value (because of the destination register, %eax
) stored at %rdi + %rax * 4
. Since %rax
is initialized from (the least-significant byte of) %rsi
,
this function might have been written as:
int f4(int* array, signed char idx) {
return array[idx];
}
Does f3
make sense as dealing with a binary tree? Yes! First off, binary
trees are recursive data structures, and the function contains a loop.
Second, consider these instructions following LBB2_2
:
andl $1, %eax
movq 8(%rdi,%rax,8), %rdi
The first instruction computes %eax := %eax & 1
; after it, %rax
can be
only 0 or 1. The second instruction uses %rax
as an index into an array.
An array whose index always 0 or 1 is likely to be an array of two elements,
such as the children of a binary tree node!
More confirmation is available if you want it. The movq
instruction above
loads 8 bytes of memory from a block starting near %rdi
. Importantly, it
loads that memory into %rdi
itself. Given the rest of the loop, we can
tell that the %rdi
register appears to point to a recursive data structure
where each “node” contains at least 2 pointers to child “nodes”—a binary
tree.
struct treenode {
int value;
treenode* child[2];
};
int f3(treenode* n, int count, int path) {
while (count > 0) {
n = n->child[path & 1];
path >>= 1;
--count;
}
return n->value;
}
What about f1
and f2
? Using similar arguments, you can see that f1
is
the linked list (it involves a recursive data structure) and f2
the array
of pointers to arrays (its code features two array dereferences).
QUESTION ASM-6B. The array data structure is an array of type T.
Considering the code for the function that manipulates the array, which
of the following types are likely possibilities for T? List all that
apply.
char
int
unsigned long
unsigned long long
char*
- None of the above
All of the functions return 4-byte values, so only int
works.
ASM-16. Calling convention
The following questions concern valid C++ functions compiled using the
normal x86-64 calling convention. True or false?
QUESTION ASM-16A. If the function’s instructions do not save and restore
any registers, then the C++ function did not call any other function.
False for two reasons: (1) The compiler might determine that none of
function f
’s registers are needed after it calls another function, g
.
For instance, maybe it returns the result of calling g
with no further
computation. (2) Optimizations such as inlining (embedding one function’s
instructions inside another’s) and tail call elimination (transforming
recursive function calls into loops) can make function calls disappear from
generated assembly.
QUESTION ASM-16B. If the function’s instructions do not change the stack
pointer, then the function’s instructions do not contain a call
instruction.
This is true because of stack alignment. Recall that on function entry, the
stack pointer register’s value must be 8 bytes off a multiple of 16. Working
backwards, we can see that at the time of a call
instruction, the
stack pointer must equal a multiple of 16. (If it did not, then the
entry-time value would be wrong.) So, consider function f
that calls
another function g
. At entry to f
, the stack pointer was 8 bytes off a
multiple of 16; but just before the call g
instruction, the stack pointer
equalled a multiple of 16. Thus, if f
contains a callq
instruction,
it must also change the stack pointer first executing that instruction.
QUESTION ASM-16C. If the function’s instructions do not change the stack
pointer, then the C++ function did not call any other function. Explain
your answer briefly.
As in ASM-16A, this question is false because of tail call elimination and
other optimizations.
QUESTION ASM-16D. If the function’s instructions do not modify the %rax
register, then the C++ function must return void
.
False; the function could return the result of calling another function.
QUESTION ASM-16E. If the function’s instructions store a local variable
on the stack, then that variable’s address will be less than the
function’s initial stack pointer.
True because stacks grow down.
ASM-23. Get–set interfaces
The following questions concern get–set interfaces. These are
map-like data structures which support two operations:
void set(container, key, value)
Change container’s value for key to value.
value_type get(container, key)
Returns container’s current value for key. This is the most recent
value set by set(container, key, value) or, if there is no
previous set, the natural default for the value type
(0, nullptr
, empty string—whatever’s appropriate).
Here’s the assembly definition of xxx1
, which is either get or
set for a data structure.
_Z4xxx1...:
movq (%rdi,%rsi,8), %rax
ret
QUESTION ASM-23A. Is this get or set?
It is get. The function returns a value (it loads something into %rax
before returning), and does not modify memory. Both these observations make
sense for get
, but not for set
.
QUESTION ASM-23B. What kind of data structure is this?
It is an array, because the only memory instruction is an array dereference
into an array of 8-byte integers (or equivalent structures).
QUESTION ASM-23C. What can you tell about the key type for this data structure?
Keys to this data structure are integers, most likely unsigned integers,
probably 4 or 8 bytes.
We say “likely unsigned” because it would be weird for the array argument
(%rdi
, the first parameter register) to accept negative indexes (via
%rsi
, the second parameter register), but it is possible to design a
container interface allowing negative indexes.
QUESTION ASM-23D. What can you tell about the value type?
They take 8 bytes to represent, because a full 8 bytes are loaded from the
array data structure into %rax
before the function returns.
QUESTION ASM-23E. Write a C++ version of the other operation for the
same data structure, including a function signature and any type definitions
required.
void set(uintptr_t* a, uintptr_t k, uintptr_t v) {
a[k] = v;
}
Here’s the assembly definition of get2
, which is get for a different
data structure.
_Z4get2...:
.L28:
testq %rdi, %rdi
je .L30
movq (%rdi), %rax
cmpq %rsi, %rax
je .L35
cmpq %rax, %rsi
jnb .L27
movq 16(%rdi), %rdi
jmp .L28
.L27:
movq 24(%rdi), %rdi
jmp .L28
.L30:
xorl %eax, %eax
ret
.L35:
movq 8(%rdi), %rax
ret
QUESTION ASM-23F. What value is returned if the key parameter is not
present in the data structure?
Zero (or a null pointer).
There are two return paths, one after .L30
and one after .L35
. The
.L30
path returns zero (it clears %rax
via the xorl %eax, %eax
trick);
the .L35
path returns a value loaded from memory (8(%rdi)
). Given
get’s specification, the .L35
path represents a value whose key is
present in the data structure, and the other path must represent the default
value.
QUESTION ASM-23G. Write a struct
definition for the data structure being
accessed by this code.
The first argument to get2
is the container type, so let’s consider the
properties of the memory stored starting at %rdi
.
- The eight-byte value at
(%rdi)
is compared against %rsi
, the second
argument, which is a key. We can therefore assume the first member of
the data structure is an eight-byte key.
- The eight-byte value at
8(%rdi)
is returned in one return path. Since
get functions return values, we can assume the second member of the
data structure is an eight-byte value.
- The eight-byte value at
16(%rdi)
is loaded into %rdi
in one remaining
path, while the eight-byte value at 24(%rdi)
is loaded into %rdi
in
the other. In each case, the code loops back to .L28
, the start of the
function, which must continue searching for the key. The third and fourth
members of the data structure must be pointers to additional nodes in this
data structure, which must be recursive.
This gives us:
struct container_node {
unsigned long key; // or other eight-byte type
unsigned long value;
container_node* left;
container_node* right;
};
QUESTION ASM-23H. What kind of data structure is being accessed by this code?
A binary tree.