Section 4: Assembly
Goals
- Gain comfort with x86 assembly
- Learn how to compute the same result in different ways
Background
Most of what you need to know for this section has already been discussed in lecture. You can find information on registers and their functions, addressing modes, and control flow instructions in Thursday's lecture notes.
You can find a more detailed list of common instructions and their syntax here.
Assignment
Pull the latest version of the sections repository. If you haven't cloned it yet, open terminal and enter
git clone <https://github.com/cs61/cs61-sections.git>
The cs61-sections/s04
directory contains a series of simple C
programs, which you can compile down to .s files by calling make
.
Your task is to write functionally equivalent programs in x86 assembly
to the default output that the compiler generates.
For example, consider the function that adds two 32 bit signed integers
(note, we've seen this in class already), which is in add.c
:
int add(int a, int b) {
return a + b;
}
In assembly, this could be written as:
add:
leal (%rdi, %rsi), %eax
ret
and in fact this is what the compiler generates, in add-1.s
.
However, a naive implementation might be:
add:
addl %edi,%esi
movq %rsi,%rax
retq
as in the hand-written add-2.s
, and both functions have the same
effect.
A third source file, add-3.s
, is not yet equivalent, but you can
make it so by inserting a single instruction. You can test whether your
change was successful with test-add-3
, which prints an ERROR if your
output doesn't match that of the baseline implementation. The included
makefile will generate a different test program for each .s file you
edit.
Once you've written 1 more version of add, we recommend you tackle the remaining functions by writing two more versions of each in the following order:
- udiv
- max
- factorial
- traverse
- swap
- bzero
Try to be creative with your modifications! While it's true (and important to understand why it's true) that you can often rename registers or add instructions with no side effects to retain the same program behavior, we're looking for less trivial changes. Here are some ideas on different ways to transform the assembly source:
- Remove redundancy from generated assembly: when the compiler does't
optimize, it fails to remove redundant assembly instructions (for
example, multiple
mov
instructions).
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
could be rewritten as
movq %rdi, %rax
- Use more specialized instructions that can have the same behavior in
certain contexts: we saw this with
leal
andaddl
above! - Use logical equivalences to "rephrase" a calculation or comparator: this is the strategy for add-3.s!
- Use different addressing modes to access the same memory location with different syntax
- Rewrite a recursive function in an iterative form. For example, a
function that recursively adds 1 to its argument in
%edi
until 10 could look like this in assembly:
cmpl $9, %edi
jg .L2
addl $1, %edi
call recurse
.L2:
movl %edi, %eax
leave
ret
and could appear as a loop adding one while the value is less than 10:
movl %edi, %eax
.LOOP:
cmpl $10, %eax
jge .L2
addl $1, %eax
jmp .LOOP
.L2
leave
ret