 # Assembly exercises

Exercises not as directly relevant to this year’s class are marked with ⚠️.

## ASM-1. Disassemble

Here’s some assembly produced by compiling a C program.

        .globl  f
.align  16, 0x90
.type   f,@function
f:
movl    $1, %r8d jmp .LBB0_1 .LBB0_6: incl %r8d .LBB0_1: movl %r8d, %ecx imull %ecx, %ecx movl$1, %edx
.LBB0_2:
movl    %edx, %edi
imull   %edi, %edi
movl    1, %esi .align 16, 0x90 .LBB0_3: movl %esi, %eax imull %eax, %eax addl %edi, %eax cmpl %ecx, %eax je .LBB0_7 cmpl %edx, %esi leal 1(%rsi), %eax movl %eax, %esi jl .LBB0_3 cmpl %r8d, %edx leal 1(%rdx), %eax movl %eax, %edx jl .LBB0_2 jmp .LBB0_6 .LBB0_7: pushq %rax .Ltmp0: movl.L.str, %edi
xorl    %eax, %eax
callq   printf
movl    $1, %eax popq %rcx retq .type .L.str,@object .L.str: .asciz "%d %d\n" .size .L.str, 7 QUESTION ASM-1A. How many arguments might this function have? Circle all that apply. 1. 0 2. 1 3. 2 4. 3 or more QUESTION ASM-1B. What might this function return? Circle all that apply. 1. 0 2. 1 3. −1 4. Its first argument, whatever that argument is 5. A square number other than 0 or 1 6. None of the above QUESTION ASM-1C. Which callee-saved registers does this function save and restore? Circle all that apply. 1. %rax 2. %rbx 3. %rcx 4. %rdx 5. %rbp 6. %rsi 7. %rdi 8. None of the above QUESTION ASM-1D. This function handles signed integers. If we changed the C source to use unsigned integers instead, which instructions would change? Circle all that apply. 1. movl 2. imull 3. addl 4. cmpl 5. je 6. jl 7. popq 8. None of the above QUESTION ASM-1E. What might this function print? Circle all that apply. 1. 0 0 2. 1 1 3. 3 4 4. 4 5 5. 6 8 6. None of the above ## ASM-2. Assembly Here is some x86 assembly code. f: movl a, %eax movl b, %edx andl$255, %edx
subl %edx, %eax
movl %eax, a
retq

QUESTION ASM-2A. Write valid C code that could have compiled into this assembly (i.e., write a C definition of function f), given the global variable declarations “extern unsigned a, b;.” Your C code should compile without warnings. REMINDER: You are not permitted to run a C compiler, except for the C compiler that is your brain.

QUESTION ASM-2B. Write different valid, warning-free C code that could have compiled into that assembly. This version should contain different operators than your first version. (For extra credit, use only one operator.)

QUESTION ASM-2C. Again, write different valid, warning-free C code that could have compiled into that assembly. In this version, f should have a different type than in your first version.

## ASM-3. Assembly and Data Structures

For each code sample below, indicate the most likely type of the data being accessed. (If multiple types are equally likely, just pick one.)

QUESTION ASM-3A. movzbl %al, %eax

QUESTION ASM-3B. movl -28(%rbp), %edx

QUESTION ASM-3C. movsbl -32(%rbp), %eax

QUESTION ASM-3D. movzwl -30(%rbp), %eax

For each code sample below, indicate the most likely data structure being accessed (assume that g_var is a global variable). Be as specific as possible.

QUESTION ASM-3E. movzwl 6(%rdx,%rax,8), %eax

QUESTION ASM-3F. movl (%rdx,%rax,4), %eax

QUESTION ASM-3G.

movzbl 4(%rax), %eax
movsbl %al, %eax

For the remaining questions, indicate for what values of the register contents will the jump be taken.

QUESTION ASM-3H.

xorl %eax, %eax
jge LABEL

QUESTION ASM-3I.

testb $1, %eax jne LABEL QUESTION ASM-3J. cmpl %edx, %eax jl LABEL ## ASM-4. Assembly language The next four questions pertain to the following four code samples. f1 f1: subq$8, %rsp
call    callfunc
movl    %eax, %edx
leal    1(%rax,%rax,2), %eax
testb   $1, %dl jne .L3 movl %edx, %eax shrl$31, %eax
sarl    %eax
.L3:
addq    $8, %rsp ret f2 f2: pushq %rbx xorl %ebx, %ebx .L3: movl %ebx, %edi addl$1, %ebx
call    callfunc
cmpl    $10, %ebx jne .L3 popq %rbx ret f3 f3: subq$8, %rsp
call    callfunc
subl    $97, %eax cmpb$4, %al
ja      .L2
movzbl  %al, %eax
jmp     *.L4(,%rax,8)
.L4:
.L3:
movl    $42, %edx jmp .L5 .L6: movl$4096, %edx
jmp     .L5
.L7:
movl    $52, %edx jmp .L5 .L8: movl$6440, %edx
jmp     .L5
.L2:
movl    $0, %edx jmp .L5 .L9: movl$61, %edx
.L5:
movl    $.LC0, %esi movl$1, %edi
movl    $0, %eax call __printf_chk addq$8, %rsp
ret
.LC0:
.string "Sum = %d\n"

f4

f4:
subq    $40, %rsp movl$1, (%rsp)
movl    $0, 16(%rsp) .L2: leaq 16(%rsp), %rsi movq %rsp, %rdi call callfunc movl 16(%rsp), %eax cmpl %eax, (%rsp) jne .L2 addq$40, %rsp
ret

Now answer the following questions. Pick the most likely sample; you will use each sample exactly once.

QUESTION ASM-4A. Which sample contains a for loop?

QUESTION ASM-4B. Which sample contains a switch statement?

QUESTION ASM-4C. Which sample contains only an if/else construct?

QUESTION ASM-4D. Which sample contains a while loop?

## ASM-5. Calling conventions: 6186

University Professor Helen Vendler is designing a poetic new processor, the 6186. Can you reverse-engineer some aspects of the 6186’s calling convention from its assembly?

Here’s a function:

int f(int* a, unsigned b) {
extern int g(int x);
int index = g(a[2*b + 1]);
return a[index + b];
}


And here’s that function compiled into 6186 instructions.

f:
sub $24, %rsp movq %ra, (%rsp) mov %rb, %rx shl$1, %rx
add $1, %rx movl (%ra, %rx, 4), %ra call g add %rb, %rr movq (%rsp), %ra movl (%ra, %rr, 4), %ra mov %ra, %rr add$24, %rsp
ret

6186 assembly syntax is based on x86-64 assembly, and like the x86-64, 6186 registers are 64 bits wide. However, the 6186 has a different set of registers. There are just five general-purpose registers, %ra, %rb, %rr, %rx, and %ry. (“[W]hen she tries to be deadly serious she is speaking under…constraint”.) The example also features the stack pointer register, %rsp.

Give brief explanations if unsure.

QUESTION ASM-5A. Which register holds function return values?

QUESTION ASM-5B. What is sizeof(int) on the 6186?

QUESTION ASM-5C. Which general-purpose register(s) must be callee-saved?

QUESTION ASM-5D. Which general-purpose register(s) must be caller-saved?

QUESTION ASM-5E. Which general-purpose register(s) might be callee-saved or caller-saved (you can’t tell which)?

QUESTION ASM-5F. Assuming the compiler makes function stack frames as small as possible given the calling convention, what is the alignment of stack frames?

QUESTION ASM-5G. Assuming that the 6186 supports the same addressing modes as the x86-64, write a single instruction that has the same effect on %ra as these three instructions:

shl $1, %rx add$1, %rx
movl (%ra,%rx,4), %ra

## ASM-6. Data structure assembly

Here are four assembly functions, f1 through f4.

f1:
pushq   %rbp
movq    %rsp, %rbp
testl   %esi, %esi
jle     LBB0_3
incl    %esi
LBB0_2:
movq    8(%rdi), %rdi
decl    %esi
cmpl    $1, %esi jg LBB0_2 LBB0_3: movl (%rdi), %eax popq %rbp retq f2: pushq %rbp movq %rsp, %rbp movslq %esi, %rax movq (%rdi,%rax,8), %rcx movl (%rcx,%rax,4), %eax popq %rbp retq f3: testl %esi, %esi jle LBB2_3 incl %esi LBB2_2: movl %edx, %eax andl$1, %eax
movq    8(%rdi,%rax,8), %rdi
sarl    %edx
decl    %esi
cmpl    $1, %esi jg LBB2_2 LBB2_3: movl (%rdi), %eax retq f4: movslq %esi, %rax movl (%rdi,%rax,4), %eax retq QUESTION ASM-6A. Each function returns a value loaded from some data structure. Which function uses which data structure? 1. Array 2. Array of pointers to arrays 3. Linked list 4. Binary tree QUESTION ASM-6B. The array data structure is an array of type T. Considering the code for the function that manipulates the array, which of the following types are likely possibilities for T? Circle all that apply. 1. char 2. int 3. unsigned long 4. unsigned long long 5. char* 6. None of the above ## ASM-7. Where’s Waldo? In the following questions, we give you C code and a portion of the assembly generated by some compiler for that code. (Sometimes we blank out a part of the assembly.) The C code contains a variable, constant, or function called waldo, and a point in the assembly is marked with asterisks ***. Your job is to find Waldo: write an assembly expression or constant that holds the value of waldo at the marked point. We’ve done the first one for you. NON-QUESTION: Where’s Waldo? int identity(int waldo) { return waldo; }  00000000004007f6 <identity>: 4007f6: 55 push %rbp 4007f7: 48 89 e5 mov %rsp,%rbp 4007fa: 89 7d fc mov %edi,-0x4(%rbp) 4007fd: 8b 45 fc mov -0x4(%rbp),%eax *** 400800: 5d pop %rbp 400801: c3 retq ANSWER: %edi, -0x4(%rbp), %eax, and %rax all hold the value of waldo at the marked point, so any of them is a valid answer. If the asterisks came before the first instruction, only %edi would work. QUESTION ASM-7A: Where’s Waldo? int f1(int a, int b, int waldo, int d) { if (a > b) { return waldo; } else { return d; } }  0000000000400802 <f1>: *** 400802: 55 push %rbp 400803: 48 89 e5 mov %rsp,%rbp 400806: 89 7d fc mov %edi,-0x4(%rbp) 400809: 89 75 f8 mov %esi,-0x8(%rbp) 40080c: 89 55 f4 mov %edx,-0xc(%rbp) 40080f: 89 4d f0 mov %ecx,-0x10(%rbp) 400812: 8b 45 fc mov -0x4(%rbp),%eax 400815: 3b 45 f8 cmp -0x8(%rbp),%eax 400818: 7e 05 jle 40081f <f1+0x1d> 40081a: 8b 45 f4 mov -0xc(%rbp),%eax 40081d: eb 03 jmp 400822 <f1+0x20> 40081f: 8b 45 f0 mov -0x10(%rbp),%eax 400822: 5d pop %rbp 400823: c3 retq QUESTION ASM-7B: Where’s Waldo? int int_array_get(int* a, int waldo) { int x = a[waldo]; return x; }  00000000004007d9 <int_array_get>: INSTRUCTIONS OMITTED *** 4007dc: 8b 04 b7 mov (%rdi,%rsi,4),%eax 4007df: c3 retq QUESTION ASM-7C: Where’s Waldo? int matrix_get(int** matrix, int row, int col) { int* waldo = matrix[row]; return waldo[col]; }  00000000004007e0 <matrix_get>: 4007e0: 48 63 f6 movslq %esi,%rsi 4007e3: 48 63 d2 movslq %edx,%rdx *** 4007e6: ?? ?? ?? ?? mov ??,%rax 4007ea: 8b 04 90 mov (%rax,%rdx,4),%eax 4007ed: c3 retq QUESTION ASM-7D: Where’s Waldo? int f5(int x) { extern int waldo(int); return waldo(x * 45); }  0000000000400be0 <f5>: *** 400be0: 6b ff 2d imul$0x2d,%edi,%edi
400be3:       eb eb                   jmp    400bd0

QUESTION ASM-7E: Where’s Waldo?

int factorial(int waldo) {
if (waldo < 2) {
return 1;
} else {
return waldo * factorial(waldo - 1);
}
}

0000000000400910 <factorial>:
400910:       83 ff 01                cmp    $0x1,%edi 400913: b8 01 00 00 00 mov$0x1,%eax
400918:       7e 13                   jle    .L2 <factorial+0x1b>
40091a:       [6 bytes of padding (a no-op instruction)]
.L1:          ***
400920:       0f af c7                imul   %edi,%eax
400923:       83 ef 01                sub    $0x1,%edi 400926: 83 ff 01 cmp$0x1,%edi
400929:       75 f5                   jne    .L1 <factorial+0x10>
.L2: 40092b:       f3 c3                   repz retq

QUESTION ASM-7F: Where’s Waldo?

⚠️ This question currently uses 32-bit assembly.

int binary_search(const char* needle, const char** haystack, unsigned sz) {
unsigned waldo = 0, r = sz;
while (waldo < r) {
unsigned m = waldo + ((r - waldo) >> 1);
if (strcmp(needle, haystack[m]) < 0) {
r = m;
} else if (strcmp(needle, haystack[m]) == 0) {
waldo = r = m;
} else {
waldo = m + 1;
}
}
return waldo;
}

80484ab <binary_search>:
INSTRUCTIONS OMITTED
.L1: 80484c3:       89 fe                   mov    %edi,%esi
80484c5:       29 de                   sub    %ebx,%esi
80484c7:       d1 ee                   shr    %esi
80484cb:       8b 44 b5 00             mov    0x0(%ebp,%esi,4),%eax
80484cf:       89 44 24 04             mov    %eax,0x4(%esp)
80484d3:       8b 44 24 30             mov    0x30(%esp),%eax
80484d7:       89 04 24                mov    %eax,(%esp)
80484da:       e8 11 fe ff ff          call   80482f0 <strcmp@plt>
80484df:       85 c0                   test   %eax,%eax
80484e1:       78 09                   js     .L2 <binary_search+0x41>
80484e3:       85 c0                   test   %eax,%eax
80484e5:       74 13                   je     80484fa <binary_search+0x4f>
***
80484e7:       8d 5e 01                lea    0x1(%esi),%ebx
80484ea:       eb 02                   jmp    .L3 <binary_search+0x43>
.L2: 80484ec:       89 f7                   mov    %esi,%edi
.L3: 80484ee:       39 df                   cmp    %ebx,%edi
80484f0:       77 d1                   ja     .L1 <binary_search+0x18>
INSTRUCTIONS OMITTED

In the remaining questions, you are given assembly compiled from one of the above functions by a different compiler, or at a different optimization level. Your goal is to figure out what C code corresponds to the given assembly.

QUESTION ASM-7G:

⚠️ This question currently uses 32-bit assembly.

804851d <waldo>:
804851d:       55                      push   %ebp
804851e:       89 e5                   mov    %esp,%ebp
8048520:       83 ec 18                sub    $0x18,%esp 8048523: 83 7d 08 01 cmpl$0x1,0x8(%ebp)
8048527:       7f 07                   jg     8048530
8048529:       b8 01 00 00 00          mov    $0x1,%eax 804852e: eb 10 jmp 8048540 8048530: 8b 45 08 mov 0x8(%ebp),%eax 8048533: 48 dec %eax 8048534: 89 04 24 mov %eax,(%esp) 8048537: e8 e1 ff ff ff call 804851d 804853c: 0f af 45 08 imul 0x8(%ebp),%eax 8048540: c9 leave 8048541: c3 ret What’s Waldo? Circle one. 1. f1 2. f5 3. matrix_get 4. permutation_compare 5. factorial 6. binary_search QUESTION ASM-7H: ⚠️ This question currently uses 32-bit assembly. 8048425 <waldo>: 8048425: 55 push %ebp 8048426: 89 e5 mov %esp,%ebp 8048428: 8b 45 08 mov 0x8(%ebp),%eax 804842b: 3b 45 0c cmp 0xc(%ebp),%eax 804842e: 7e 05 jle 8048435 <waldo+0x10> 8048430: 8b 45 10 mov 0x10(%ebp),%eax 8048433: eb 03 jmp 8048438 <waldo+0x13> 8048435: 8b 45 14 mov 0x14(%ebp),%eax 8048438: 5d pop %ebp 8048439: c3 ret  What’s Waldo? Circle one. 1. f1 2. f5 3. matrix_get 4. permutation_compare 5. factorial 6. binary_search QUESTION ASM-7I: 00000000004008b4 <waldo>: 4008b4: 55 push %rbp 4008b5: 48 89 e5 mov %rsp,%rbp 4008b8: 48 83 ec 10 sub$0x10,%rsp
4008bc:       89 7d fc                mov    %edi,-0x4(%rbp)
4008bf:       8b 45 fc                mov    -0x4(%rbp),%eax
4008c2:       6b c0 2d                imul   $0x2d,%eax,%eax 4008c5: 89 c7 mov %eax,%edi 4008c7: e8 9e 05 00 00 callq 400e6a 4008cc: c9 leaveq 4008cd: c3 retq  What’s Waldo? Circle one. 1. f1 2. f5 3. matrix_get 4. permutation_compare 5. factorial 6. binary_search ## ASM-8. (removed because redundant) ## ASM-9. Disassembly II The questions in this section concern a function called ensmallen, which has the following assembly. ensmallen: 1. movzbl (%rsi), %edx 2. testb %dl, %dl 3. movb %dl, (%rdi) 4. jne .L22 5. jmp .L23 6. .L18: 7. addq$1, %rsi
8.  .L22:
9.          movzbl  (%rsi), %eax
10.          cmpb    %dl, %al
11.          je      .L18
12.          addq    $1, %rdi 13. testb %al, %al 14. movb %al, (%rdi) 15. je .L23 16. movl %eax, %edx 17. jmp .L22 18. .L23: 19. retq QUESTION ASM-9A. How many arguments is this function likely to take? Give line numbers that helped you determine an answer. QUESTION ASM-9B. Are the argument(s) pointers? Give line numbers that helped you determine an answer. QUESTION ASM-9C. What type(s) are the argument(s) likely to have? Give line numbers that helped you determine an answer. QUESTION ASM-9D. Write a likely signature for the function. Use return type void. QUESTION ASM-9E. Write an alternate likely signature for the function, different from your last answer. Again, use return type void. QUESTION ASM-9F. Which callee-saved registers does this function use? Give line numbers that helped you determine an answer. QUESTION ASM-9G. The function has an “input” and an “output”. Give an “input” that would cause the CPU to jump from line 5 to label .L23, and describe what is placed in the “output” for that “input”. QUESTION ASM-9H. Give an “input” for which the corresponding “output” is not a copy of the “input”. Your answer must differ from the previous answer. QUESTION ASM-9I. Write C code corresponding to this function. Make it as compact as you can. ## ASM-10. Machine programming Intel really messed up this time. They’ve released a processor, the Fartium Core Trio, where every instruction is broken except the ones on this list. 1. cmpq %rdi, %rsi 2. decq %rsi 3. incq %rax 4. je L1 5. jl L2 6. jmp L3 7. movl (%rdi,%rax,4), %edi 8. retq 9. xchgq %rax, %rcx 10. xorq %rax, %rax (In case you forgot, xchgq swaps two values—here, the values in two registers—without modifying condition codes.) “So what if it’s buggy,” says Andy Grove; “it can still run programs.” For instance, he argues convincingly that this function: void do_nothing() { }  is implemented correctly by this Fartium instruction sequence: retq Your job is to implement more complex functions using only Fartium instructions. Your implementations must have the same semantics as the C functions, but may perform much worse than one might expect. You may leave off arguments and write instruction numbers (#1–10) or instruction names. Indicate where labels L1–L3 point (if you need them). Assume that the Fartium Core Trio uses the normal x86-64 calling convention. QUESTION ASM-10A. int return_zero() { return 0; }  QUESTION ASM-10B. int identity(int a) { return a; }  QUESTION ASM-10C. void infinite_loop() { while (1) { /* do nothing */ } }  QUESTION ASM-10D. struct point { int x; int y; int z; }; int extract_z(point* p) { return p->z; }  So much for the easy ones. Now complete one out of the following parts, or more than one for extra credit. QUESTION ASM-10E. long add(long a, long b) { return a + b; }  QUESTION ASM-10F. int array_dereference(int* a, long i) { return a[i]; }  ## ASM-11. Program Layout For the following questions, select the part(s) of memory from the list below that best describes where you will find the object. 1. heap 2. stack 3. between the heap and the stack 4. in a read-only data segment 5. in a text segment starting at address 0x08048000 6. in a read/write data segment 7. in a register Assume the following code, compiled without optimization. #include <errno.h> #include <getopt.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> // The following is copied from stdio.h for your reference #define EOF (-1) 1. unsigned long fib(unsigned long n) { 2. if (n < 2) { 3. return n; 4. } 5. return fib(n - 1) + fib(n - 2); 6. } 7. 8. int main(int argc, char *argv[]) { 9. extern int optind; 10. char ch; 11. unsigned long f, n; 12. 13. // Command line processing 14. while ((ch = getopt(argc, argv, "h")) != EOF) { 15. switch (ch) { 16. case 'h': 17. case '?': 18. default: 19. return (usage()); 20. } 21. } 22. 23. argc -= optind; 24. argv += optind; 25. 26. if (argc != 1) { 27. return usage(); 28. } 29. 30. n = strtoul(strdup(argv), nullptr, 10); 31. if (n == 0 && errno == EINVAL) { 32. return usage(); 33. } 34. 35. /* Now call one of the fib routines. */ 36. f = fib(n); 37. printf("fib(%lu) = %lu\n", n, f); 38. 39. return 0; 40. }  QUESTION ASM-11A. The string "fib(%lu) = %lu\n" (line 37). QUESTION ASM-11B. optind (line 23). QUESTION ASM-11C. When executing at line 5, where you will find the address to which fib returns. QUESTION ASM-11D. Where will you find the value of EOF that is compared to the return value of getopt in line 14. QUESTION ASM-11E. getopt (line 14) QUESTION ASM-11F. fib (lines 1-6) QUESTION ASM-11G. the variable f (line 36) QUESTION ASM-11H. the string being passed to strtoul (line 30) QUESTION ASM-11I. strdup (line 30) QUESTION ASM-11J. The value of the fib function when we return from fib (line 5). ## ASM-12. Assembly and Data Structures Consider the following assembly function. func: xorl %eax, %eax cmpb$0, (%rdi)
je      .L27
.L26:
addq    $1, %rdi addl$1, %eax
cmpb    $0, (%rdi) jne .L26 .L27: retq QUESTION ASM-12A. How many parameters does this function appear to have? QUESTION ASM-12B. What do you suppose the type of that parameter is? QUESTION ASM-12C. Write C++ code that corresponds to it. ## ASM-13. Assembly language Consider the following four assembly functions. # Code Sample 1 movq %rdi, %rax testq %rdx, %rdx je .L2 addq %rdi, %rdx movq %rdi, %rcx .L3: addq$1, %rcx
movb    %sil, -1(%rcx)
cmpq    %rdx, %rcx
jne     .L3
.L2:
rep ret
# Code Sample 2
movq    %rdi, %rax
testq   %rdx, %rdx
je      .L2
movq    %rdi, %rcx
.L3:
addq    $1, %rcx addq$1, %rsi
movzbl  -1(%rsi), %r8d
movb    %r8b, -1(%rcx)
cmpq    %rdx, %rcx
jne     .L3
.L2:
rep ret
# Code Sample 3
movb    (%rsi), %al
testb   %al, %al
je      .L3
incq    %rsi
.L2:
movb    %al, (%rdi)
incq    %rdi
movb    (%rsi), %al
incq    %rsi
testb   %al, %al
jne     .L2
.L3:
movq    %rdi, %rax
ret
# Code Sample 4
testq   %rdx, %rdx
je      .L3
movq    %rdi, %rax
.L2:
movb    %sil, (%rax)
incq    %rax
decq    %rdx
jne     .L2
.L3:
movq    %rdi, %rax
ret

(Note: The %sil register is the lowest-order byte of register %rsi, just as %al is the lowest-order byte of %rax and %r8b is the lowest-order byte of %r8.)

QUESTION ASM-13A. Which two of the assembly functions perform the exact same task?

QUESTION ASM-13B. What is that task? You can describe it briefly, or give the name of the corresponding C library function.

QUESTION ASM-13C. Explain how the other two functions differ from each other.

## ASM-14. Golden Baron

A very rich person has designed a new x86-64-based computer, the Golden Baron Supercomputer 9000, and is paying you handsomely to write a C compiler for it. There’s just one problem. This person, like many very rich people, is dumb, and on their computer, odd-numbered memory addresses don’t work for data. When data is loaded into a general-purpose register from an odd-numbered address, the value read is zero. For example, consider the following instructions:

movl 0x01020304, a(%rip) movl a(%rip), %eax (where the address of a is even). Executed on true x86-64, %rax will hold the value 0x01020304; on Golden Baron, %rax will hold 0x00020004. But it is still possible to write a correct C compiler for this ungodly hardware—you just have to work around the bad memory with code. You plan to use two bytes of Golden Baron memory for every one byte of normal x86-64 memory. For instance, an array int a = {1, 0x0a0b0c0d}; would be stored in 16 bytes of memory, like so: 01 00 00 00 00 00 00 00 0d 00 0c 00 0b 00 0a 00 Pointer arithmetic, and moving multi-byte values to and from registers, must account for the zero bytes that alternate with meaningful bytes. So to read the correct value for a, the compiler must arrange to read the bytes at addresses A+8, A+10, A+12, and A+14, where A is the address of the first byte in a. QUESTION ASM-14A. What should printf("%zu\n", sizeof(char)) print on Golden Baron? QUESTION ASM-14B. This function int f(signed char* c, size_t i) { return c[i]; }  can compile to two instructions on x86-64, including retq. It can also compile to two instructions on Golden Baron. (We’re assuming that memory used for Golden Baron instructions works normally.) What are those instructions? QUESTION ASM-14C. This function int g(int* a, size_t i) { return a[i]; }  can compile to two instructions on x86-64, but Golden Baron requires more work. Write the Golden Baron translation of this function in x86-64 assembly language. For partial credit, write C code that, executed on x86-64, would return the correct value from a Golden Baron-formatted array. QUESTION ASM-14D. The Golden Baron’s x86-64 processor actually supports a secret instruction, swizzleq SRC, REG, which rearranges the nybbles (the hexadecimal digits—the aligned 4-bit slices) of the destination register REG based on the source argument SRC. Here’s some examples. Assuming that %rax holds the value 0x0123456789ABCDEF, the following swizzleq instructions leave the indicated results in %rax: • swizzleq0, %rax: %rax gets 0xFFFF'FFFF'FFFF'FFFF.

The contents of nybble 0 [bits 0-3, inclusive], are repeated into every nybble.

• swizzleq $0xFEDCBA9876543210, %rax: %rax gets0x0123'4567'89AB'CDEF. Each nybble is mapped to its current value: nybble 0 [bits 0-3] is placed in nybble 0 [bits 0-3], nybble 1 in nybble 1, and so forth. • swizzleq$0x0123456701234567, %rax: %rax gets 0xFEDC'BA98'FEDC'BA98.

Nybble 0 [bits 0-3] is placed in nybbles 7 and 15 [bits 28-31 and 60-63]; nybble 1 [bits 4-7] is placed in nybbles 6 and 14 [bits 24-27 and 56-59]; etc.

• swizzleq $0xEFCDAB8967452301, %rax: %rax gets 0x1032'5476'98BA'DCFE. The nybbles of each byte are exchanged. Use swizzleq to shorten your answer for Part C. ## ASM-15. Instruction behavior QUESTION ASM-15A. Name three different x86-64 instructions that always modify the stack pointer, no matter their arguments (instruction names only; suffixes don’t count, so movl and movq are the same instruction name). QUESTION ASM-15B. Name three different x86-64 instructions that sometimes modify the stack pointer, depending on their arguments. QUESTION ASM-15C. Name three different x86-64 instructions that never modify the stack pointer, no matter their arguments. QUESTION ASM-15D. List three different instructions, including arguments, that if placed immediately before a retq instruction that ends a function, will never change the function’s behavior. The instructions should have different names. No funny business: assume the function was compiled from valid C, that relative jumps are fixed up, and that, for example, it doesn’t access its own code. ## ASM-16. Calling convention The following questions concern valid C++ functions compiled using the normal x86-64 calling convention. True or false? QUESTION ASM-16A. If the function’s instructions do not save and restore any registers, then the C++ function did not call any other function. QUESTION ASM-16B. If the function’s instructions do not change the stack pointer, then the function’s instructions do not contain a call instruction. QUESTION ASM-16C. If the function’s instructions do not change the stack pointer, then the C++ function did not call any other function. Explain your answer briefly. QUESTION ASM-16D. If the function’s instructions do not modify the %rax register, then the C++ function must return void. QUESTION ASM-16E. If the function’s instructions store a local variable on the stack, then that variable’s address will be less than the function’s initial stack pointer. ## ASM-17. Assembly Here are three x86-64 assembly functions that were compiled from C. f1: xorl %eax, %eax L2: movsbq (%rdi), %rdx subq$48, %rdx
cmpq    $9, %rdx ja L5 imulq$10, %rax, %rax
incq    %rdi
jmp     L2
L5:
ret
f2:
movq    %rdi, %rax
L7:
cmpb    $0, (%rax) je L9 incq %rax jmp L7 L9: cmpq %rax, %rdi jnb L11 decq %rax movb (%rdi), %cl incq %rdi movb (%rax), %dl movb %cl, (%rax) movb %dl, -1(%rdi) jmp L9 L11: ret f3: xorl %eax, %eax L13: cmpq %rax, %rdx je L15 movb (%rdi,%rax), %cl movb (%rsi,%rax), %r8b movb %r8b, (%rdi,%rax) movb %cl, (%rsi,%rax) incq %rax jmp L13 L15: ret (Note: imulq$10, %rax, %rax means %rax *= 10.)

QUESTION ASM-17A. How many arguments does each function most likely take?

QUESTION ASM-17B. Which functions modify at least one caller-saved register? List all that apply or write “none”.

QUESTION ASM-17C. Which functions never modify memory? List all that apply or write “none”.

QUESTION ASM-17D. Write a signature for each function, giving a likely type for each argument and a likely return type. (You may give a void return type if you think the function doesn’t return a useful value.)

_______ f1(___________________________________)

_______ f2(___________________________________)

_______ f3(___________________________________)

QUESTION ASM-17E. One of these functions swaps the contents of two memory regions. Which one?

QUESTION ASM-17F. What is the value of %rax in f2 the first time L9 is reached? Write a C expression in terms of f2’s argument or arguments; you may use standard library functions.

QUESTION ASM-17G. Give arguments for each function that would result in the function returning without writing to memory or causing a fault.

f1(_____________________)

f2(_____________________)

f3(_____________________)

QUESTION ASM-17H. Complete this function so that it returns the number

1. For full credit, use only calls to f1, f2, and f3. For partial credit, do something simpler.
int magic() {
char s1[] = "Shaka kaSenzangakhona became King of the Zulu Kingdom in 1816";
char s2[] = "Dingane kaSenzangakhona succeeded Shaka in 1828";
char s3[] = "1661 in the Gregorian calendar is 3994 in the Korean calendar";

}


## ASM-18. Kernel assembly

Helen found the following assembly code while hacking a WeensyOS-like kernel. She would like to figure out what the function is doing.

__Z3vmlPmm:
pushq   %rbp
movq    %rsp, %rbp
movq    %rsi, %rax

shrq    $36, %rax andl$4088, %eax
movq    (%rdi,%rax), %rax
testb   $1, %al je error andq$-4096, %rax
movq    %rsi, %rcx
shrq    $27, %rcx andl$4088, %ecx
movq    (%rax,%rcx), %rax
testb   $1, %al je error andq$-4096, %rax
movq    %rsi, %rcx
shrq    $18, %rcx andl$4088, %ecx
movq    (%rax,%rcx), %rax
testb   $1, %al je error andq$-4096, %rax
movq    %rsi, %rcx
shrq    $9, %rcx andl$4088, %ecx
movq    (%rax,%rcx), %rax
testb   $1, %al je error andq$-4096, %rax
andl    $4095, %esi orq %rax, %rsi done: movq %rsi, %rax popq %rbp retq error: xorl %esi, %esi jmp done Hexadecimal reference: 4095 == 0xFFF; 4096 == 0x1000; -4096 == 0xFFFFFFFFFFFFF000; 4088 == 0xFF8. QUESTION ASM-18A. How many arguments does the function take, assuming that there are no unused arguments? QUESTION ASM-18B. What does this function return if it encounters a situation that leads to “error”? Helen realizes that the compiler performed some optimizations that made the code somewhat obscure. For example, these instructions contain constants not present in the C++ source:  shrq$36, %rax
andl    $4088, %eax movq (%rdi,%rax), %rax QUESTION ASM-18C. Which of the following snippets is equivalent to the snippet above? List all that apply. 1.  shrq$33, %rax
andl    $511, %eax movq (%rdi,%rax,8), %rax 2.  shrq$39, %rax
andl    $511, %eax movq (%rdi,%rax,8), %rax 3.  shrq$39, %rax
andl    $511, %eax movq (%rdi,%rax,4), %rax 4.  shrq$33, %rax
andl    $511, %eax movq (%rdi,%rax,4), %rax QUESTION ASM-18D. If rdi and rax were C++ variables corresponding to %rdi and %rax, what would be their likely types? QUESTION ASM-18E. Please write one line of C++ code, using variables rdi and rax, that could compile into the 3 lines of assembly code we are currently investigating. QUESTION ASM-18F. How many bits of this function’s second argument are used? QUESTION ASM-18G. Helen notes that the code contains four copies of a pattern of instructions with different constants. After connecting these constants with CS 61 concepts, she’s able to tell what the function does without deciphering every line of assembly. Use a single concise sentence to describe the high-level functionality of this function. ## ASM-19. Calling convention and optimizations QUESTION ASM-19A. Some aspects of the x86-64 calling convention are conventional, meaning that different operating systems could easily make different choices. Others are more fundamental in that they are built in to the x86-64 instruction set. Which of the following aspects of the calling convention are truly conventional (different operating systems could easily make different choices)? List all that apply. 1. The stack grows down 2. %rsp is a callee-saved register 3. %rbx is a callee-saved register 4. %rbp is a callee-saved register 5. The first argument register is %rdi 6. The second argument register is %rsi 7. The return register is %rax QUESTION ASM-19B. We saw the compiler sometimes place loop variables in callee-saved registers. For example, here: extern unsigned g(unsigned i); unsigned f(unsigned n) { unsigned sum = 0; for (unsigned i = 0; i != n; ++i) { sum += g(i); } return sum; }  the i, n, and sum objects were stored in callee-saved registers %rbx, %r12, and %rbp, respectively. Describe a situation when this choice has a meaningful chance of boosting the overall performance of f. QUESTION ASM-19C. Write two substantively different C++ functions that could compile to the same assembly on x86-64. (“Substantively different” means you can’t just change the function’s name and types.) QUESTION ASM-19D. Which of the following properties of a function could you reliably infer from compiler-generated assembly? List all that apply; assume the function’s name is not provided, and that the function does not call any other functions. Explain briefly if unsure. 1. The types of its arguments 2. Whether it can dereference a pointer argument 3. Whether it can use at least one of its arguments 4. Whether it has unused arguments ## ASM-20. Bignums x86-64 processors have built-in support for integers of up to 64 bits. For larger integers, we must turn to software. This simple “bignum” type represents a 128-bit unsigned integer. struct bignum { unsigned long x, y; }; void hex_print_bignum(bignum n) { if (n.y == 0) { printf("0x%lx", n.x); } else { printf("0x%lx%016lx", n.y, n.x); } }  If bignum bn represents 265, then hex_print_bignum(bn) will print 0x20000000000000000 (that’s a 2 followed by 16 zeros). QUESTION ASM-20A. Does bignum use a big-endian representation, a little-endian representation, or a mix? QUESTION ASM-20B. Complete the following function, which returns true iff a > b. bool bignum_greater(bignum a, bignum b) { }  QUESTION ASM-20C. Complete the following function, which returns the sum a + b. (You may use the fact that if x and y are unsigned integers using computer arithmetic, then the addition z = x + y overflows iff z < x || z < y.) bignum bignum_add(bignum a, bignum b) { }  QUESTION ASM-20D. Complete the same function in x86-64 assembly. Thanks to the calling convention, this function behaves as if the signature were as follows: bignum* bignum_add(bignum* ret, unsigned long a_x, unsigned long a_y, unsigned long b_x, unsigned long b_y); // Stores result in *ret and returns ret.  You may translate your C++ code directly, or you may use the x86-64 CF carry flag and associated instructions to simplify the translation. bignum_add:  ## ASM-21. Disassembly Consider the following function in assembly: _Z3gcdjj: movl %esi, %eax cmpl %esi, %edi jne .L3 jmp .L2 .L8: subl %eax, %edi cmpl %edi, %eax je .L2 .L3: cmpl %edi, %eax jb .L8 subl %edi, %eax cmpl %edi, %eax jne .L3 .L2: ret QUESTION ASM-21A. True or False: 1. The function likely takes 2 arguments. 2. The function accesses 4-byte integer values in primary memory. 3. The function likely works with signed integers. 4. There is likely a loop in the function. 5. The register %eax likely contains a loop index (a value updated by a fixed amount on each loop iteration). QUESTION ASM-21B. Give a input to this function such that it never returns. QUESTION ASM-21C. How would you modify the function, in assembly, such that it eventually returns for all inputs? QUESTION ASM-21D. Give one set of arguments for which the function terminates, and the corresponding return value. ## ASM-22. Buffer overflows _Z4buf1PKc: subq$40, %rsp
movq    %rdi, %rsi
movq    %rsp, %rdi
call    strcpy@PLT
movq    %rsp, %rax
addq    $40, %rsp ret QUESTION ASM-22A. Give argument(s) to function _Z4buf1PKc, including types and values, that would cause a buffer overflow that overwrote at least part of the function’s return address; or explain briefly why this is impossible. _Z4buf2m: subq$24, %rsp
movq    %rdi, %rdx
leaq    16(%rsp), %rdi
leaq    .LC0(%rip), %rsi
movl    $0, %eax call sprintf@PLT movl$0, %eax
.L2:
cmpb    $0, 16(%rsp,%rax) je .L3 incl %eax jmp .L2 .L3: addq$24, %rsp
ret
.LC0:
.string "%zu"

QUESTION ASM-22B. Which object in function _Z4buf2m is referenced using %rip-relative addressing?

QUESTION ASM-22C. Give argument(s) to this function, including types and values, that would cause a buffer overflow that overwrote at least part of the function’s return address; or explain briefly why this is impossible.

QUESTION ASM-22D. It is possible to change two lines of assembly in this function so that the revised function performs the same task, but never causes buffer overflow. Describe how.

## ASM-23. Get–set interfaces

The following questions concern get–set interfaces. These are map-like data structures which support two operations:

void set(container, key, value)
Change container’s value for key to value.

value_type get(container, key)
Returns container’s current value for key. This is the most recent value set by set(container, key, value) or, if there is no previous set, the natural default for the value type (0, nullptr, empty string—whatever’s appropriate).

Here’s the assembly definition of xxx1, which is either get or set for a data structure.

_Z4xxx1Pmm:
movq    (%rdi,%rsi,8), %rax
ret

QUESTION ASM-23A. Is this get or set?

QUESTION ASM-23B. What kind of data structure is this?

QUESTION ASM-23C. What can you tell about the type of key arguments for this data structure?

QUESTION ASM-23D. What can you tell about the type of value arguments?

QUESTION ASM-23E. Write a C++ version of the other operation for the same data structure, including a function signature and any type definitions required.

Here’s the assembly definition of xxx2, which is either get or set for a different data structure.

_Z4xxx2P4nodem:
.L28:
testq   %rdi, %rdi
je  .L30
movq    (%rdi), %rax
cmpq    %rsi, %rax
je  .L35
cmpq    %rax, %rsi
jnb .L27
movq    16(%rdi), %rdi
jmp .L28
.L27:
movq    24(%rdi), %rdi
jmp .L28
.L30:
xorl    %eax, %eax
ret
.L35:
movq    8(%rdi), %rax
ret

QUESTION ASM-23F. Is this get or set?

QUESTION ASM-23G. Write a C++ version of the other operation for the same data structure, including a function signature and any type definitions required. You will need to call malloc or new.