- 1 Assembler: Learning to Read
- 2 Summing Up
- 3 FAQ
- 4 Solution Walkthrough
Assembler: Learning to Read
- Convert C programs into assembly
- Use objdump to examine the assembly underlying a C function/program (Note: objdump is a GNU tool, so you need to use clang to compile files if you want to use
- Read simple assembly
- Find function parameters in assembly
- Have a good cheat sheet that should help you defuse the bomb.
Pull today's exercise code from the cs61-exercises repository; we'll be working in the asm1x directory. We strongly encourage you to use the appliance today -- if you use your laptop, you are likely to get assembly code that looks quite different from what we expect, and the observations and questions we ask of you may be difficult to answer.
Developing your Assembly Language Cheat Sheet
Over the course of today's and Thursday's exercises, you will develop a cheat sheet, which you may find quite helpful as you work on defusing your bomb. While you work through the following, have a copy of the file cheatsheet.txt open so you can fill it in.
Registering registers in your brain
A lot of the actual computation that happens in your programs takes place in registers. Why? Because registers are fast! So, let's use GDB to help us remember the register layout of the x86-64 architecture.
In the cs61-exercises/asm1x directory, you'll find a simple program called main. Build it (using make main), then start gdb on your main program and set a breakpoint at main. (We are assuming that after completing assignment 1, you know how to do these things. If you do not, ask a table-mate or raise your hand -- you really want to know how to use gdb by now.) Now, run the program, and when you reach the breakpoint, display all the registers with the command info registers (info r for short). This is how you will display the contents of all your registers. For the most part, you will only be concerned with those that appear before rip, although you may find eflags useful. For today, we're concerned only with the general purpose registers, which are those that appear before
Fill in parts A1–A5 of your cheat sheet.
Next, let's examine the code at main by asking gdb to disassemble it for us: disas(semble). Just by reading the code, answer the following questions:
1. Are arguments to main passed the same way that they are into any other function?
2. What line led you to answer the previous question?
3. Where in the address space do you suppose that the function
Passing arguments to functions
Next, disassemble the function print_nargs. (There are two ways to do this; see if you can figure out either!) The mov instructions that you see in the code here do what you might expect: they move data into/out of registers (they also move data into/out of memory, but we'll get to that later). In our assembly syntax, they move from operand1 into operand2.
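To make the operand order concrete, here is a minimal sketch. The function forty_two is hypothetical and not part of the exercise files, and the assembly in the comment is what an optimizing compiler on x86-64 Linux will typically emit, so treat it as an assumption and compare it with your own output.

```c
/* Hypothetical example, not part of asm1x. */
int forty_two(void) {
    /* Typical optimized output (an assumption; your compiler may differ):
     *     movl $42, %eax    # operand1 is the source (the constant 42),
     *                       # operand2 is the destination (register %eax)
     *     ret
     */
    return 42;
}
```

The instruction you will meet in print_nargs has the same shape: a constant on the left, a register on the right.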
You'll notice that the code does not call a function named printf, but instead calls something named __printf_chk. How many arguments do you think it takes in this case? If you cannot remember how arguments are passed, we've provided the same files that we used in lecture to help you with this. They are in
arg7.c. Build the assembly for these (just type make). Now, use the assembly produced to fill in questions B.1-B.7 on your cheat sheet. You may find it helpful to keep your cheat sheet handy during the next several classes so you can finish filling it out.
4. How many arguments are being passed to
Let's look at those arguments in a bit more detail. Look at the 3rd instruction of the function. It should look like this:
mov $0x400628, %esi
That first value looks suspiciously like an address -- how can we find out what's at that address? We use the x (examine) command to look at memory by address:
(gdb) x/40c 0x400628
(The c says that we want to print out the contents of the address as characters; you can print things out in lots of different ways; check out the documentation for details.)
5. What data is stored at the address referenced in that 3rd instruction?
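As a rough picture of what examining memory as characters means, the sketch below walks a run of bytes and renders each one as a character. The helper bytes_as_chars is hypothetical and only approximates gdb's display (gdb also prints each byte's numeric value); it is here to show that an address is just the start of a sequence of bytes that you choose how to interpret.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes starting at addr into out as text,
 * replacing anything unprintable with '.'.
 * out must have room for n + 1 bytes. */
void bytes_as_chars(const void *addr, size_t n, char *out) {
    const unsigned char *p = addr;
    for (size_t i = 0; i < n; i++) {
        out[i] = isprint(p[i]) ? (char) p[i] : '.';
    }
    out[n] = '\0';   /* terminate so the result is a C string */
}
```

In the gdb command above, 40 plays the role of n and the address plays the role of addr.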
C types in Assembly
In the file sub.c, you'll find a simple function that takes two arguments, arg1 and arg2, and computes arg1 - arg2.
See how many different functions you can write that produce exactly the same assembly code. Spend no more than five minutes on this. Experiment with different parameter types and different ways of instructing the compiler to produce code that subtracts two numbers. (If you create files named things like sub2.c, etc., then typing make will produce assembly for them.)
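As a starting point for the experiment, here is a sketch of three candidates. The names and parameter types are assumptions (the actual sub.c is not reproduced here), and whether any pair really compiles to identical instructions is exactly what you should check in the generated assembly.

```c
/* Hypothetical variants to compare; names and types are assumptions. */

/* Plain subtraction. */
long sub_direct(long arg1, long arg2) {
    return arg1 - arg2;
}

/* Same arithmetic written as addition of a negated value. */
long sub_negate(long arg1, long arg2) {
    return arg1 + (-arg2);
}

/* Same width, different signedness. */
unsigned long sub_unsigned(unsigned long arg1, unsigned long arg2) {
    return arg1 - arg2;
}
```

Put each variant in its own file and compare the outputs side by side; then try narrower types and see what changes.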
Clever Compilers and Multiplication Operations
Take a look at the file
6. Based on what you've seen so far in class, predict what assembly code will be generated for this function.
Now, make the
7. What is the new instruction you see in the code? Can you figure out what it's doing? If you get stuck, Google the instruction and x86-64 or assembly and see what you can find.
Once you've figured out how this program is working, predict what will happen if you change the number 16 in mul.c to 128. Produce the assembly code and see if you are right.
Next, change the type in mul.c from long to int. Guess what the assembly should look like, and then check your guess.
8. How did the assembly change when you changed the type in C?
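Once you have your own answers, the sketch below is one way to sanity-check them. It rests on the arithmetic fact that 16 is 2^4, so multiplying by 16 and shifting left by 4 bits give the same result. The function names are hypothetical, and unsigned long is used here (unlike the long in mul.c) because left-shifting a negative signed value is undefined in C.

```c
/* Hypothetical helpers, not taken from mul.c. */

/* Multiply by 16 the way the C source asks for it. */
unsigned long times16_mul(unsigned long x) {
    return x * 16;
}

/* Multiply by 16 as a shift: 16 == 2^4, so shift left by 4 bits. */
unsigned long times16_shift(unsigned long x) {
    return x << 4;
}
```

If your compiler produced the same instructions for both, that suggests how it treats multiplication by a power of two; repeat the comparison with 128 to test your prediction.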
Check your knowledge of Assembly
For each of the files mystery1.S, mystery2.S, mystery3.S, and mystery4.S, see if you can write C code to generate identical assembly! If you encounter instructions whose meaning you do not understand, try googling! Mystery4 uses several features that we have not yet gone over, so if you don't understand it, don't worry, but if you get that far, have some fun and see if you can figure it out.
Summing Up
- You can read assembly for simple programs!
- You know how assembly instructions express arithmetic, logical, and shift operators
- You know how arguments are passed in assembly language
FAQ
Q1: Why are some of the functions bracketed by:
subq $8, %rsp ... addq $8, %rsp
A1: As you may recall from last Thursday's lecture, local space for functions is allocated on the stack. The subtraction provides 8 bytes of space for the function to use and the addition returns that stack space. In the code examples you examined, this space was necessary when the function called another function, because the call uses the stack to store the address to which the called function should return. We will go into this in much more depth next Tuesday.
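To see the difference for yourself, compare a function that calls nothing with one that calls another function. Both functions below are hypothetical, and whether the bracketing instructions appear depends on your compiler and flags, so treat the comments as expectations to verify rather than guaranteed output.

```c
#include <stdio.h>

/* Leaf function: it calls nothing, so a compiler typically emits no
 * stack adjustment for it. */
int add_one(int x) {
    return x + 1;
}

/* Non-leaf function: it calls puts, and it is around calls like this
 * that you may see subq $8, %rsp ... addq $8, %rsp. */
int announce(const char *msg) {
    puts(msg);
    return 0;
}
```

Generate assembly for both and look for the subq/addq pair in each.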
Solution Walkthrough
The last three slides cover the mystery functions, which are probably the most useful!