This is not the current version of the class.

Assembly Section 1: Fun

In this section we’re going to have fun.

Simultaneously enrolled college students are responsible for attending section live (in person or via Zoom). If you cannot attend any section live in a particular week, you may watch a recording (Canvas > Zoom > Cloud Recordings) instead. Please self-report your live attendance (or that you watched a recording) using this attendance form.

Update your section repository, cd asms1, and make. This will build a number of fun programs.

Setup

Let’s run one:

$ ./fun01
😿😿😿😿😿😿😿😿 no fun 😿😿😿😿😿😿😿😿

That wasn’t fun!

These programs are puzzles. Look at fundriver.cc for the ground rules. The driver’s main function first creates a single C string that contains all program arguments, separated by spaces. It then calls the fun function, passing in that string. The fun function returns an integer; if fun(str) returns 0, then the driver has fun, and if it returns anything else, no fun is had (the function no_fun() is called, which prints the no fun message).
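The ground rules above can be sketched in C++. This is only a sketch under stated assumptions, not the contents of fundriver.cc: the helper names join_args and has_fun are hypothetical, fun is assumed to take a const char* and return an int, and the arguments are assumed to start at argv[1] (the program name is left out).

```cpp
#include <string>

// Hypothetical helper: join argv[1..argc-1] into one string, separated by spaces.
// Assumption: the program name (argv[0]) is not part of the string.
std::string join_args(int argc, char** argv) {
    std::string s;
    for (int i = 1; i < argc; ++i) {
        if (i > 1) {
            s += ' ';
        }
        s += argv[i];
    }
    return s;
}

// Hypothetical helper: the driver "has fun" exactly when fun(str) returns 0.
// A real driver would print the fun message in that case and call no_fun() otherwise.
bool has_fun(const std::string& args, int (*fun)(const char*)) {
    return fun(args.c_str()) == 0;
}
```

With this sketch, running a program as ./fun01 yay! ok would hand the string "yay! ok" to fun.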

We want to have fun. How can we have fun? Might as well see what the function does! (Open fun01.cc)

Looks like this fun function will return 0 if and only if the arguments contain an exclamation point. Let’s test that:

$ ./fun01 !
πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŒ½πŸŒ½πŸŒ½πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰
                 FUN
πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŒ½πŸŒ½πŸŒ½πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰
$ ./fun01 'yay!'
πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŒ½πŸŒ½πŸŒ½πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰
                 FUN
πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŒ½πŸŒ½πŸŒ½πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰
$ ./fun01 'amazing!!!!!!!!!!!!!!!!!'
πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŒ½πŸŒ½πŸŒ½πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰
                 FUN
πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŒ½πŸŒ½πŸŒ½πŸŽŠπŸŽŠπŸŽŠπŸŽŠπŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰
$ ./fun01 'amazing?'
😿😿😿😿😿😿😿😿 no fun 😿😿😿😿😿😿😿😿
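The four runs above are consistent with a function like the following. This is a minimal sketch, not the actual contents of fun01.cc, which may be written differently; the name fun_sketch is hypothetical, and the nonzero return value 1 is an arbitrary choice (the driver only distinguishes 0 from everything else).

```cpp
#include <cstring>

// Sketch only: return 0 ("fun") if str contains an exclamation point,
// and a nonzero value ("no fun") otherwise.
int fun_sketch(const char* str) {
    return std::strchr(str, '!') != nullptr ? 0 : 1;
}
```

Under this sketch, "yay!" and "amazing!!!!!!!!!!!!!!!!!" both produce 0, while "amazing?" produces a nonzero value.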

GDB

The idea of not having fun is deeply painful. So is there any way that you could prevent the no_fun() function from running? That you could stop the program if it reached no_fun()?

This is what a debugger breakpoint is for. A debugger is a program that manages the execution of another program. It lets you run a program, stop it, and examine variables, registers, and the contents of memory. Among the most powerful debugger features is the ability to stop a program if it ever reaches a particular instruction. This is called “setting a breakpoint”: the breakpoint marks a location that, when reached, “breaks” the program and returns control to the debugger.

How would you stop the program from executing no_fun()?

$ gdb fun01
(gdb) b no_fun

Now if we run the program with non-fun arguments

(gdb) r

we will stop before printing “no fun”!

If you’re not careful, though, it’s possible to accidentally step through and print the message. You can do this one step at a time (demo r, followed by several s commands); or you can do it by continuing the program by accident (demo r followed by c).

What if you wanted to make this kind of accident wicked unlikely? Well, you could set more breakpoints!

(gdb) b no_fun
(gdb) r
Breakpoint 1, no_fun () at fundriver.cc:15
15      std::cerr << "😿😿😿😿😿😿😿😿 no fun 😿😿😿😿😿😿😿😿\n";
(gdb) x/20i $pc
=> 0x4000001305 <main(int, char**)+389>:  lea    0xd5e(%rip),%rsi        # 0x400000206a
   0x400000130c <main(int, char**)+396>:  lea    0x2e4d(%rip),%rdi        # 0x4000004160 <_ZSt4cerr@GLIBCXX_3.4>
   0x4000001313 <main(int, char**)+403>:  call   0x4000001130
   0x4000001318 <main(int, char**)+408>:  mov    $0x1,%edi
   0x400000131d <main(int, char**)+413>:  call   0x4000001150
   0x4000001322 <main(int, char**)+418>:  endbr64 
   0x4000001326 <main(int, char**)+422>:  mov    %rax,%rbx

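We’ve stopped at the first instruction in the no_fun function. (GDB labels these addresses main+389 and so on, which suggests the compiler inlined no_fun into main.) But there’s nothing stopping us from setting more breakpoints! For example, at the second instruction and the third: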

(gdb) b *0x400000130c
Breakpoint 2 at 0x400000130c: file fundriver.cc, line 15.
(gdb) b *0x4000001313
Breakpoint 3 at 0x4000001313: file fundriver.cc, line 15.

But what if you forgot these breakpoints after starting GDB??? That’s a good case for .gdbinit, a file of GDB commands that runs every time you start GDB.

GDB cheatsheet

Here begins a quick overview of interesting GDB commands. The commands are linked to their descriptions in the GDB manual, which also describes many more amazing commands.

Execution commands

Command           Description
run (r)           Execute the file passed as a command-line argument to gdb
                  You can supply arguments to r; if none, uses the last set passed
                  M1 Mac users will enter a different command
break (b)         Pause execution when a particular point in the code is reached
                  Examples: break FILENAME:LINE, break FUNCTION, break FILENAME:FUNCTION
watch             Pause when the value of an expression changes
continue (c)      Run until the next breakpoint
step (s)          Steps to the next line of code (enters function calls)
next (n)          Steps to the next line of code (steps over function calls)
stepi (si)        Steps to the next instruction (enters function calls)
nexti (ni)        Steps to the next instruction (steps over function calls)
finish            Runs until the current function returns
advance LINE      Runs until a given line of code is reached
info breakpoints  List breakpoints
delete N (d)      Delete a breakpoint by number
kill (k)          Kill the currently-running program
quit (q)          Quit GDB

Examination commands

Command              Description
x ADDR (examine)     Examine memory at a given address
                     Examples: x/dw $rax (print in decimal format [d] the 4-byte int [w] at the address in %rax);
                     x/10xg $rsi (print in hex [x] the 10 unsigned longs [g], 8 bytes each, starting at the address in %rsi);
                     x/10i $rip (print the 10 instructions [i] starting at %rip)
print EXPR (p)       Print the value of a register or C++ expression
display EXPR (disp)  Like print, but prints each time a step is taken
disassemble (disas)  Output assembly instructions
                     Examples: disas FUNCTION, disas ADDR1,ADDR2, disas ADDR,+LENGTH
list (l)             Show source code around the current instruction pointer
backtrace (bt)       Print the call stack
frame NUM (f)        Examine the context of frame number NUM (so you can see caller variables, for example)
up                   Move up to the caller frame (note that u abbreviates until, not up)
down (do)            Move down to the callee frame (note that d abbreviates delete)
thread N             Change thread context in a multithreaded program
info registers       Show registers
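The size letters used with x can be illustrated with a short C++ sketch. The helper names read_w and read_g are hypothetical, and the byte order in the usage note assumes a little-endian machine such as x86-64; this only mimics how the bytes are grouped and is not how GDB itself is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the 4-byte int starting at addr, the value x/dw would show.
int32_t read_w(const void* addr) {
    int32_t v;
    std::memcpy(&v, addr, sizeof v);  // memcpy sidesteps alignment and aliasing issues
    return v;
}

// Hypothetical helper: the i-th 8-byte unit starting at addr, as in x/Nxg.
uint64_t read_g(const void* addr, std::size_t i) {
    uint64_t v;
    std::memcpy(&v, static_cast<const unsigned char*>(addr) + 8 * i, sizeof v);
    return v;
}
```

For example, on a little-endian machine the eight bytes 2a 00 00 00 ff ff ff ff read as the 4-byte ints 42 and -1, or as the single 8-byte value 0xffffffff0000002a.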

Control commands

Command                            Description
tui enable                         Enable the TUI, which shows code and control in separate “panels”
layout next                        Change the TUI layout. Also try layout help
Ctrl-X                             Control the TUI. Ctrl-X 1 shows two panels, Ctrl-X 2 shows three, Ctrl-X o moves focus
Ctrl-L                             Refresh the screen (use if things look janky)
set confirm off                    Stop warning about killing programs
add-auto-load-safe-path DIRECTORY  Put in your ~/.gdbinit file; tells GDB to read the DIRECTORY/.gdbinit file if it starts in DIRECTORY

gdb cheatsheet: http://darkdust.net/files/GDB%20Cheat%20Sheet.pdf

Many more GDB commands exist! Time spent learning a debugger is time well spent. On modern GDBs you can even run code backwards.

Other programs

The LLDB debugger is better supported on macOS than GDB. Most GDB commands work on LLDB as well.
lldb cheatsheet: https://lldb.llvm.org/lldb-gdb.html

The objdump program is useful for printing out properties of an executable. objdump -t prints out the program’s symbol table, which includes the names of all functions and global variables in the executable, the names of all the functions the executable calls, and their addresses (though addresses may change when the executable is run). objdump -d and objdump -S disassemble all the code in an executable.

More fun

Now let’s work through a couple more funs. We’ll try to understand the operation of the funs using GDB and assembly, though for the first 6 funs, the C++ is there if you get stuck.

ASSEMBLY IS HARD. And trying to understand assembly from first principles, without running it, is really hard! As with many aspects of systems, you will have more luck with an approach motivated by experimental science. Try and guess at an input that will work, using cues from the assembly. Develop a hypothesis and test it. For the bomb pset, you don’t need to fully understand the assembly, you just need to find an input that passes each phase. (That said, you will often end up understanding the assembly, but only after completing the phase with the help of experiments.)

It is also often effective to alternate between working top down, starting from the entry to a function, and bottom up, starting at the return statement. Working from the bottom up, you can eliminate error paths and trace through how the desired result is calculated. Working from the top down, you can develop hypotheses about how the input should look. As long as you have breakpoints set, you can experiment with a free and easy heart. (And if the bomb goes off, who really cares?)