Miscellaneous exercises

Exercises not as directly relevant to this year’s class are marked with ⚠️.

MISC-1. Git

Edward Snowden is working on a CS61 problem set and he has some git questions.

QUESTION MISC-1A. The CS61 staff has released some new code. Which commands will help Edward get the code from code.seas.harvard.edu into his repository? Circle all that apply.

  1. git commit
  2. git add
  3. git push
  4. git pull

QUESTION MISC-1B. Edward has made some changes to his code. He hasn’t run git since making the changes. He wants to upload his latest version to code.seas.harvard.edu. Put the following git commands in an order that will accomplish this goal. You won’t necessarily use every command. You may add flags to a command (but you don’t have to). If you add flags, tell us what they are.

  1. git commit
  2. git add
  3. git push
  4. git pull

Edward Snowden’s partner, Edward Norton, has been working on the problem set also. They’ve been working independently.

At midnight on October 10, here’s how things stood. The git log for the partners’ shared code.seas.harvard.edu repository looked like this. The committer is listed in (parentheses).

52d44ee Pset release. (kohler)

The git log for Snowden’s local repository:

3246d07 Save Greenwald's phone number (snowden)
8633fbd Start work on a direct-mapped cache (snowden)
52d44ee Pset release. (kohler)

The git log for Norton’s local repository:

81f952e try mmap (norton)
52d44ee Pset release. (kohler)

At noon on October 11, their shared GitHub repository has this log:

d446e60 Increase cache size (snowden)
b677e85 use mmap on mmappable files (norton)
b46cfda Merge branch 'master' of code.seas.harvard.edu:~TheTrueHOOHA/cs61/TheTrueHOOHAs-cs61-psets.git 
        (norton)
81f952e try mmap (norton)
3246d07 Save Greenwald's phone number (snowden)
8633fbd Start work on a direct-mapped cache (snowden)
52d44ee Pset release. (kohler)

QUESTION MISC-1C. Give an order for these commands that could have produced that log starting from the midnight October 10 state. You might not use every command, and you might use some commands more than once. Sample (incorrect) answer: “1 4 4 5 2.”

  1. snowden: git commit -a
  2. snowden: git push
  3. snowden: git pull
  4. norton: git commit -a
  5. norton: git push
  6. norton: git pull

QUESTION MISC-1D. In your answer to Part C, circle the step(s) where there might have been a merge conflict.

MISC-2. Debugging

QUESTION MISC-2A. Match each tool or technique with a debugging situation for which it is well suited. Produce the best overall match that uses each situation exactly once.

1. strace A. Investigating segmentation faults
2. gdb B. Finding memory leaks
3. valgrind --tool=memcheck C. Checking your assumptions and verifying invariants
4. printf statements D. Discovering I/O patterns
5. assert E. Displaying program state

MISC-3. Pot Pourri

QUESTION MISC-3A. What does the following instruction place in %eax?

sarl $31, %eax

QUESTION MISC-3B. True/False: A direct-mapped cache with N slots can handle any reference string with < N distinct addresses with no misses except for compulsory misses.

QUESTION MISC-3C. What is 1 (binary) TB in hexadecimal?

QUESTION MISC-3D. Write the answer to the following in hexadecimal:

0xabcd + 12

QUESTION MISC-3E. True/False: The garbage collector we discussed is conservative, because it only runs when we tell it to.

QUESTION MISC-3F. True/False: Given the definition int array[10] the following two expressions mean the same thing: `&array[4] and array

QUESTION MISC-3G. Using the matrix multiply from lecture 12, in what order should you iterate over the indices i, j, and k to achieve the best performance.

QUESTION MISC-3H. True/False: fopen, fread, fwrite, and fclose are system calls.

QUESTION MISC-3I. Which do you expect to be faster on a modern Linux OS, insertion sorting into a linked list of 1000 elements or into an array of 1000 elements?

QUESTION MISC-3J. What does the hardware do differently when adding signed versus unsigned numbers?

MISC-4. Debugging

In the following short-answer questions, you have access to five debugging tools: top, strace, gdb, sanitizers, and man. You can’t change program source code or use other tools. Answer the questions briefly (a couple sentences at most).

QUESTION MISC-4A. You are given a program that appears to “get stuck” when run. How would you distinguish whether the program blocked forever (e.g., made a system call that never returned) or entered an infinite loop?

QUESTION MISC-4B. You are given a program that uses more memory while running than you expect. How would you tell whether the program leaks memory?

QUESTION MISC-4C. You are given a program that produces weird answers. How would you check if it invoked undefined behavior?

QUESTION MISC-4D. You are given a program that blocks forever. How would you tell where the program blocked (which function called the blocking system call)?

QUESTION MISC-4E. You are given a program that takes a long time to produce a result. How would you tell whether the program was using system calls unintelligently?

QUESTION MISC-4F. You are given a program that exits with a system call error, but doesn’t explain what happened in detail. How would you find what error condition occurred and understand the conditions that could cause that error?

MISC-5. Miscellany

QUESTION MISC-5A. True or false in conventional Unix systems?

  1. File descriptors are often used to communicate among processes on the same machine.
  2. File descriptors are often used to communicate among processes on different machines.
  3. File descriptors are often used to communicate with persistent storage.
  4. File descriptors are often used to access primary memory.
  5. File descriptors are often used to create child processes.

QUESTION MISC-5B. Match each numbered process isolation feature with the lettered hardware feature that helps enforce it. Use each hardware feature once (make the best match you can).

  1. Protected control transfer (processes can transfer control to the kernel only at defined entry points)
  2. Memory protection (one process cannot modify another process’s memory)
  3. Interrupt protection (only the kernel can disable interrupts)
  4. CPU protection (the kernel always regains control of the CPU eventually)
  1. Traps
  2. Privileged mode (dangerous instructions fault unless the CPU is in privileged mode)
  3. Timer interrupts
  4. Page tables

The remaining questions refer to the following lines of code.

1.  close(fd);
2.  connect(fd, sockaddr, socklen);
3.  listen(fd);
4.  mmap(nullptr, 4096, PROT_READ, MAP_SHARED, fd, 0);
5.  read(fd, buf, 4096);
6.  write(fd, buf, 4096);

QUESTION MISC-5C. If a program executes the following line without error, which of those lines could be executed next without error? List all numbers that apply.

fd = open("/home/cs61user/cs61-psets/pset6/pong61.c", O_RDWR);

QUESTION MISC-5D. If a program executes the following line without error, which lines could be executed next without error? List all numbers that apply.

fd = socket(AF_INET, SOCK_STREAM, 0);

QUESTION MISC-5E. If a program executes the following lines without error, which lines could be executed next without error? List all numbers that apply.

pipe(pipefd);
fd = pipefd[0];

MISC-6. More Miscellany

QUESTION MISC-6A. True or false: Any C++ arithmetic operation has a well-defined result.

QUESTION MISC-6B. True or false: Any x86-64 processor instruction has a well-defined result.

QUESTION MISC-6C. True or false: By executing a trap instruction, a process can force an operating system kernel to execute arbitrary code.

QUESTION MISC-6D. True or false: By manipulating process memory and registers, an operating system kernel can force a process to execute arbitrary instructions.

QUESTION MISC-6E. True or false: All signals are sent explicitly via the kill() system call.

QUESTION MISC-6F. True or false: An operating system’s buffer cache is generally fully associative.

QUESTION MISC-6G. True or false: The least-recently-used eviction policy is more useful for very large files that are read sequentially than it is for stacks.

QUESTION MISC-6H. True or false: Making a cache bigger can lower its hit rate for a given workload.

QUESTION MISC-6I. True or false: x86-64 processor caches are coherent (i.e., always appear to contain the most up-to-date values).

QUESTION MISC-6J. True or false: A socket file descriptor supports either reading or writing, but not both.

MISC-7. Pot Pourri

Parts A-D pertain to the data structures and hexdump output shown here.

struct x {
    unsigned long ul;
    unsigned short us;
    unsigned char uc;
} *sp;
// Hexdump output of some program running on the appliance
08c1b008  e9 11 cf d0 0d d0 3f f3  63 61 74 00 0d f0 fe ca  |......?.cat.....|
08c1b018  5e ea 15 0d de c0 ad de                           |^.......|

You are told that sp = 0x08c1b008.

QUESTION MISC-7A. What is the value (in hex) of sp->ul?

QUESTION MISC-7B. What is the value (in hex) of sp->uc?

QUESTION MISC-7C. At what address will you find the string "cat"?

QUESTION MISC-7D. If the bytes after the string "cat" comprise an array of 3 integers, what is the value (in hex) of the integer at index 1 of that array?

QUESTION MISC-7E. What is the following binary value expressed in hexadecimal: 01011010?

QUESTION MISC-7F. What is the value of the hex number 0x7FF in decimal?

QUESTION MISC-7G. Is 0x98765432 a valid return from malloc?

QUESTION MISC-7H. What is the minimum number of x86 instruction bytes you need to write an infinite loop?

QUESTION MISC-7I. True or False: Every declaration in C++ code allocates space for an object.

QUESTION MISC-7J. True or False: Processes cannot share physical memory in WeensyOS.


For parts K–O, assume we are running on the appliance and we initialize ival, p, and q as shown below. Write the value of the expression—you may express the values in hex if that's simpler, just be sure to prefix them with 0x to make it clear that you are doing so. For True/False questions, there is no need to correct or provide a counterexample for any statements that are false.

int ival[4] = {0x12345678, 0x9ABCDEF0, 0x13579BDF, 0x2468ACE0};
int* p = &ival[0];
int* q = &ival[3];
int* x = p + 1;
char* cp = (char*) (q - 2);

QUESTION MISC-7K. q - p

QUESTION MISC-7L. ((char*) q - (char*) p)

QUESTION MISC-7M. x - p

QUESTION MISC-7N. *((short*) ((char*) x + 2))

QUESTION MISC-7O. *cp


QUESTION MISC-7P. What system call allows you to block on a collection of file descriptors?

QUESTION MISC-7Q. What system call creates a communication channel that can only be used among related processes?

QUESTION MISC-7R. ⚠️ What system call can change the attributes of a file descriptor so you can poll on it rather than block?

QUESTION MISC-7S. What system call produces a file descriptor on which a server can exchange messages with a client?

QUESTION MISC-7T. True or False: A program and a process are the same thing.

MISC-8. CS61 in Real Life

QUESTION MISC-8A. The CS61 Staff have built a jet (the NightmareLiner) modeled on the Boeing Dreamliner. Unfortunately, they modeled it just a bit too closely on the Dreamliner, which needs to be rebooted periodically to avoid failure. In the case of the NightmareLiner, it needs to be rebooted approximately every 16 days. Your job is to use what you've learned in CS61 about data representation to hypothesize why.

Hint: There are 86,400,000 ms in a day. 86,400,000 is between 226 and 227.


Google recently discovered (and reported) a bug in the GNU libc implementation of getaddrinfo. This function can perform RPC calls, which involve sending and receiving messages. In some cases, getaddrinfo failed to check that a received message could fit inside a buffer variable located on the stack (2048 bytes).

QUESTION MISC-8B. True or false: This flaw means getaddrinfo will always execute undefined behavior.

QUESTION MISC-8C. Give an example of a message that will cause getaddrinfo to exhibit undefined behavior.

QUESTION MISC-8D. Briefly describe the contents of a message that would cause the getaddrinfo function to return to address 0x400012988 rather than to its caller.


QUESTION MISC-8E. This code used to appear in the Linux kernel:

1. struct tun_struct *tun = ...;   // This is a valid assignment;
2. struct sock *sk = tun->sk;
3. if (!tun)
4.    return POLLERR;              // This is an error return

The compiler removed lines 3 and 4. Why was that a valid thing for the compiler to do?

MISC-9. Miscellany

QUESTION MISC-9A. Name the property that implies a process cannot cause the kernel to execute code at an arbitrary address.

QUESTION MISC-9B. True or false: It’s safe to call any C library function from a signal handler.

QUESTION MISC-9C. True or false? Pointer arithmetic is only valid on pointers that point into arrays. Explain briefly.

QUESTION MISC-9D. True or false? All x86-64 fundamental data types have alignments that are powers of 2. Explain briefly.

QUESTION MISC-9E. True or false? Pointer arithmetic on char* pointers behaves the same way as address arithmetic (i.e., arithmetic on uintptr_t unsigned integers), including with respect to undefined behavior. Explain briefly.

QUESTION MISC-9F. True or false? System calls are just as fast as function calls. Explain briefly.

MISC-10. Listing All That Apply

Assume an underlying architecture of x86-64 for all questions.

QUESTION MISC-10A. Undefined behavior: list all that apply.

  1. Undefined behavior is a property of the C abstract machine.
  2. Undefined behavior is a property of the x86-64 architecture.
  3. Undefined behavior is dangerous.
  4. Unsigned overflow causes undefined behavior.
  5. None of the above.

QUESTION MISC-10B. Pointer arithmetic: list all that apply. Assume int* x = new int[10].

  1. sizeof(x) == 40.
  2. The allocation pointed to by x contains at least 40 bytes.
  3. The statement int* y = x + 10 causes undefined behavior.
  4. The statement int* y = x + 11 causes undefined behavior.
  5. None of the above.

QUESTION MISC-10C. Size and alignment: list all that apply.

  1. Pointers have size 8.
  2. sizeof(int) == 4.
  3. sizeof(T) is a multiple of alignof(T) for all T.
  4. Given an array T x[N], sizeof(x) == sizeof(T) * N.
  5. None of the above.

QUESTION MISC-10D. Registers: list all that apply.

  1. Every function must restore all processor registers to their original values on exit.
  2. Function return values are stored on the stack.
  3. %rax and %eax refer to (parts of) the same register.
  4. %rbp and %ebx refer to (parts of) the same register.
  5. None of the above.

QUESTION MISC-10E. Return address: list all that apply.

  1. A function’s return address is known at compile time.
  2. A function’s return address is stored on the stack.
  3. A function’s return address is protected from modification by the processor.
  4. A function’s return address is the address of the call instruction that invoked the function.
  5. None of the above.

QUESTION MISC-10F. Stdio cache: list all that apply.

  1. The stdio cache has size 10000 bytes by default.
  2. The stdio cache behaves similarly for pipes, disk files, and the terminal.
  3. The stdio cache is stored in the buffer cache.
  4. The stdio cache is stored in registers.
  5. None of the above.

QUESTION MISC-10G. Protected control transfer: list all that apply.

  1. Protected control transfer refers to the transfer of control from an unprivileged process to the kernel.
  2. An unprivileged process can initiate a protected control transfer that executes any instruction in machine memory.
  3. Executing a protected control transfer is about as expensive as calling a function.
  4. System calls are typically initiated by faults.
  5. None of the above.

QUESTION MISC-10H. Virtual memory: list all that apply.

  1. Virtual memory requires processor support.
  2. Virtual memory involves page tables.
  3. Entries in page table data structures contain virtual addresses.
  4. Entries in page table data structures include flags that are not related to (i.e., part of) addresses.
  5. None of the above.

QUESTION MISC-10I. System calls: list all that apply.

  1. waitpid might or might not block depending on its arguments.
  2. pipe might or might not block depending on its arguments.
  3. The kill system call relates to signals.
  4. The dup2 system call can increase the number of open file descriptors in a process.
  5. None of the above.

QUESTION MISC-10J. Pipes and file descriptors: list all that apply.

  1. All of a process’s file descriptors that refer to the same file share the same file position.
  2. All processes in a pipeline generally share one standard input.
  3. All processes in a pipeline generally share one standard error.
  4. STDIN_FILENO == 0.
  5. None of the above.

QUESTION MISC-10K. Threads and address spaces: list all that apply.

  1. Threads within the same process can share the same address space.
  2. Threads from different processes can share the same address space.
  3. Each thread has its own stack and set of registers.
  4. Accessing a thread’s stack from a different thread raises a segmentation fault.
  5. None of the above.