CS61 Midterm 2014 Solutions
Rules
You have 83 minutes to complete this midterm.
Do your work in the exam booklet. Use the backs of pages if you need to, but write your answers on the fronts.
Some problems are harder than others and they are not ordered by difficulty. Look over all the problems and use your time wisely. You have about 2.3 minutes per problem part.
The exam is open book, open note, open computer. You may access the book, and your own notes in paper form. You may also use a computer or equivalent to access class materials. However, you may not access non-class materials except as explicitly allowed below, and you may not write programs to answer questions. Specifically:
- You may access a browser and a PDF reader.
- You may access your own notes electronically.
- You may access an Internet site on which your own notes are stored.
- You may access system manual pages (
man
). - You may access the cs61.seas.harvard.edu/wiki/2014 site.
But:
- You may not access Google or Wikipedia or anything else except as directly linked from the cs61.seas.harvard.edu/wiki/2014 site.
- You may not access Piazza.
- You may not access course videos.
- You may not run a C compiler, assembler, on-line disassembler, calculator, or anything similar. Simple reading applications only.
- You may not access other students’ notes.
- You absolutely may not contact other humans via IM or anything like it.
- You may not access solutions from any previous exam, by paper or computer, except for those linked on the course wiki.
Any violations of this policy, or the spirit of this policy, are breaches of academic honesty and will be treated accordingly. Please appreciate our flexibility and behave honestly and honorably.
Name
Your name: | ____________________________________________________________________ |
Your student ID (HUID/DCE ID): | ____________________________________________________________________ |
You must sign the following statement for us to grade the exam:
I have neither given nor received inappropriate help on this exam.
_________________________________________________________________ __________________________ Signature Date
0. Ground rules
Assume a 32-bit x86 architecture unless explicitly told otherwise.
Write your assumptions if a problem seems unclear, and write down your reasoning for partial credit.
1. Data representation (12 points)
QUESTION 1A. Arrange the following values in increasing numeric
order. Assume that x
is an int
with value 8192.
|
|
A possible answer might be “a < b < c = d < e < f < g < h.”
4 points
h < a = d = f < b < c < e < g
For each of the remaining questions, write one or more arguments that, when passed to the provided function, will cause it to return the integer 61 (which is 0x3d hexadecimal). Write the expected number of arguments of the expected types.
QUESTION 1B.
int f1(int n) {
return 0x11 ^ n;
}
2 points. 0x2c == 44
QUESTION 1C.
int f2(const char* s) {
return strtol(s, NULL, 0);
}
2 points. "61"
, " 0x3d"
, " 61 "
, etc.
QUESTION 1D. Your answer should be different from the previous answer.
int f3(const char* s) {
return strtol(s, NULL, 0);
}
2 points
QUESTION 1E. For this problem, you will also need to define a global variable. Give its type and value.
f4:
movl 4(%esp), %eax # 4(%esp) is the first argument to the function
andl $5, %eax
addl %eax, %eax
addl 8(%esp), %eax # 8(%esp) is the second argument to the function
movzbl y, %edx
subl %edx, %eax
ret
2 points. This code was compiled from:
int f4(int a, int b) {
extern unsigned char y;
return (a & 5) * 2 + b - y;
}
A valid solution is a=0, b=61, unsigned char y
=0.
2. Sizes and alignments (12 points)
QUESTION 2A. Use the following members to create a struct of size
16. You will use each member exactly once, and char a
comes first.
char a;
(we’ve written this for you)unsigned char b;
short c;
int d;
struct size_16 {
char a;
};
Impossible! Bad question; we should have worded it to allow “impossible” to be an expected answer.
QUESTION 2B. Repeat Question 2A, but create a struct with size 12.
struct size_12 {
char a;
};
2 points. abdc, acbd, acdb
QUESTION 2C. Repeat Question 2A, but create a struct with size 8.
struct size_8 {
char a;
};
4 points. abcd
QUESTION 2D. Consider the following structs:
struct x {
T x1;
U x2;
};
struct y {
struct x y1;
V y2;
};
Give definitions for T, U, and V so that there is one byte of padding in
struct x
after x2
, and two bytes of padding in struct y
after
y1
.
4 points. Example: T = short[2]; U = char; V = int
3. Dynamic memory allocation (16 points)
QUESTION 3A. True or false?
free(NULL)
is an error.malloc(0)
can never returnNULL
.
3 points. False, False
QUESTION 3B. Give values for sz
and nmemb
so that
calloc(sz, nmemb)
will always return NULL
(on a 32-bit x86 machine),
but malloc(sz * nmemb)
might or might not return null.
3 points. (size_t) -1, (size_t) -1
—anything that causes an overflow
Consider the following 8 statements. (p
and q
have type char*
.)
free(p);
free(q);
p = q;
q = NULL;
p = (char*) malloc(12);
q = (char*) malloc(8);
p[8] = 0;
q[4] = 0;
QUESTION 3C. Put the statements in an order that would execute without error or evoking undefined behavior. Memory leaks count as errors. Use each statement exactly once. Sample answer: “abcdefgh.”
POINT TOTALS: 3C-F are graded together.
- 10 points for 4/4 correct
- 9 points for 3/4 correct
- 7 points for 2/4 correct
- 4 points for 1/4 correct
- Additional partial credit is allowed
cdefghab (and others). Expect “OK”
QUESTION 3D. Put the statements in an order that would cause one double-free error, and no other error or undefined behavior (except possibly one memory leak). Use each statement exactly once.
efghbcad (and others). Expect “double-free + memory leak”
QUESTION 3E. Put the statements in an order that would cause one memory leak (one allocated piece of memory is not freed), and no other error or undefined behavior. Use each statement exactly once.
efghadbc (and others). Expect “memory leak”
QUESTION 3F. Put the statements in an order that would cause one boundary write error, and no other error or undefined behavior. Use each statement exactly once.
eafhcgbd (and others). Expect “out of bounds write”
4. Pointers and debugging allocators (12 points)
You are debugging some students’ m61
code from Problem Set 1. The
codes use the following metadata:
typedef struct meta { ...
struct meta* next;
struct meta* prev;
} meta;
meta* mhead; // head of active allocations list
Their linked-list manipulations in m61_malloc
are similar.
void* m61_malloc(size_t sz, const char* file, int line) {
...
meta* m = (meta*) ptr;
m->next = mhead;
m->prev = NULL;
if (mhead)
mhead->prev = m;
mhead = m;
...
}
But their linked-list manipulations in m61_free
differ.
Alice’s code:
|
Bob’s code:
|
Chris’s code:
|
Donna’s code:
|
You may assume that all code not shown is correct.
QUESTION 4A. Whose code will segmentation fault on this input? List all students that apply.
int main() {
void* ptr = malloc(1);
free(ptr);
}
Grade 4A-4D together. Point assignments:
- 4/4 correct: 12 points
- 3/4 correct: 10 points
- 2/4 correct: 7 points
- 1/4 correct: 4 points
Chris
QUESTION 4B. Whose code might report something like
“invalid free of pointer [ptr1], not allocated
” on this input?
(Because a list traversal starting from mhead
fails to find ptr1
.)
List all students that apply. Don’t include students whose code would
segfault before the report.
int main() {
void* ptr1 = malloc(1);
void* ptr2 = malloc(1);
free(ptr2);
free(ptr1); // <- message printed here
}
Alice
QUESTION 4C. Whose code would improperly report something like
“LEAK CHECK: allocated object [ptr1] with size 1
” on this input?
(Because the mhead
list appears not empty, although it should be.)
List all students that apply. Don’t include students whose code would
segfault before the report.
int main() {
void* ptr1 = malloc(1);
free(ptr1);
m61_printleakreport();
}
Bob
QUESTION 4D. Whose linked-list code is correct for all inputs? List all that apply.
Donna
5. Arena allocation (15 points)
Chimamanda Ngozi Adichie is a writing a program that needs to allocate and free a lot of nodes, where a node is defined as follows:
typedef struct node {
int key;
void* value;
struct node* left;
struct node* right; // also used in free list
} node;
She uses a variant of the arena allocator we saw in Lectures 5 and 6. Here’s part of the code.
typedef struct arena_group {
struct arena_group* next_group;
node nodes[1024];
} arena_group;
typedef struct arena {
node* frees;
arena_group* groups;
} arena;
node* node_alloc(arena* a) {
if (!a->frees) {
arena_group* g = (arena_group*) malloc(sizeof(arena_group));
` // ... link `g` to `a->groups` ... `
for (size_t i = 0; i != 1023; ++i)
g->nodes[i].right = &g->nodes[i + 1];
g->nodes[1023].right = NULL;
a->frees = &g->nodes[0];
}
node* n = a->frees;
a->frees = n->right;
return n;
}
void node_free(arena* a, node* n) {
n->right = a->frees;
a->frees = n;
}
QUESTION 5A. True or false?
- This allocator never has external fragmentation.
- This allocator never has internal fragmentation.
3 points. True, True
QUESTION 5B. Chimamanda’s frenemy Paul Auster notices that if many nodes are allocated right in a row, every 1024th allocation seems much more expensive than the others. The reason is that every 1024th allocation initializes a new group, which in turn adds 1024 nodes to the free list. Chimamanda decides instead to allow a single element of the free list to represent many contiguous free nodes. The average allocation might get a tiny bit slower, but no allocation will be much slower than average. Here’s the start of her idea:
node* node_alloc(arena* a) {
if (!a->frees) {
arena_group* g = (arena_group*) malloc(sizeof(arena_group));
` // ... link `g` to `a->groups` ... `
g->nodes[0].key = 1024; // g->nodes[0] is the 1st of 1024 contiguous free nodes
g->nodes[0].right = NULL;
a->frees = &g->nodes[0];
}
node* n = a->frees;
// ???
return n;
}
Complete this function by writing code to replace // ???
.
3 points
if (n->key == 1)
a->frees = n->right;
else {
a->frees = n + 1;
a->frees->key = n->key - 1;
a->frees->right = n->right;
}
Another solution:
if (n->right)
a->frees = n->right;
else if (n->key == 1)
a->frees = NULL;
else {
a->frees = n + 1;
a->frees->key = n->key - 1;
}
QUESTION 5C. Write a node_free
function that works with the
node_alloc
function from the previous question.
void node_free(arena* a, node* n) {
3 points
n->right = a->frees;
n->key = 1;
a->frees = n;
Or, if you use the solution above:
n->right = a->frees;
a->frees = n;
}
QUESTION 5D. Complete the following new function.
` // Return the arena_group containing node `n`. `n` must be a node returned by `
` // a previous call to `node_alloc(a)`. `
arena_group* node_find_group(arena* a, node* n) {
for (arena_group* g = a->groups; g; g = g->next_group) {
3 points
if (&g->nodes[0] <= n && n <= &g->nodes[1023])
return g;
}
return NULL;
}
QUESTION 5E. Chimamanda doesn’t like that the node_find_group
function from Question 5D takes O(G) time, where G
is the number of allocated arena_groups. She remembers a library
function that might help, posix_memalign
:
int posix_memalign(void** memptr, size_t alignment, size_t size);
The function posix_memalign() allocates size
bytes and places the
address of the allocated memory in *memptr
. The address of the
allocated memory will be a multiple of alignment
, which must be a
power of two and a multiple of sizeof(void *)
. ...
“Cool,” she says, “I can use this to speed up node_find_group
!” She
now allocates a new group with the following code:
arena_group* g;
int r = posix_memalign(&g, 32768, sizeof(arena_group));
assert(r == 0); // posix_memalign succeeded
Given this allocation strategy, write a version of node_find_group
that takes O(1) time.
arena_group* node_find_group(arena* a, node* n) {
3 points
uintptr_t n_addr = (uintptr_t) n;
return (arena_group*) (n_addr - n_addr % 32768);
}
6. IO caching and strace (18 points)
Elif Batuman is investigating several program executables left behind by
her ex-roommate Fyodor. She runs each executable under strace
in the
following way:
strace -o strace.txt ./EXECUTABLE files/text1meg.txt > files/out.txt
Help her figure out properties of these programs based on their system call traces.
QUESTION 6A. Program ./mysterya
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x8193000
brk(0x81b5000) = 0x81b5000
read(3, "A", 1) = 1
write(1, "A", 1) = 1
read(3, "\n", 1) = 1
write(1, "\n", 1) = 1
read(3, "A", 1) = 1
write(1, "A", 1) = 1
read(3, "'", 1) = 1
write(1, "'", 1) = 1
read(3, "s", 1) = 1
write(1, "s", 1) = 1
...
Circle at least one option in each column.
|
|
|
|
3 points. 1, a, i, D
QUESTION 6B. Program ./mysteryb
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x96c5000
brk(0x96e6000) = 0x96e6000
read(3, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 2048) = 2048
write(1, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 2048) = 2048
read(3, "kad\nAkron\nAkron's\nAl\nAl's\nAla\nAl"..., 2048) = 2048
write(1, "kad\nAkron\nAkron's\nAl\nAl's\nAla\nAl"..., 2048) = 2048
...
Circle at least one option in each column.
|
|
|
|
3 points. 1, b/c, ii, B
QUESTION 6C. Program ./mysteryc
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x9064000
brk(0x9085000) = 0x9085000
fstat64(3, {st_mode=S_IFREG|0664, st_size=1048576, ...}) = 0
lseek(3, 1046528, SEEK_SET) = 1046528
read(3, "ingau\nRheingau's\nRhenish\nRhianno"..., 2048) = 2048
write(1, "oR\ntlevesooR\ns'yenooR\nyenooR\ns't"..., 2048) = 2048
lseek(3, 1044480, SEEK_SET) = 1044480
read(3, "Quinton\nQuinton's\nQuirinal\nQuisl"..., 2048) = 2048
write(1, "ehR\neehR\naehR\ns'hR\nhR\nsdlonyeR\ns"..., 2048) = 2048
lseek(3, 1042432, SEEK_SET) = 1042432
read(3, "emyslid's\nPrensa\nPrensa's\nPrenti"..., 2048) = 2048
write(1, "\ns'nailitniuQ\nnailitniuQ\nnniuQ\ns"..., 2048) = 2048
lseek(3, 1040384, SEEK_SET) = 1040384
read(3, "Pindar's\nPinkerton\nPinocchio\nPin"..., 2048) = 2048
write(1, "rP\ndilsymerP\ns'regnimerP\nregnime"..., 2048) = 2048
...
Circle at least one option in each column.
|
|
|
|
3 points. 2, c, ii, B
QUESTION 6D. Program ./mysteryd
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x9a0e000
brk(0x9a2f000) = 0x9a2f000
fstat64(3, {st_mode=S_IFREG|0664, st_size=1048576, ...}) = 0
lseek(3, 1048575, SEEK_SET) = 1048575
read(3, "o", 2048) = 1
lseek(3, 1048574, SEEK_SET) = 1048574
read(3, "Ro", 2048) = 2
lseek(3, 1048573, SEEK_SET) = 1048573
read(3, "\nRo", 2048) = 3
...
lseek(3, 1046528, SEEK_SET) = 1046528
read(3, "ingau\nRheingau's\nRhenish\nRhianno"..., 2048) = 2048
write(1, "oR\ntlevesooR\ns'yenooR\nyenooR\ns't"..., 2048) = 2048
lseek(3, 1046527, SEEK_SET) = 1046527
read(3, "eingau\nRheingau's\nRhenish\nRhiann"..., 2048) = 2048
lseek(3, 1046526, SEEK_SET) = 1046526
read(3, "heingau\nRheingau's\nRhenish\nRhian"..., 2048) = 2048
...
Circle at least one option in each column.
|
|
|
|
3 points. 2, b, ii, B
QUESTION 6E. Program ./mysterye
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x93e5000
brk(0x9407000) = 0x9407000
read(3, "A", 1) = 1
read(3, "\n", 1) = 1
read(3, "A", 1) = 1
...
read(3, "A", 1) = 1
read(3, "l", 1) = 1
write(1, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 1024) = 1024
read(3, "t", 1) = 1
read(3, "o", 1) = 1
read(3, "n", 1) = 1
...
Circle at least one option in each column.
|
|
|
|
3 points. 1, a, ii, C
Some people will circle C and D, because there’s no read cache, so the read cache is “other.” That’s OK.
QUESTION 6F. Program ./mysteryf
:
open("files/text1meg.txt", O_RDONLY) = 3
brk(0) = 0x9281000
brk(0x92a3000) = 0x92a3000
read(3, "A\nA's\nAA's\nAB's\nABM's\nAC's\nACTH'"..., 4096) = 4096
write(1, "A", 1) = 1
write(1, "\n", 1) = 1
write(1, "A", 1) = 1
...
write(1, "A", 1) = 1
write(1, "l", 1) = 1
read(3, "ton's\nAludra\nAludra's\nAlva\nAlvar"..., 4096) = 4096
write(1, "t", 1) = 1
write(1, "o", 1) = 1
write(1, "n", 1) = 1
...
Circle at least one option in each column.
|
|
|
|
3 points. 1, b/c, i, A
7. Processor cache (15 points)
The following questions use the following C definition for an N
xM
matrix (the matrix has N
rows and M
columns).
typedef struct matrix {
unsigned N;
unsigned M;
double elt[0];
} matrix;
matrix* matrix_create(unsigned N, unsigned M) {
matrix* m = (matrix*) malloc(sizeof(matrix) + N * M * sizeof(double));
m->N = N;
m->M = M;
for (size_t i = 0; i < N * M; ++i)
m->elt[i] = 0.0;
return m;
}
Typically, matrix data is stored in row-major order: element
mij (at row i and column
j) is stored in m->elt[i*m->M + j]
. We might write this in C
using an inline function:
inline double* melt1(matrix* m, unsigned i, unsigned j) {
return &m->elt[i * m->M + j];
}
But that’s not the only possible method to store matrix data. Here are several more.
inline double* melt2(matrix* m, unsigned i, unsigned j) {
return &m->elt[i + j * m->N];
}
inline double* melt3(matrix* m, unsigned i, unsigned j) {
return &m->elt[i + ((m->N - i + j) % m->M) * m->N];
}
inline double* melt4(matrix* m, unsigned i, unsigned j) {
return &m->elt[i + ((i + j) % m->M) * m->N];
}
inline double* melt5(matrix* m, unsigned i, unsigned j) {
assert(m->M % 8 == 0);
unsigned k = (i/8) * (m->M/8) + (j/8);
return &m->elt[k*64 + (i % 8) * 8 + j % 8];
}
QUESTION 7A. Which method (of melt1
–melt5
) will have the best
processor cache behavior if most matrix accesses use loops like this?
for (unsigned j = 0; j < 100; ++j)
for (unsigned i = 0; i < 100; ++i)
f(*melt(m, i, j));
7A-7D are graded together.
- 4/4 correct: 9 points
- 3/4 correct: 8 points
- 2/4 correct: 6 points
- 1/4 correct: 4 points
melt2
QUESTION 7B. Which method will have the best processor cache behavior if most matrix accesses use loops like this?
for (unsigned i = 0; i < 100; ++i)
f(*melt(m, i, i));
melt3
QUESTION 7C. Which method will have the best processor cache behavior if most matrix accesses use loops like this?
for (unsigned i = 0; i < 100; ++i)
for (unsigned j = 0; j < 100; ++j)
f(*melt(m, i, j));
melt1
(but melt5
is almost as good!)
QUESTION 7D. Which method will have the best processor cache behavior if most matrix accesses use loops like this?
for (int di = -3; di <= 3; ++di)
for (int dj = -3; dj <= 3; ++dj)
f(*melt(m, I + di, J + dj));
melt5
QUESTION 7E. Here is a matrix-multiply function in ikj order.
matrix* matrix_multiply(matrix* a, matrix* b) {
assert(a->M == b->N);
matrix* c = matrix_create(a->N, b->M);
for (unsigned i = 0; i != a->N; ++i)
for (unsigned k = 0; k != a->M; ++k)
for (unsigned j = 0; j != b->M; ++j)
*melt(c, i, j) += *melt(a, i, k) * *melt(b, k, j);
}
This loop order is cache-optimal when data is stored in melt1
order.
What loop order is cache-optimal for melt2
?
3 points. jki is best; kji is a close second.
QUESTION 7F. You notice that accessing a matrix element using
melt1
is very slow. After some debugging, it seems like the processor
on which you are running code has a very slow multiply instruction.
Briefly describe a change to struct matrix
that would let you write a
version of melt1
with no multiply instruction. You may add members,
change sizes, or anything you like.
3 points. Example answers:
- Add a
double\*\* rows
member that points to each row so you don't need to multiply - Round M up to a power of 2 and use shifts
NOTOC