Sample midterm solutions
First try to answer the questions yourself!
1. Sizes and alignments
QUESTION 1A. True or false: For any type X, the size of X
(sizeof(X)
) is greater than or equal to the alignment of type X.
True
QUESTION 1B. True or false: For any type X, the size of
struct Y { X a; char newc; }
is greater than the size of X.
True
QUESTION 1C. True or false: For any types A1
...An
(with n
≥
1), the size of struct Y
is greater than the size of struct X
,
given:
struct X { struct Y {
A1 a1; A1 a1;
... ...
An an; An an;
}; char newc;
};
False (example: A1 = int
, A2 = char
)
QUESTION 1D. True or false: For any types A1
...An
(with n
≥
1), the size of struct Y
is greater than the size of union X
, given:
union X { struct Y {
A1 a1; A1 a1;
... ...
An an; An an;
}; };
False (if n = 1
)
2. Expressions
3. Hello binary
This problem locates 8-bit numbers horizontally and vertically in the following 16x16 image. Black pixels represent 1 bits and white pixels represent 0 bits. For horizontal arrangements, the most significant bit is on the left as usual. For vertical arrangements, the most significant bit is on top.
Examples: The 8-bit number 15 (hexadecimal 0x0F, binary 0b00001111) is located horizontally at 3,4, which means X=3, Y=4.
- The pixel at 3,4 is white, which has bit value 0.
- 4,4 is white, also 0.
- 5,4 is white, also 0.
- 6,4 is white, also 0.
- 7,4 is black, which has bit value 1.
- 8,4, 9,4, and 10,4 are black, giving three more 1s.
- Reading them all off, this is 0b00001111, or 15.
15 is also located horizontally at 7,6.
The 8-bit number 0 is located vertically at 0,0. It is also located horizontally at 0,0 and 1,0.
The 8-bit number 134 (hexadecimal 0x86, binary 0b10000110) is located vertically at 8,4.
QUESTION 3A. Where is 3 located vertically? (All questions refer to 8-bit numbers.)
9,6
QUESTION 3B. Where is 12 located horizontally?
5,5
QUESTION 3C. Where is 255 located vertically?
14,3
4. Hello memory
Shintaro Tsuji wants to represent this image in computer memory. He stores it in an array of 16-bit unsigned integers:
uint16_t cute[16];
Row Y of the image is stored in integer cute[Y]
.
QUESTION 4A. What is sizeof(cute)
, 2, 16, 32, or 64?
32
QUESTION 4B. printf("%d\n", cute[0]);
prints 16384
. Is
Shintaro’s machine big-endian or little-endian?
Little-endian
5. Hello program
6. Memory regions
Consider the following program:
struct ptrs {
int **x;
int *y;
};
struct ptrs global;
void setup(struct ptrs *p) {
int *a = malloc(sizeof(int));
int *b = malloc(sizeof(int));
int *c = malloc(sizeof(int));
int *d = malloc(sizeof(int));
int *e = malloc(sizeof(int) * 2);
int **f = malloc(4 * sizeof(int *));
int **g = malloc(sizeof(int *));
*a = 0;
*b = 0;
*c = (int) a;
*d = *b;
e[0] = 29;
e[1] = (int) &d[100000];
f[0] = b;
f[1] = c;
f[2] = 0;
f[3] = 0;
*g = c;
global.x = f;
global.y = e;
p->x = g;
p->y = &e[1];
}
int main(int argc, char **argv) {
stack_bottom = (char *) &argc;
struct ptrs p;
setup(&p);
m61_collect();
do_stuff(&p);
}
This program allocates objects a
through g
on the heap and then
stores those pointers in some stack and global variables. (It then calls
our conservative garbage collector from class, but that won’t matter
until the next problem.) We recommend you draw a picture of the state
setup
creates.
QUESTION 6A. Assume that (uintptr_t) a == 0x8300000
, and that
malloc
returns increasing addresses. Match each address to the most
likely expression with that address value. The expressions are evaluated
within the context of main
. You will not reuse an expression.
Value | Expression | |||
---|---|---|---|---|
1. | 0x8300040 | A. | &p |
|
2. | 0x8049894 | B. | (int *) *p.x[0] |
|
3. | 0x8361AF0 | C. | &global.y |
|
4. | 0x8300000 | D. | global.y |
|
5. | 0xBFAE0CD8 | E. | (int *) *p.y |
1—D; 2—C; 3—E; 4—B; 5—A
Since p
has automatic storage duration, it is located on the stack,
giving us 5—A. The global
variable has static storage duration, and so
does its component global.y
; so the pointer &global.y
has an address
that is below all heap-allocated pointers. This gives us 2—C. The
remaining expressions go like this:
global.y == e |
p.y == &e[1] , so *p.y == e[1] == (int) &d[100000] , and (int *) *p.y == &d[100000] |
p.x == g , so p.x[0] == g[0] == *g == c , and *p.x[0] == *c == (int) a |
Address #4 has value 0x8300000, which by assumption is a
’s address;
so 4—B. Address #3 is much larger than the other heap addresses, so
3—E. This leaves 1—D.
7. Garbage collection
Here is the top-level function for the conservative garbage collector we
wrote in class (cs61-lectures/l07/m61-13.c
).
void m61_collect(void) {
char *stack_top = (char *) &stack_top;
// The entire contents of the heap start out unmarked
for (size_t i = 0; i != nmr; ++i)
mr[i].marked = 0;
// Mark all reachable objects, starting with the roots (the stack)
m61_markaccessible(stack_top, stack_bottom - stack_top);
// Free everything that wasn't marked
for (size_t i = 0; i != nmr; ++i)
if (mr[i].marked == 0) {
m61_free(mr[i].ptr);
--i; // m61_free moved different data into this
// slot, so we must recheck the slot
}
}
This garbage collector is not correct because it doesn’t capture all memory roots.
Consider the program from the previous section, and assume that an
object is reachable if do_stuff
can access an address within the
object via variable references and memory dereferences without casts or
pointer arithmetic. Then:
QUESTION 7A. Which reachable objects will m61_collect()
free?
Circle all that apply.
a |
b |
c |
d |
e |
f |
g |
None of these |
b
, f
.
The collector searches the stack for roots. This yields just the values
in struct ptrs p
(the only pointer-containing variable with automatic
storage duration at the time m61_collect
is called). The objects
directly pointed to by p
are g
and e
. The collector then
recursively marks objects pointed to by these objects. From g
, it
finds c
. From e
, it finds nothing. Then it checks one more time.
From c
, it finds the value of a
! Now, a
is actually not a pointer
here—the type of *c
is int
—so by the definition above, a
is not
actually reachable. But the collector doesn’t know this.
Putting it together, the collector marks a
, c
, e
, and g
. It
won’t free these objects; it will free the others (b
, d
, and f
).
But b
and f
are reachable from global
.
QUESTION 7B. Which unreachable objects will m61_collect()
not
free? Circle all that apply.
a |
b |
c |
d |
e |
f |
g |
None of these |
a
QUESTION 7C. Conservative garbage collection in C is often slower than precise garbage collection in languages such as Java. Why? Circle all that apply.
- C is generally slower than other languages.
- Conservative garbage collectors must search all reachable memory for pointers. Precise garbage collectors can ignore values that do not contain pointers, such as large character buffers.
- C programs generally use the heap more than programs in other languages.
- None of the above.
#2
8. Data structures
9. Attack
QUESTION 9A. True or false: A successful stack smashing attack
against a process can cause that process to execute a cli
instruction.
True
QUESTION 9B. True or false: Buffer overflow attacks would be neutralized if the processor protected stack return addresses so that they couldn’t be overwritten.
False (there are heap overflow attacks)
QUESTION 9C. “Canary” values can help detect stack smashing attacks before the attacker’s code gains control. Before returning, a function compares one copy of the canary with a reserved copy; if they differ, the stack was smashed. Which properties are valuable in canary values? Circle all that apply.
- Value predictable in advance
- One copy located close to the return address on the stack
- Both copies located close to the return address on the stack
- Inexpensive to compare
- None of the above
#2 and #4
10. Memory errors
The following function constructs and returns a lower-triangular matrix of size N. The elements are random 2-dimensional points in the unit square. The matrix is represented as an array of pointers to arrays.
typedef struct point2 {
double d[2];
} point2;
typedef point2 *point2_vector;
point2_vector *make_random_lt_matrix(size_t N) {
point2_vector *m = (point2_vector *) malloc(sizeof(point2_vector) * N);
for (size_t i = 0; i < N; ++i) {
m[i] = (point2 *) malloc(sizeof(point2) * (i + 1)); /* LINE A */
for (size_t j = 0; j <= i; ++j)
for (int d = 0; d < 2; ++d)
m[i][j].d[d] = drand48(); /* LINE B */
}
return m;
}
This code is running on an x86 machine (size_t
is 32 bits). You may
assume that the machine has enough free physical memory and the process
has enough available virtual address space to satisfy any malloc()
call.
QUESTION 10A. Give a value of N so that, while
make_random_lt_matrix(N)
is running, no malloc()
returns NULL
, but
a memory error (such as a null pointer dereference or an out-of-bounds
dereference) happens on Line A. The memory error should happen
specifically when i == 1
.
(This problem is probably easier when you write your answer in hexadecimal.)
We are asked to produce a value of N so that no memory error happens
on Line A when i == 0
, but a memory error does happen when i == 1
.
So reason that through. What memory errors could happen on Line A if
malloc()
returns non-NULL
? There’s only one memory operation, namely
the dereference m[i]
. Perhaps this dereference is out of bounds.
If no memory error happens when i == 0
, then a m[0]
dereference must
not cause a memory error. So the m
object must contain at least 4
bytes. But a memory error does happen on Line A when i == 1
. So the
m
object must contain less than 8 bytes. How many bytes were
allocated for m
?
sizeof(point2_vector) * N == sizeof(point2 *) * N == 4 * N
. So we
have:
(4 * N)
≥ 4(4 * N)
< 8
It seems like the only possible answer is N == 1
. But no, this doesn’t
cause a memory error, because the loop body would never be executed with
i == 1
!
The key insight is that the multiplications above use 32-bit unsigned
computer arithmetic. Let’s write N
as X + 1
. Then these inequalities
become:
- 4 ≤ (
4 * (X + 1)
=4 * X + 4
) < 8 - 0 ≤
(4 * X)
< 4
(Multiplication distributes over addition in computer arithmetic.) What
values of X
satisfy this inequality? It might be easier to see if we
remember that multiplication by powers of two is equivalent to shifting:
- 0 ≤
(X << 2)
< 4
The key insight is that this shift eliminates the top two bits of X
.
There are exactly four values for X
that work: 0
, 0x40000000
,
0x80000000
, and 0xC0000000
. For any of these, 4 * X == 0
in 32-bit
computer arithmetic, because 4×X
= 0 (mod 232) in normal
arithmetic.
Plugging X
back in to N
, we see that
N ∈ {0x40000001, 0x80000001, 0xC0000001}
. These are the only values
that work.
Partial credit was awarded for values that acknowledged the possibility of overflow.
QUESTION 10B. Give a value of N so that no malloc()
returns
NULL
, and no memory error happens on Line A, but a memory error does
happen on Line B.
If no memory error happens on Line A, then N
< 230
(otherwise overflow would happen as seen above). But a memory error does
happen on Line B. Line B dereferences m[i][j]
, for 0 ≤ j
≤ i
; so
how big is m[i]
? It was allocated on Line A with size
sizeof(point2) * (i + 1) == 2 * sizeof(double) * (i + 1) == 16 * (i + 1)
.
If i + 1
≥ 232 / 16 = 228, this multiplication
will overflow. Since i < N
, we can finally reason that any N
greater
than or equal to 228 = 0x10000000
and less than
230 = 0x40000000
will cause the required memory error.
11. Caches and reference strings
QUESTION 11A. True or false: A direct-mapped cache with N or more slots can handle any reference string containing ≤N distinct addresses with no misses except for cold misses.
False. Direct-mapped caches can have conflict misses.
QUESTION 11B. True or false: A fully-associative cache with N or more slots can handle any reference string containing ≤N distinct addresses with no misses except for cold misses.
True
Consider the following 5 reference strings.
Name | String |
---|---|
α | 1 |
β | 1, 2 |
γ | 1, 2, 3, 4, 5 |
δ | 2, 4 |
ε | 5, 2, 4, 2 |
QUESTION 11C. Which of the strings might indicate a sequential access pattern? Circle all that apply.
α | β | γ | δ | ε | None of these |
(α), β, γ
QUESTION 11D. Which of the strings might indicate a strided access pattern with stride >1? Circle all that apply.
α | β | γ | δ | ε | None of these |
(α), δ
One very clever person pointed out that β and γ could also represent large strides: for example, consider a file with 10 bytes accessed with stride 11!
The remaining questions concern concatenated permutations of these five strings. For example, the permutation αβγδε refers to this reference string:
1, 1, 2, 1, 2, 3, 4, 5, 2, 4, 5, 2, 4, 2. |
We pass such permutations through an initially-empty, fully-associative cache with 3 slots, and observe the numbers of hits.
QUESTION 11E. How many cold misses might a permutation observe? Circle all that apply.
0 | 1 | 2 | 3 | 4 | 5 | Some other number |
5. The first time a reference string address is encountered, it must cause a cold miss.
Under LRU eviction, the permutation αβεγδ observes 5 hits as follows. (We annotate each access with “h” for hit or “m” for miss.)
1m; 1h, 2m; 5m, 2h, 4m, 2h; 1m, 2h, 3m, 4m, 5m; 2m, 4h. |
QUESTION 11F. How many hits does this permutation observe under FIFO eviction?
4 hits.
QUESTION 11G. Give a permutation that will observe 8 hits under LRU eviction, which is the maximum for any permutation. There are several possible answers. (Write your answer as a permutation of αβγδε. For partial credit, find a permutation that has 7 hits, etc.)
The following four permutations observe 8 hits under LRU: αβγδε, αβγεδ, βαγδε, βαγεδ. 28 permutations observe 7 hits; 25 observe 6 hits; and 38 observe 5 hits.
QUESTION 11H. Give a permutation that will observe 2 hits under LRU eviction, which is the minimum for any permutation. There is one unique answer. (Write your answer as a permutation of αβγδε. For partial credit, find a permutation that has 3 hits, etc.)
δαεγβ. 4 permutations observe 3 hits and 20 observe 4 hits.
12. Cache coherence
QUESTION 12A. True or false: The buffer cache is coherent with respect to disk data.
True. Though at times the buffer cache may contain newer data than the disk, that’s still OK.
QUESTION 12B. True or false: An x86 processor cache is coherent with respect to primary memory.
True. The reasoning is similar.
QUESTION 12C. True or false: The standard I/O package’s data cache is coherent with respect to file data.
False. Stdio cache data might be either newer or older than file data.