In today’s section, we’ll walk through a selection of our kernel exercises, which are problems taken from prior exams. Also take the opportunity to ask questions about the problem set and class material.
All exercises assume a 64-bit x86-64 architecture unless otherwise stated.
The main subjects of these exercises are:
- KERN-1: Virtual memory.
- KERN-3: Kernel programming.
- KERN-8: More virtual memory.
- KERN-14: A sample page table.
KERN-1. Virtual memory
QUESTION KERN-1A. What is the x86-64 page size? Circle all that apply.
- 4096 bytes
- 64 cache lines
- 256 words
0x1000
bytes- 216 bits
- None of the above
The following questions concern the sizes of page tables. Answer the questions in units of pages. For instance, the page tables in WeensyOS each contained one level-4 page table page (the highest level, corresponding to address bits 39-47); one level-3 page table page; one level-2 page table page; and two level-1 page table pages, for a total size of 5 pages per page table.
QUESTION KERN-1B. What is the maximum size (in pages) of an x86-64 page table (page tables only, not destination pages)? You may write an expression rather than a number.
QUESTION KERN-1C. What is the minimum size (in pages) of an x86-64 page table that would allow a process to access 221 distinct physical addresses?
The 64-bit x86-64 architecture is an extension of the 32-bit x86 architecture, which used 32-bit virtual addresses and 32-bit physical addresses. But before 64 bits came along, Intel extended 32-bit x86 in a more limited way called Physical Address Extension (PAE). Here’s how they differ.
- PAE allows 32-bit machines to access up to 252 bytes of physical memory (which is about 4000000 GB). That is, virtual addresses are 32 bits, and physical addresses are 52 bits.
- The x86-64 architecture evolves the x86 architecture to a 64-bit word size. x86-64 pointers are 64 bits wide instead of 32. However, only 48 of those bits are meaningful: the upper 16 bits of each virtual address are ignored. Thus, virtual addresses are 48 bits. As with PAE, physical addresses are 52 bits.
QUESTION KERN-1D. Which of these two machines would support a higher number of concurrent processes without performance problems?
- x86-32 with PAE with 100 GB of physical memory.
- x86-64 with 20 GB of physical memory.
QUESTION KERN-1E. Which of these two machines would support a higher maximum number of threads per process?
- x86-32 with PAE with 100 GB of physical memory.
- x86-64 with 20 GB of physical memory.
KERN-3. Kernel programming
WeensyOS processes are quite isolated: the only way they can communicate is by using the console. Let’s design some system calls that will allow processes to explicitly share pages of memory. Then the processes can communicate by writing and reading the shared memory region. Here are two new WeensyOS system calls that allow minimal page sharing; they return 0 on success and –1 on error.
int share(pid_t p, void* addr)
Allow processp
to access the page at addressaddr
.int attach(pid_t p, void* remote_addr, void* local_addr)
Access the page in processp
’s address space at addressremote_addr
. That physical page is added to the calling process’s address space at addresslocal_addr
, replacing any page that was previously mapped there. It is an error ifp
has not shared the page atremote_addr
with the calling process.
Here’s an initial implementation of these system calls, written as clauses in
the WeensyOS kernel’s syscall
function.
case SYSCALL_SHARE: {
pid_t p = current->regs.reg_rdi;
uintptr_t addr = current->regs.reg_rsi;
/* [A] */
int shindex = current->nshared;
if (shindex >= MAX_NSHARED) {
return -1;
}
/* [B] */
++current->nshared;
current->shared[shindex].sh_addr = addr;
current->shared[shindex].sh_partner = p;
return 0;
}
case SYSCALL_ATTACH: {
pid_t p = current->regs.reg_rdi;
uintptr_t remote_addr = current->regs.reg_rsi;
uintptr_t local_addr = current->regs.reg_rdx;
/* [C] */
int shindex = -1;
for (int i = 0; i < processes[p].nshared; ++i) {
if (processes[p].shared[i].sh_addr == remote_addr
&& processes[p].shared[i].sh_partner == current->p_pid) {
shindex = i;
}
}
if (shindex == -1) {
return -1;
}
/* [D] */
vmiter it(processes[p].pagetable, remote_addr);
/* [E] */
vmiter(current->pagetable, local_addr).map(it.pa(), PTE_P | PTE_W | PTE_U);
/* [F] */
return 0;
}
Some notes:
- The implementation stores sharing records in an array. A process may
call
share
successfully at mostMAX_NSHARED
times. After that, its futureshare
calls will return an error. processes[p].nshared
is initialized to 0 for all processes.- Assume that WeensyOS has been implemented as in Problem Set 3 up through step 7 (shared read-only memory).
QUESTION KERN-3A. True or false: Given this implementation, a single
WeensyOS process can cause the kernel to crash simply by calling share
one or more times (with no process ever calling attach
). If true, give
an example of a call or calls that would likely crash the kernel.
QUESTION KERN-3B. True or false: Given this implementation, a single
WeensyOS process can cause the kernel to crash simply by calling
attach
one or more times (with no process ever calling share
). If
true, give an example of a call or calls that would likely crash the
kernel.
QUESTION KERN-3C. True or false: Given this implementation, WeensyOS
processes 2 and 3 could work together to obtain write access to the
kernel code located at address KERNEL_START_ADDR
. If true, give an
example of calls that would obtain this access.
QUESTION KERN-3D. True or false: Given this implementation, WeensyOS processes 2 and 3 could work together to obtain write access to any memory, without crashing or modifying kernel code or data. If true, give an example of calls that would obtain access to a page mapped at address 0x110000 in process 5.
QUESTION KERN-3E. True or false: Given this implementation, WeensyOS
child processes 2 and 3 could work together to modify the code run by a
their shared parent, process 1, without crashing or modifying kernel
code or data. If true, give an example of calls that would obtain write
access to process 1’s code, which is mapped at address
PROC_START_ADDR
.
QUESTION KERN-3F. Every “true” answer to the preceding questions is a bug in WeensyOS’s process isolation. Fix these bugs. Write code snippets that address these problems, and say where they go in the WeensyOS code (for instance, you could refer to bracketed letters to place your snippets); or for partial credit describe what your code should do.
KERN-8. More virtual memory
QUESTION KERN-8A. What kind of address is stored in x86-64 register
%cr3
, virtual or physical?
QUESTION KERN-8B. What kind of address is stored in x86-64 register
%rip
, virtual or physical?
QUESTION KERN-8C. What kind of address is stored in an x86-64 page table entry, virtual or physical?
QUESTION KERN-8D. What is the x86-64 word size in bits?
Many paged-virtual-memory architectures can be characterized in terms of the PLX constants:
- P = the length of the page offset, in bits.
- L = the number of different page indexes (equivalently, the number of page table levels).
- X = the length of each page index, in bits.
QUESTION KERN-8E. What are the numeric values for P, L, and X for x86-64?
Assume for the remaining parts that, as in x86-64, each page table page fits within a single page, and each page table entry holds an address and some flags, including a Present flag.
QUESTION KERN-8F. Write a PLX formula for the number of bytes per page, using both mathematical and C notation.
Mathematical notation: _____________________
C notation: _____________________________
QUESTION KERN-8G. Write a PLX formula for the number of meaningful bits in a virtual address.
QUESTION KERN-8H. Write a PLX formula that is an upper bound on the number of bits in a physical address. (Your upper bound should be relatively tight; PX100L is a bad answer.)
QUESTION KERN-8I. Write a PLX formula for the minimum number of pages it would take to store a page table that allows access to 2X distinct destination physical pages.
KERN-14. A sample page table
QUESTION KERN-14A. Describe contents of x86-64 physical memory that would ensure virtual addresses 0x10'0ff3 through 0x10'11f2 map to a contiguous range of 512 physical addresses 0x9'9999'9ff3 through 0x9'9999'a1f2. We’ve given you the contents of 2 words of physical memory; you should list more. Assume and/or recall:
- The
%cr3
register holds the value 0x20000. - You may change any part of memory.
(PTE_P | PTE_U | PTE_W) == 0x7
.
Physical address | ⟶ | Content |
---|---|---|
0x1f000 | ⟶ | 0x202007 |
0x20000 | ⟶ | 0x1f007 |