# Section 7: Kernel exercises

In today’s section, we’ll walk through a selection of our kernel exercises, which are problems taken from prior exams. Also take the opportunity to ask questions about the problem set and class material.

All exercises assume a 64-bit x86-64 architecture unless otherwise stated.

The main subjects of these exercises are:

## KERN-1. Virtual memory

QUESTION KERN-1A. What is the x86-64 page size? Circle all that apply.

1. 4096 bytes
2. 64 cache lines
3. 256 words
4. 0x1000 bytes
5. 216 bits
6. None of the above

The following questions concern the sizes of page tables. Answer the questions in units of pages. For instance, the page tables in WeensyOS each contained one level-4 page table page (the highest level, corresponding to address bits 39-47); one level-3 page table page; one level-2 page table page; and two level-1 page table pages, for a total size of 5 pages per page table.

QUESTION KERN-1B. What is the maximum size (in pages) of an x86-64 page table (page tables only, not destination pages)? You may write an expression rather than a number.

QUESTION KERN-1C. What is the minimum size (in pages) of an x86-64 page table that would allow a process to access 221 distinct physical addresses?

The 64-bit x86-64 architecture is an extension of the 32-bit x86 architecture, which used 32-bit virtual addresses and 32-bit physical addresses. But before 64 bits came along, Intel extended 32-bit x86 in a more limited way called Physical Address Extension (PAE). Here’s how they differ.

• PAE allows 32-bit machines to access up to 252 bytes of physical memory (which is about 4000000 GB). That is, virtual addresses are 32 bits, and physical addresses are 52 bits.
• The x86-64 architecture evolves the x86 architecture to a 64-bit word size. x86-64 pointers are 64 bits wide instead of 32. However, only 48 of those bits are meaningful: the upper 16 bits of each virtual address are ignored. Thus, virtual addresses are 48 bits. As with PAE, physical addresses are 52 bits.

QUESTION KERN-1D. Which of these two machines would support a higher number of concurrent processes without performance problems?

1. x86-32 with PAE with 100 GB of physical memory.
2. x86-64 with 20 GB of physical memory.

QUESTION KERN-1E. Which of these two machines would support a higher maximum number of threads per process?

1. x86-32 with PAE with 100 GB of physical memory.
2. x86-64 with 20 GB of physical memory.

## KERN-3. Kernel programming

WeensyOS processes are quite isolated: the only way they can communicate is by using the console. Let’s design some system calls that will allow processes to explicitly share pages of memory. Then the processes can communicate by writing and reading the shared memory region. Here are two new WeensyOS system calls that allow minimal page sharing; they return 0 on success and –1 on error.

• int share(pid_t p, void* addr)
Allow process p to access the page at address addr.
• int attach(pid_t p, void* remote_addr, void* local_addr)
Access the page in process p’s address space at address remote_addr. That physical page is added to the calling process’s address space at address local_addr, replacing any page that was previously mapped there. It is an error if p has not shared the page at remote_addr with the calling process.

Here’s an initial implementation of these system calls, written as clauses in the WeensyOS kernel’s syscall function.

case SYSCALL_SHARE: {
pid_t p = current->regs.reg_rdi;

/* [A] */

int shindex = current->nshared;
if (shindex >= MAX_NSHARED) {
return -1;
}

/* [B] */

++current->nshared;
current->shared[shindex].sh_partner = p;
return 0;
}

case SYSCALL_ATTACH: {
pid_t p = current->regs.reg_rdi;

/* [C] */

int shindex = -1;
for (int i = 0; i < processes[p].nshared; ++i) {
&& processes[p].shared[i].sh_partner == current->p_pid) {
shindex = i;
}
}
if (shindex == -1) {
return -1;
}

/* [D] */

/* [E] */

vmiter(current->pagetable, local_addr).map(it.pa(), PTE_P | PTE_W | PTE_U);

/* [F] */

return 0;
}


Some notes:

• The implementation stores sharing records in an array. A process may call share successfully at most MAX_NSHARED times. After that, its future share calls will return an error.
• processes[p].nshared is initialized to 0 for all processes.
• Assume that WeensyOS has been implemented as in Problem Set 3 up through step 7 (shared read-only memory).

QUESTION KERN-3A. True or false: Given this implementation, a single WeensyOS process can cause the kernel to crash simply by calling share one or more times (with no process ever calling attach). If true, give an example of a call or calls that would likely crash the kernel.

QUESTION KERN-3B. True or false: Given this implementation, a single WeensyOS process can cause the kernel to crash simply by calling attach one or more times (with no process ever calling share). If true, give an example of a call or calls that would likely crash the kernel.

QUESTION KERN-3C. True or false: Given this implementation, WeensyOS processes 2 and 3 could work together to obtain write access to the kernel code located at address KERNEL_START_ADDR. If true, give an example of calls that would obtain this access.

QUESTION KERN-3D. True or false: Given this implementation, WeensyOS processes 2 and 3 could work together to obtain write access to any memory, without crashing or modifying kernel code or data. If true, give an example of calls that would obtain access to a page mapped at address 0x110000 in process 5.

QUESTION KERN-3E. True or false: Given this implementation, WeensyOS child processes 2 and 3 could work together to modify the code run by a their shared parent, process 1, without crashing or modifying kernel code or data. If true, give an example of calls that would obtain write access to process 1’s code, which is mapped at address PROC_START_ADDR.

QUESTION KERN-3F. Every “true” answer to the preceding questions is a bug in WeensyOS’s process isolation. Fix these bugs. Write code snippets that address these problems, and say where they go in the WeensyOS code (for instance, you could refer to bracketed letters to place your snippets); or for partial credit describe what your code should do.

## KERN-8. More virtual memory

QUESTION KERN-8A. What kind of address is stored in x86-64 register %cr3, virtual or physical?

QUESTION KERN-8B. What kind of address is stored in x86-64 register %rip, virtual or physical?

QUESTION KERN-8C. What kind of address is stored in an x86-64 page table entry, virtual or physical?

QUESTION KERN-8D. What is the x86-64 word size in bits?

Many paged-virtual-memory architectures can be characterized in terms of the PLX constants:

• P = the length of the page offset, in bits.
• L = the number of different page indexes (equivalently, the number of page table levels).
• X = the length of each page index, in bits.

QUESTION KERN-8E. What are the numeric values for P, L, and X for x86-64?

Assume for the remaining parts that, as in x86-64, each page table page fits within a single page, and each page table entry holds an address and some flags, including a Present flag.

QUESTION KERN-8F. Write a PLX formula for the number of bytes per page, using both mathematical and C notation.

Mathematical notation: _____________________

C notation: _____________________________

QUESTION KERN-8G. Write a PLX formula for the number of meaningful bits in a virtual address.

QUESTION KERN-8H. Write a PLX formula that is an upper bound on the number of bits in a physical address. (Your upper bound should be relatively tight; PX100L is a bad answer.)

QUESTION KERN-8I. Write a PLX formula for the minimum number of pages it would take to store a page table that allows access to 2X distinct destination physical pages.

## KERN-14. A sample page table

QUESTION KERN-14A. Describe contents of x86-64 physical memory that would ensure virtual addresses 0x10'0ff3 through 0x10'11f2 map to a contiguous range of 512 physical addresses 0x9'9999'9ff3 through 0x9'9999'a1f2. We’ve given you the contents of 2 words of physical memory; you should list more. Assume and/or recall:

• The %cr3 register holds the value 0x20000.
• You may change any part of memory.
• (PTE_P | PTE_U | PTE_W) == 0x7.