Process isolation and virtual memory
We don’t want processes to be able to trample over the memory of other processes, as discussed in class. So, we need process isolation. Virtual memory is a mechanism for the OS to provide process isolation. Today’s section will be focused on diving deeper into virtual memory.
When you print a pointer in C program, you see the pointer’s virtual address in that process’s address space. The actual place where the underlying bytes live is in physical memory, at an address only known to the OS. This address can be determined using the process’s page table (which maps virtual addresses to physical addresses). This page table is set up by the operating system kernel, and interpreted by the processor hardware.
Multi-level page tables
Many processors, including x86-64, use multi-level page tables to
translate virtual addresses to physical addresses. A multi-level page table is
structured in a tree format. The root of the tree is configured via a register
(%cr3
on x86-64); the root’s entries point to lower-level page tables. The
virtual address being looked up provides the index of the entry to follow at
each level of the pagetable. We are going to walk through how the pagetable
structure can be determined from the page size and address size.
Example: x86-32
On 32-bit x86 architecture (virtual addresses are 32 bits long), each page is 4096 = 212 bytes. Each physical page contains 212 separate addresses, and the architecture supports up to 220 = 232 / 212 virtual pages.
QUESTION: What is the size of the offset for an x86-32 address?
12 bits.
QUESTION: How many 32-bit addresses can fit in an x86-32 page?
212 / 4 = 210 = 1024.
QUESTION: Assume that x86-32 used a single-level page table (like the page table from the Kernel 2 lecture with 6-bit addresses). Such a page table is indexed by the top portion of the address—everything but the offset. That means it must occupy enough contiguous physical memory to account for every possible index. How big would the single-level page table be?
232 addresses / 212 addresses/page * 22 bytes/page table entry = 222 bytes of page table.
x86-32 does not use a single-level page table. It uses a two-level page table. The x86-32 address divides into three sections: level-2 index, level-1 index, and offset.
bits 31-22 | bits 21-12 | bits 11-0 |
---|---|---|
Level 2 index | Level 1 index | Offset |
The address of the level-2 page table is stored in the x86-32 %cr3
register.
This page table is indexed by level 2 index to find the physical address of
a level-1 page table; there can be many of these, one per level-2 index. The
level-1 page table is indexed by level 1 index to find the physical address of
the final physical page.
QUESTION: Why is 10 bits per level the correct number?
10 bits = 210 entries. 210 entries * 4 bytes/entry = 212 bytes = the number of bytes on a page. This division means that page tables naturally break into units of single pages, which makes page management convenient.
QUESTION: What is the minimum number of page table pages necessary to provide access to 220 different bytes of physical memory?
220 bytes = 28 physical pages. A single level-1 page table can point to 210 different pages, so we only need one level-1 page table page. We always need a root: the level-2 page table page. So the answer would seem to be 2.
But if we want to be very clever, we can actually reuse the level-2 page table page as a level-1 page table page! So the true answer is 1.
For instance, consider a level-2 page table page, located at physical address 0x1000, whose index-0 entry contained physical address 0x1000!
(This kind of reuse is occasionally useful for special purposes, but honestly it’s more of a party trick.)
QUESTION: What is the minimum number of page tables necessary to provide access to 220 different pages of physical memory?
220 pages requires 210 different level-1 page tables. So a total of 210 + 1 pages are required.
Moving to x86-64
x86-64 uses a similar structure, but each address consists of 48 meaningful bits (stored in a 64-bit, or 8-byte, number). This is so many bits that more levels of page table are required. x86-64 has a four-level page table, with addresses divided as follows:
bits 63-48 | bits 47-39 | bits 38-30 | bits 29-21 | bits 20-12 | bits 11-0 |
---|---|---|---|---|---|
(ignored) | Level 4 index | Level 3 index | Level 2 index | Level 1 index | Offset |
Try working through the above problems for x86-64 (probably on your own time). What numbers do you get?
QUESTION: What is the maximum number of page table pages in each level? What is the maximum number of physical pages that are accessible with 4 levels of the pagetable? Include both physical pages corresponding to virtual memory and pagetable pages.
There is 1 level-4 pagetable. There are at most 512 = 2^9 level-3 pagetable pages. There are most 512 * 512 = 2^18 level-2 page table pages. There are at most 512 * 512 * 512 = 2^27 level-1 pagetable pages. Each level-1 pagetable entry points to a physical page, so there are at most 512 * 512 * 512 * 512 = 2^36 physical pages = 2^48 bytes pointed to by virtual memory. There are also 1 + 2^9 + 2^18 + 2^27 pagetable pages.
QUESTION: If the upper 16 bits of x86-64 addresses were meaningful, how many levels would the page table require?
Six: there 16 more bits, and a level can hold at most 9 bits’ worth of index.
QUESTION: What is the minimum number of physical pages required on x86-64 to allocate the following allocations? Draw an example pagetable mapping for each scenario (start from scratch each time).
- 1 byte of memory
- 1 allocation of size 2^12 bytes of memory
- 2^9 allocations of size of 2^12 bytes of memory each
- 2^9 + 1 allocations of size of 2^12 bytes of memory each
- 2^18 + 1 allocations of size 2^12 bytes of memory each
- 5 pages (1 per level plus the destination physical page)
- Same as part (a)
- 2^9 + 4 pages: 1 L4, 1 L3, 1 L2, and 1 L1 pages, plus 2^9 physical pages
- 2^9 + 6 pages: 1 L4, 1 L3, 1 L2, 2 L1 pages, plus 2^9 + 1 physical pages
- 2^18 + 518 pages: 1 L4, 1 L3, 2 L2, 513 L1 pages plus 2^18 + 1 physical pages
WeensyOS
The pset for this unit asks you to add to WeensyOS, an operating system that we’ll use in this course. WeensyOS has internal interfaces for dealing with virtual memory that we’ll now discuss.
vmiter
and ptiter
are iterators in WeensyOS for processing page tables.
Vmiter helps iterate through pages in virtual memory, and lets you check the
underlying physical pages for properties like permissions and physical
addresses. Ptiter helps you iterate through page table pages. Let’s walk
through the code:
Vmiter
The vmiter
class represents an iterator over virtual memory mappings. Here
is its interface:
// `vmiter` walks over virtual address mappings.
// `pa()` and `perm()` read current addresses and permissions;
// `map()` installs new mappings.
class vmiter {
public:
// initialize a `vmiter` for `pt` pointing at `va`
inline vmiter(x86_64_pagetable* pt, uintptr_t va = 0);
inline vmiter(const proc* p, uintptr_t va = 0);
inline uintptr_t va() const; // current virtual address
inline uint64_t pa() const; // current physical address
inline uint64_t perm() const; // current permissions
inline bool present() const; // is va present?
inline bool writable() const; // is va writable?
inline bool user() const; // is va user-accessible (unprivileged)?
inline vmiter& find(uintptr_t va); // change virtual address to `va`
inline vmiter& operator+=(intptr_t delta); // advance `va` by `delta`
// map current va to `pa` with permissions `perm`
// Current va must be page-aligned. Calls kallocpage() to allocate
// page table pages if necessary. Returns 0 on success,
// negative on failure.
int map(uintptr_t pa, int perm = PTE_P | PTE_W | PTE_U)
__attribute__((warn_unused_result));
};
This type parses x86-64 page tables and manages virtual-to-physical address
mappings. vmiter(pt, va)
creates a vmiter
object that’s examining virtual
address va
in page table pt
. Methods on this object can return the
corresponding physical address:
x86_64_pagetable* pt = ...;
uintptr_t pa = vmiter(pt, va).pa(); // returns uintptr_t(-1) if unmapped
Or the permissions:
if (vmiter(pt, va).writable()) {
// then `va` is present and writable (PTE_P | PTE_W) in `pt`
}
It’s also possible to use vmiter
as a loop variable, calling methods
that query its state and methods that change its current virtual address. For
example, this loop prints all present mappings in the lower 64KiB of memory:
for (vmiter it(pt, 0); it.va() < 0x10000; it += PAGESIZE) {
if (it.present()) {
log_printf("%p maps to %p\n", it.va(), it.pa());
}
}
This loop goes one page at a time (the it += PAGESIZE
expression increases
it.va()
by PAGESIZE
).
The vmiter.map()
function is used to add mappings to a page table. This
maps physical page 0x3000 at virtual address 0x2000:
int r = vmiter(pt, 0x2000).map(0x3000, PTE_P | PTE_W | PTE_U);
// r == 0 on success, r < 0 on failure
Ptiter
class ptiter {
public:
// initialize a `ptiter` for `pt` pointing at `va`
inline ptiter(x86_64_pagetable* pt, uintptr_t va = 0);
inline ptiter(const proc* p, uintptr_t va = 0);
inline uintptr_t va() const; // current virtual address
inline uintptr_t last_va() const; // one past last va covered by ptp
inline bool active() const; // does va exist?
inline int level() const; // current level (0-2)
inline x86_64_pagetable* ptp() const; // current page table page
inline uintptr_t ptp_pa() const; // physical address of ptp
// move to next page table page in depth-first order
inline void next();
};
ptiter
visits the individual page table pages in a multi-level page table,
in depth-first order (so all level-1 page tables under a level-2 page table
are visited before the level-2 is visited). A ptiter
loop makes it easy to
find all the page table pages owned by a process, which is usually at least 4
page tables in x86-64 (one per level).
for (ptiter it(pt, 0); it.va() < MEMSIZE_VIRTUAL; it.next()) {
log_printf("[%p, %p): ptp at va %p, pa %p\n",
it.va(), it.end_va(), it.ptp(), it.ptp_pa());
}
A WeensyOS process might print the following:
[0x0, 0x200000): ptp at va 0xb000, pa 0xb000
[0x200000, 0x400000): ptp at va 0xe000, pa 0xe000
[0x0, 0x40000000): ptp at va 0xa000, pa 0xa000
[0x0, 0x8000000000): ptp at va 0x9000, pa 0x9000
Note the depth-first order: the level-1 page table pages are visited first,
then level-2, then level-3. Because of this order (and other implementation
choices), a ptiter
loop may be used to free a page table—for instance, when
you’re implementing process exit later :)
ptiter
never visits the top level-4 page table page, so that will need to be
freed independently of the ptiter
loop.
Using iterators
On this problem set, you will implement some of the
system calls in WeensyOS. Two of the main system calls that you will
implement are fork()
and exit()
.
The fork system call allows a process to spawn a child process that is essentially a copy of the parent process. The child process has a copy of the parent processes’ memory, and after the new child process created, both processes execute the next instruction after fork(). The registers and file descriptors are also copied.
The exit system call allows a process to stop its execution. All of the memory belonging to that process, including its page table pages, are returned to the kernel for reuse.
The Lifecycle of a Process
A process: the singular unit of life in an operating system. When a new process is born, it needs its own page table pages, which it gets from its parent. For processes not born from a
fork()
call, this parent is the kernel, and thus, the process's page table pages will be a copy of the kernel's.During the normal execution of a process, it refers to memory addresses by their virtual memory addresses -- which means that, to actually access the underlying memory, those virtual memory addresses need to be translated into physical addresses. Fortunately, you have just learned about the basic mechanism behind this process. ;)
Once a process has completed its run, there is some cleanup to be done -- namely, the relinquishment of its virtual memory mappings. First, the process needs to disassociate itself from the kernel's page table pages -- an
exit
ed process shouldn't need them anymore, anyway. Next, the process needs to tell the kernel that it is no longer using its page table pages -- this is not just the level-1 page table pages, but also the higher-level ones as well! Finally, the process can rest in peace. RIP.
QUESTION: Use vmiter
to implement the following function.
void copy_mappings(x86_64_pagetable* dst, x86_64_pagetable* src) {
// Copy all virtual memory mappings from `src` into `dst`
// for addresses in the range [0, MEMSIZE_VIRTUAL).
// You may assume that `dst` starts out empty (has no mappings).
// After this function completes, for any va with
// 0 <= va < MEMSIZE_VIRTUAL, dst and src should map that va
// to the same pa with the same permissions.
}
QUESTION: Use vmiter
and ptiter
to implement the following function.
void free_everything(x86_64_pagetable* pt) {
// Free the following pages by passing their physical addresses
// to `kfree_pa()`:
// 1. All memory accessible by unprivileged mappings in `pt`.
// 2. All page table pages that are part of `pt`.
}