Problem set 3: WeensyOS

In this assignment, you will add new features to a tiny operating system! In particular, you will implement process memory isolation, virtual memory, and some system calls.

You may want to read Chapter 9 of the textbook. Specifically, the 64-bit x86 virtual memory architecture is described in Section 9.7; the PTE_P, PTE_W, and PTE_U bits are shown in Figure 9.23 and discussed in Section 9.7.1.

Get the code

Get our code with

$ git pull; git pull handout main

or, alternately

$ git pull; git pull https://github.com/cs61/cs61-f24-psets.git main

This will merge our Problem Set 3 code with your previous work. If you have any “conflicts” from prior problem sets, resolve them before continuing further. Run git push to save your work back to your personal repository.

You may also create a new cs61-psets repository for this assignment. Don’t forget to enter your repository URL on the grading server.

Running WeensyOS

To launch WeensyOS, open a Docker container with ./cs61-run-docker, then cd pset3; make run. You should see something like this, which shows four related p-allocator processes running in parallel:

Initial WeensyOS state

The image above loops forever; in an actual run on your real computer, the bars will move to the right and stay there. Don’t worry if your image has different numbers of K’s or otherwise has different details. (If your bars run painfully slowly, edit the p-allocator.cc file and reduce the ALLOC_SLOWDOWN constant.)

The WeensyOS documentation has more information on how to build WeensyOS and how to run it in different modes, including modes that output more debugging information. Some highlights:

Initial state

Let’s take a look at the code in p-allocator.cc and understand what it does, and why it results in the above display.

Here are two labeled memory diagrams, showing what the characters mean and how memory is arranged.

Physical memory map 1

Physical memory map 2

The virtual memory display is similar.

Goal

You will implement complete and correct memory isolation for WeensyOS processes. Then you'll implement full virtual memory, which will improve utilization of physical RAM. You'll also implement fork—creating new processes at runtime—and exit—destroying processes at runtime.

We need to provide a lot of support code for this assignment, but the code you write will all go in kernel.cc.

There is no make check functionality for this pset. Instead, you should run your instance of WeensyOS and visually compare it to the images in the pset description.

Notes

Memory system layout

WeensyOS memory system layout is described by several constants.

KERNEL_START_ADDR Start of kernel code.
KERNEL_STACK_TOP Top of kernel stack. The kernel stack is one page long.
CONSOLE_ADDR CGA console memory.
PROC_START_ADDR Start of application code. Applications should not be able to access memory below PROC_START_ADDR, except for the single page at console.
MEMSIZE_PHYSICAL Size of physical memory in bytes. WeensyOS does not support physical addresses ≥ MEMSIZE_PHYSICAL. Equals 0x200000 (2MiB).
MEMSIZE_VIRTUAL Size of virtual memory. WeensyOS does not support virtual addresses ≥ MEMSIZE_VIRTUAL. Equals 0x300000 (3MiB).
PAGESIZE Size of a memory page. Equals 4096 (or, equivalently, 1 << 12).
PAGEOFFMASK Mask for the offset portion of an address. Equals 4095 (PAGESIZE - 1). If (a & PAGEOFFMASK) == 0, then a is page-aligned.

Kernel and process address spaces

WeensyOS begins with the kernel and all processes sharing a single address space. This is defined by the kernel_pagetable page table. kernel_pagetable is initialized to the identity mapping: virtual address X maps to physical address X.

As you work through the pset, you will shift processes to use independent address spaces, where each process can access only a subset of physical memory.

The kernel, though, still needs the ability to access all of physical memory. Therefore, all kernel functions run using the kernel_pagetable page table. Thus, in kernel functions, each virtual address maps to the physical address with the same number. The exception_entry and syscall_entry assembly codes explicitly install kernel_pagetable when they begin, and exception_return and the syscall return path install the process’s page table as they exit.

Each process page table must contain kernel mappings for the kernel stack and for the exception_entry and syscall_entry code paths. [Why? What would the CPU do if the kernel stack, the exception_entry code, and the syscall_entry code were unmapped when a system call or exception occurred?]

Debugging WeensyOS

There are several ways to debug WeensyOS. We recommend:

Understanding memory errors

You may want to skip this section until you have completed the first few parts of the pset.

The WeensyOS memory viewer, which is defined in k-memviewer.cc, checks your memory management data structures and reports any problems it sees. The memusage::symbol_at() function chooses which symbol to display for each page and detects some errors. Here’s what some of those symbols and error messages mean.

The WeensyOS exception handler will also print messages on page faults, which indicate that process memory accesses went wrong.

Phase 1: Kernel isolation

WeensyOS processes could stomp all over the kernel’s memory if they wanted to. Better stop that! Change kernel_start, the kernel initialization function, so that kernel memory is inaccessible to applications—except for the memory holding the CGA console (the single page at CONSOLE_ADDR == 0xB8000).

When you are done, WeensyOS should look like this. In the virtual map, kernel memory is no longer reverse-video, since the user can’t access it. Note the lonely CGA console memory block.

Kernel isolation

Use vmiter to create memory mappings. Start from the vmiter loop in the kernel_start function.

About virtual memory iterators (vmiter)

In addition, make sure that your sys_page_alloc system call preserves kernel isolation: Applications shouldn’t be able to use sys_page_alloc to screw up the kernel. This requires changes to the SYSCALL_PAGE_ALLOC case in syscall. Read the description of sys_page_alloc in u-lib.hh to get a feeling for the possible errors.

Phase 2: Isolated address spaces

Implement process isolation by giving each process its own independent page table. Your OS should look like this:

Per-process isolation

Each process only has permission to access its own pages, which you can tell because only its own pages are shown in reverse video.

How to implement per-process page tables in process_setup:

Note the diagram now has four pages for each process in the kernel area, starting at 0x1000. These are the four-level page tables for each process. (The colored background indicates that these pages contain kernel-private page table data, even though the pages “belong” to the process.) The first page was allocated explicitly in process_setup; the other pages were allocated by vmiter::try_map as the page table was initialized.

One common solution, shown above, leaves addresses above PROC_START_ADDR totally unmapped by default, but other designs work too. As long as a virtual address mapping has no PTE_U bit, its process isolation properties are unchanged. For instance, this solution, in which all mappings are present but accessible only to the kernel, also implements process isolation correctly:

Per-process isolation, alternate

If you create an incorrect page table, WeensyOS might crazily reboot. Don’t panic; see the debugging hints above.

About program images and segments (pgm and seg)

Phase 3: General process address spaces

So far, WeensyOS processes use identity mappings for process memory: a process code, data, stack, or heap page with virtual address X is stored in the physical page with physical address X. This is inflexible and limits utilization. Processes don’t have access to the address mapping (only the kernel does), so it should be fine for a process’s virtual address X to map to a different physical page—the process won’t be able to tell the difference. This will also enable new functionality, like running different processes with similar virtual address spaces.

Change your operating system to allocate all process data, including its code, globals, stack, and heap, using kalloc instead of direct access to the physpages array. This will turn the process page tables from subsets of an identity mapping to a more general mapping, where lots of pages have different virtual and physical addresses.

Here’s how your OS should look after this phase.

Virtual page allocation

This will complicate the code that initializes process code in process_setup. You’ll need to figure out why (hint: which page table is being used in process_setup?) and find a way around it (hint: vmiter or set_pagetable).

Phase 4: Nonsequential physical page allocation

In the handout code, kalloc always chooses the available physical page that has the lowest address. This means consecutive kalloc calls typically return consecutive physical pages. But your code should not depend on this behavior.

Change kalloc to allocate pages nonsequentially. This can be as simple as setting page_increment to 3 (or any larger odd number). The virtual address spaces should work as before, though the physical memory map will look different:

Virtual page allocation

But if you see a panic—like, maybe,

PAGE FAULT on 0xcccccccccccccccc (pid 3, read missing page, rip=0xcccccccccccccccc)!

you have a problem. Go back and think again about how to copy instructions and data from program segments into process memory.

Once you’re confident in your phase 4 solution, you can go back to setting page_increment = 1, but your code should work with any page allocation strategy.

Phase 5: Overlapping address spaces

Now the processes are isolated, which is awesome, but they’re still not taking full advantage of virtual memory. Since they are isolated, they can use the same address ranges without conflicting on the underlying data.

In this phase, change each process’s stack to grow down from address 0x300000 == MEMSIZE_VIRTUAL. Now the processes have enough heap room to use up all of physical memory!

Overlapping address spaces

If there’s no physical memory available, sys_page_alloc should return an error code to the calling process, such as -1. Do not kill the calling process! Lack of memory is a potentially recoverable problem.

Phase 6: Fork

The fork system call starts a new process as a copy of an existing process. The process that calls fork is called the parent process, and the newly created copy is called the child process. The system call appears to return twice: it returns 0 to the child, and returns the child’s process ID to the parent.

Run WeensyOS with make run-fork. Alternately, at any time, press the ‘f’ key; this will soft-reboot WeensyOS and ask it to run a single p-fork process, rather than the gang of allocators. You should see something like this:

Initial fork state

This is because you haven’t implemented fork yet.

Implement fork.

When you’re done, you should see something like this after make run-fork (or pressing ‘f’).

Fork works

An image like this means you forgot to copy the data for some pages, so the processes are actually sharing stack and/or data pages:

Incorrect fork

Phase 7: Freeing memory

So far, none of your test programs have ever freed memory or exited. Memory allocation’s pretty easy when there’s no free! So let’s add free by allowing applications to exit.

In this phase, you’ll implement the sys_exit system call, which exits the current process. This is a capstone since freeing memory will tend to expose weaknesses and problems in your other code.

To test your work, use make run-exit. Alternately, you can press ‘e’ at any time, which his reboots WeensyOS to run the p-exit program. (Initially it’ll crash because sys_exit() isn’t implemented yet.) p-exit processes alternate among forking new children, allocating memory, and exiting. The result is that once your code is correct, p-exit makes crazy patterns forever, like this:

Fork with exit

A fully correct OS can run p-exit indefinitely. In an OS with a memory leak, the physical memory map will collect L pages (L for leak), persistent blank spots, or both, and if run long enough, these pages will take over the screen. Here’s an example; note the Ls accumulating at the end of the physical memory map:

Fork with exit and slow leak

Reducing ALLOC_SLOWDOWN in p-exit may encourage errors to manifest, but you may need to be patient.

Here’s your task.

  1. Complete kfree so that kfree frees memory. Make sure that kalloc can re-use freed memory.

  2. sys_exit should mark a process as free and free all of its memory. This includes the process’s code, data, heap, and stack pages, as well as the pages used for its page directory and page table pages. The memory should become available again for future allocations.

    Use vmiter and ptiter to enumerate the relevant pages. But be careful not to free the console.

  3. You may also need to change fork and perhaps other places. In p-exit, unlike in previous parts of the pset, sys_fork and sys_page_alloc might be called when there’s not enough memory to create a new process or allocate or map a page. Your code should handle this cleanly, without leaking memory in any situation.

    If a sys_page_alloc or sys_fork system call runs out of memory during its operation:

    • The system call must clean up after itself. Any kernel memory or other resources (e.g., process slots) allocated during the system call must be freed.

    • The system call must return -1 to the caller.

    Note that out-of-memory events in system calls must not panic the kernel, and they must not cause the calling process to exit. It is not an error when a system call fails! Instead the error return value is passed to the process to handle as it prefers.

There should be no memory leaks!

About physical memory iterators (ptiter)

Phase 8: Shared read-only memory

It’s wasteful for fork() to copy all of a process’s memory. For example, most processes, including p-fork, never change their code. So what if we shared the memory containing the code? That’d be fine for process isolation, as long as neither process could write the code.

Change process_setup to map read-only program segments using read-only memory. Use the seg.writable() function to detect whether a program segment can use read-only memory. Then, in fork, share read-only pages between processes rather than copying them; and, in exit, ensure that shared pages are correctly accounted for when a process exits.

Use vmiter::writable() to detect whether a page is read-only. Do not use an equality test like it.perm() == (PTE_P | PTE_U)! vmiter::perm() values often include bits like PTE_A and PTE_D that are automatically set by processor hardware, so equality tests on it.perm() will usually fail.

When you’re done, running p-exit should look like this:

Shared read-only memory

Each process’s virtual address space begins with an “S”, indicating that the corresponding physical page is Shared by multiple processes.

Extra credit

If you are finished and can't wait to do more of this type of work, try:

Turnin

You will turn in your code by pushing your git repository to github.com/cs61/YOUR-PSET_REPO.git and updating the grading server.