Contents
- Kernel and processes
- Resource: Processor time
- Current privilege level
- Memory-mapped I/O
- Protected control transfer
Kernel and processes
The kernel is the operating system software that runs with full machine privilege: it can access and control every machine resource.
Processes, in contrast, are software that runs without full machine privilege. A process is a program in execution. The difference between a program and a process is like the difference between a recipe and a cake: the program is a dead list of instructions (a file on disk), while a process is a live instance of that program, running at a particular time, on a particular piece of hardware, dealing with a particular set of inputs.
(Processes are often called unprivileged processes or user-level processes, to emphasize their unprivileged status. “User level” is the opposite of “kernel”.)
The kernel’s purpose is to serve the needs of processes as a whole. Unfortunately, processes can have bugs. A process can crash, or enter an infinite loop, or attempt to take over the machine, maliciously or accidentally. So kernels should prevent mistakes in individual processes from bringing down the system as a whole.
In modern operating systems, much kernel code aims to provide protection: ensuring that no process can violate the operating system’s sharing policies.
Kernels balance three goals:
- Fairly share machine resources among processes.
- Provide safe and convenient access to machine resources by inventing abstractions for those resources (such as files, which abstract disks).
- Ensure robustness and performance.
Kernels can achieve these goals only with help from hardware. A running process executes on a processor; that processor executes the process’s instructions, one after another, as fast as possible. The kernel is not emulating the processor—that would be very slow. Instead, the processor truly runs the process’s instructions: the process controls the processor. If done naively, this control would violate protection. So processors offer special mechanisms, accessible only to privileged code (the kernel), that ensure that the kernel can always assert control over every resource.
Resource: Processor time
One of the most fundamental machine resources is processor time (or CPU time): the fraction of time the processor spends executing one process’s instructions rather than another’s. The kernel aims to share processor time according to its policy.
Here’s a fundamental attack on fair sharing of processor time. It’s the worst attack in the world:
    int main() {
        while (true) {
        }
    }
An infinite loop. Compiled to x86-64 instructions, this might be:
    00000000000005fa <main>:
     5fa:   55                      push   %rbp
     5fb:   48 89 e5                mov    %rsp,%rbp
     5fe:   eb fe                   jmp    5fe <main+0x4>
The critical instruction is jmp 5fe, represented in bytes as eb fe, which spins the processor in a tight loop forever.
Aside. Why is this loop represented as 0xeb 0xfe? An instruction consists of an opcode (e.g., “push”, “mov”, “pop”) and some operands (e.g., “%rbp”, “5fe”). Here, the 0xeb part is the opcode. This opcode means “unconditional branch (jmp) by a relative one-byte offset”: when the instruction is executed, the %rip register will be modified by adding to it the signed offset stored as an operand. Here, that operand is 0xfe, which, considered as a signed 8-bit number, is -2. Remember that when an instruction executes, the initial value of %rip is always the address of the next instruction (because the processor must read the entire current instruction before executing it). Thus, adding -2 to %rip will reset %rip back to the start of the jmp.
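The arithmetic in this aside can be checked mechanically. The following is a minimal sketch, assuming a two-byte short jump whose opcode byte is 0xeb; the helper name short_jmp_target is hypothetical and does not appear in the notes. It computes where such a jump lands from the instruction's address and its operand byte.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original notes): computes the target of
// a two-byte short jump `eb XX` whose first byte is at `insn_addr`.
constexpr uint64_t short_jmp_target(uint64_t insn_addr, uint8_t offset_byte) {
    // While the jump executes, %rip already holds the address of the next
    // instruction, which is 2 bytes past the start of the jump.
    uint64_t next_rip = insn_addr + 2;
    // Reinterpret the operand byte as a signed 8-bit number (0xfe becomes -2).
    int8_t offset = static_cast<int8_t>(offset_byte);
    // Adding the sign-extended offset to %rip gives the jump target.
    return next_rip + static_cast<int64_t>(offset);
}

// The loop from the disassembly: `eb fe` at address 0x5fe jumps to itself.
static_assert(short_jmp_target(0x5fe, 0xfe) == 0x5fe, "eb fe is a self-jump");
// By contrast, an offset of 0 would fall through to the next instruction.
static_assert(short_jmp_target(0x5fe, 0x00) == 0x600, "eb 00 falls through");
```

The static_assert lines are evaluated at compile time, so the file compiling at all confirms that 0x600 + (-2) is 0x5fe.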
Processors generally execute the instructions they’re given in a simple-minded, straightforward way. If a processor starts executing an infinite loop, how will any other instruction ever run?
We need a way to limit the time that any single process can run on the CPU. After that time elapses, the processor should interrupt its execution and switch to the kernel, giving the kernel a chance to run something else.
Machines accomplish this with a separate piece of hardware called the timer. The kernel can configure this timer to go off periodically in real time, such as once every millisecond. When the timer goes off, it sends an interrupt to the processor, which causes the processor to switch to the kernel, giving the kernel the chance to run something else.
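To see why a periodic timer is enough to defeat the infinite loop, it can help to simulate one. The sketch below is a toy model, not real kernel code: time is counted in abstract ticks, and ToyProcess, pick_next, and simulate are hypothetical names invented for this illustration. Round-robin is assumed as the kernel's policy only because it is the simplest one to write down.

```cpp
#include <cstddef>
#include <vector>

// Toy model, not real kernel code. All names here are hypothetical.
struct ToyProcess {
    bool loops_forever;      // true models `while (true) {}`
    unsigned long work_left; // instructions a finite process still needs
    unsigned long ticks_run; // processor time received so far
};

bool runnable(const ToyProcess& p) {
    return p.loops_forever || p.work_left > 0;
}

// The "kernel" policy: choose the next runnable process after `current`,
// wrapping around (round-robin). Returns `current` if nothing is runnable.
std::size_t pick_next(const std::vector<ToyProcess>& procs, std::size_t current) {
    for (std::size_t i = 1; i <= procs.size(); ++i) {
        std::size_t candidate = (current + i) % procs.size();
        if (runnable(procs[candidate])) return candidate;
    }
    return current;
}

// Runs the toy machine for `total_ticks` ticks. The simulated timer goes off
// every `timer_period` ticks; only then (or when the current process has
// finished) does the "kernel" get to choose what runs next.
void simulate(std::vector<ToyProcess>& procs, unsigned long total_ticks,
              unsigned long timer_period) {
    if (procs.empty() || timer_period == 0) return;
    std::size_t current = 0;
    for (unsigned long tick = 1; tick <= total_ticks; ++tick) {
        ToyProcess& p = procs[current];
        if (runnable(p)) {
            ++p.ticks_run;                      // one instruction executes
            if (!p.loops_forever) --p.work_left;
        }
        bool timer_fired = (tick % timer_period == 0);
        if (timer_fired || !runnable(p)) current = pick_next(procs, current);
    }
}
```

For example, with a spinning process at index 0 and a process needing 30 ticks at index 1, calling simulate with 100 total ticks and a timer period of 10 lets the second process finish. With a timer period longer than the whole run, the timer never fires and the spinning process keeps every tick.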
Timer interrupts are an almost inevitable consequence of the problem of infinite loops. Many other aspects of timer interrupt implementation also follow logically from the problem timer interrupts aim to solve.
Any process that runs for too long must be interrupted by a timer. Therefore, processes must not be allowed to configure the timer: if they could, they could disable the timer or set it to go off once a year.
However, the kernel should be allowed to configure the timer. Every operating system wants to prevent processes from monopolizing CPU time, but different operating systems enforce very different policies in detailed terms. (Some processes might have priority over others, for example.)
Since the kernel, which is software, can configure the timer, but processes, which are also software, cannot, the processor must support different privilege modes, so that attempts to configure the timer can be distinguished.
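The reasoning in the last few paragraphs can be condensed into a small model. This is a sketch under stated assumptions, not a description of any real processor: Privilege, ToyMachine, and configure_timer are hypothetical names, and recording a fault in a flag merely stands in for whatever a real processor does when unprivileged code attempts a privileged action.

```cpp
// Toy model, not real hardware or kernel code. All names are hypothetical.
enum class Privilege { kernel, user };

struct ToyMachine {
    Privilege current;        // privilege mode of the code now running
    unsigned timer_period_ms; // how often the timer goes off
    bool fault_raised;        // set when unprivileged code tries a privileged action
};

// Models a privileged operation: only code running in kernel mode may change
// the timer. An unprivileged attempt leaves the timer untouched and records
// a fault instead.
bool configure_timer(ToyMachine& m, unsigned period_ms) {
    if (m.current != Privilege::kernel) {
        m.fault_raised = true;
        return false;
    }
    m.timer_period_ms = period_ms;
    return true;
}
```

The key point is that the same request succeeds or fails depending only on the machine's current privilege mode, which is what lets the hardware tell kernel attempts apart from process attempts.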
A timer interrupt can occur at any time during process execution. This doesn’t indicate a bug in the process—maybe the process is just executing a long-running task—so an interrupted process should be able to pick up exactly where it left off, with all of its registers restored to their original values.
This marks an important difference with function calls. In a function call, the caller voluntarily transfers control to another piece of software. Since the control transfer is voluntary, the caller can prepare for it and implement a calling convention, saving any important registers to the stack and restoring them later. But in an interrupt, the process involuntarily transfers control to the kernel. The process cannot fully prepare.
The processor’s and kernel’s interrupt handling mechanisms are carefully engineered to save all processor state, allowing the interrupted process to resume later as if nothing had occurred. This is an instance of protected control transfer.
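The save-everything requirement can be illustrated with a toy model. This is only a sketch: ToyRegisters, ToyCpu, toy_kernel_work, and toy_interrupt are hypothetical names, and the model collapses the division of labor between processor and kernel into a single copy of the whole register set. What it does show is the contract: whatever the kernel does in between, the interrupted process sees exactly the registers it had before.

```cpp
#include <array>
#include <cstdint>

// Toy model, not real kernel code. All names are hypothetical.
struct ToyRegisters {
    std::array<uint64_t, 16> gpr; // stand-ins for general-purpose registers
    uint64_t rip;                 // instruction pointer
};

struct ToyCpu {
    ToyRegisters regs;
};

bool same_state(const ToyRegisters& a, const ToyRegisters& b) {
    return a.gpr == b.gpr && a.rip == b.rip;
}

// Stand-in for kernel code, which freely overwrites every register.
void toy_kernel_work(ToyCpu& cpu) {
    cpu.regs.gpr.fill(0xdeadbeef);
    cpu.regs.rip = 0xffff800000000000;
}

// Models an interrupt. The process had no chance to save anything itself,
// so the handling path saves the complete register state before the kernel
// runs and restores all of it before the process resumes.
void toy_interrupt(ToyCpu& cpu) {
    ToyRegisters saved = cpu.regs; // save everything, not just some registers
    toy_kernel_work(cpu);          // kernel runs, clobbering the registers
    cpu.regs = saved;              // process resumes as if nothing happened
}
```

Contrast this with a function call, where the caller knows the call is coming and, under a calling convention, needs to preserve only the registers it still cares about.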