Processes 1: Basics

Overview

This lecture introduces the process control unit. We discuss the goals of process control and the basic system calls used to create and manage processes.


Processes

A process is a program in execution
Process isolation makes processes modular
- They behave mostly independently
- An error in one process won’t crash the system
- A performance problem in one process may not affect others

Process coordination

Modularity makes systems more reliable and capable
- Improvements to one component can improve the overall system
- New combinations of components can implement new functions
- Wheels + gear + pedal = bicycle—a whole greater than the sum of its parts
Let’s use multiple processes to accomplish complex tasks!
- Turn processes into components
- Take advantage of available resources: when one process is paused (e.g., accessing storage), run another
What tasks are involved in coordinating processes?
- Starting new processes
- Terminating running processes
- Monitoring processes for completion
- Communicating among processes
The prototypical program that coordinates other processes: the shell

Basic shell operation

Print a prompt
Wait for user to enter a command line
Execute the command line in a new process or processes
Wait for the command line processes to complete
Repeat

Simple shell commands

$ echo foo
foo
$ sleep 5

$ is the prompt (yours is probably longer)
echo foo and sleep 5 are simple command lines
sleep 5 behavior shows that the shell waits for a command to complete

Process control system calls

What sub-tasks are required for process coordination?
Create a process: fork
Run a different program: exec family
Terminate a process: _exit
Monitor completion: waitpid

Process = image + identity + environment view

Image: contents of primary memory and registers
- Code, data, stack, and heap
- Command line arguments (argc, argv)
- Directly managed by process
Identity: process names
- Process ID
- Process relationships (parent process ID)
- Ownership, timing, etc.
- Managed by kernel; process influence constrained by policy
Environment view: connections among processes and devices
- Open file descriptors, file positions
- Lives in kernel, managed by process using system calls
- Each process has its own view of the environment, but the underlying storage is shared

`fork` creates a new process

pid_t fork()
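- pid_t is a signed integer type (int on Linux)
Return value:
- 0, to the new process
- Process ID (pid) of new process, to the original process
- -1, to the original process, if fork fails (no new process is created)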
New process has:
- Cloned image
- New identity
- Cloned environment view

Process hierarchy

Every process has a parent process
- getpid system call: Return current process ID
- getppid system call: Return parent process ID
- fork creates a new child process
Root of process hierarchy is process with ID 1 (init)
- What happens if a parent process dies before its child?

`fork`: Which runs first?

forkorder.cc

The `uniq` utility

uniq searches for consecutive duplicate lines

Example 1
Example 2
Example 2
Example 3
Example 2

uniq: Print one line for each run of consecutive duplicates
uniq -c: Precede each line with the number of times it occurred in its run
uniq -u: Only print lines that are not repeated
uniq -d: Only print one copy of each repeated line

`execvp` runs a new program

int execvp(const char* programname, char* const argv[])
- argv should start with a program name and must end with nullptr
The new program replaces the current process’s image
After a successful exec, the process has:
- Fresh image (current image dropped, new image created for program)
- Unchanged identity
- Unchanged environment view
Returns -1 on error; otherwise does not return

`_exit` terminates this process

exit (= return from main)
- Process transcends the earthly sphere, leaving with its last breath a message for its parent (i.e. “terminates normally”)
- exit(STATUS) library function performs cleanup actions, such as flushing stdio files
- _exit(STATUS) system call exits without cleanup
The STATUS is the process’s ‘last words’
- Remembered by kernel
- Can be collected by the process’s parent
Status convention
- 0 (EXIT_SUCCESS) means success, non-zero (1-255) means failure

`waitpid` monitors a child process for completion

pid_t waitpid(pid_t pid, int* status, int options)
Many argument variants! We only need a simple version for now: pid > 0 && options == 0
Waits for child process pid to terminate
- After termination, status of exited process is stored in *status
- Macros WIFEXITED, WEXITSTATUS, etc. analyze status
- Returns pid on success, -1 on failure

`minishell.cc`

Question

What are the most complete assertions you can come up with that relate the p* variables? Assume fork does not fail.

pid_t p1 = getpid();
pid_t p2 = getppid();
pid_t p3 = fork();
pid_t p4 = getpid();
pid_t p5 = getppid();
assert(???);

Some answers

assert(p1 > 0 && p2 > 0 && p4 > 0 && p5 > 0): all process PIDs are >0
assert(p3 >= 0): fork did not fail (it’s >0 in parent, 0 in child)
assert(p1 != p2 && p4 != p5): a process’s PID ≠ its PPID
assert(p4 != p2): new PID ≠ original PPID
assert(p1 != p3 && p2 != p3): child PID ≠ parent or grandparent PID
assert(p3 != 0 ? p1 == p4 : p1 == p5)
- In parent (p3 != 0), original PID == new PID
- In child (p3 == 0), original PID == new PPID
assert(p3 != 0 ? p2 == p5 : p2 != p5)
- In parent (p3 != 0), original PPID == new PPID
- In child (p3 == 0), original PPID ≠ new PPID

Exit notification as a communication channel

Exiting allows a process to communicate a single byte’s worth of data to its parent!
How fast is this inter-process communication channel?
storage1/r-exitbyte 🤡🤪