Overview
In this lecture, we discuss the Unix analogue of kernel interrupts, namely signals, and race conditions. This is a bridge lecture to the synchronization unit.
The simple life
- Your program has exactly one task to perform
- If a subtask blocks, the program can wait for it indefinitely
- If a subtask aborts or receives inappropriate input, the program can abort
- Exits when done
- Examples: cat, sleep, wc, tr, echo, …
Multitasking
- Your program has multiple tasks to perform
- Manage parallel subtasks
- If a subtask blocks or aborts, the program may need to cancel that subtask or otherwise respond
- The program must handle inappropriate input
- May not exit when done
- Examples: sh, Web servers…
- But tools and techniques will help!
Race conditions
- A race condition is a bug that can manifest depending on timing
- (More generally, it’s any unexpected system behavior dependent on scheduling, but the term generally refers to bugs)
Example: Relay race
- Run echo and cat in parallel
- cat should read what echo wrote
Relay race attempt 1
$ /bin/echo Handoff | cat
Relay race attempt 2
$ /bin/echo Handoff > handoff.txt & cat handoff.txt
https://www.stabroeknews.com/2016/08/19/sports/u-s-grasp-second-chance-4x100-relay/
Explanation
- In /bin/echo Handoff > handoff.txt & cat handoff.txt, the two programs run in parallel
  - They attempt to synchronize using a file, handoff.txt
  - But handoff.txt can be read or written at any time
  - Maybe cat reads before echo writes!
- In /bin/echo Handoff | cat, the two programs run in parallel
  - They attempt to synchronize using a pipe
  - But reading from an empty pipe blocks the caller!
  - cat will always read what echo wrote
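The pipe behavior described above can be sketched in C++. This is a minimal sketch under stated assumptions, not code from the lecture: relay_via_pipe is a hypothetical helper in which a forked child plays the role of echo and the parent plays the role of cat. The child deliberately sleeps before writing, so the parent reaches read() first and blocks on the empty pipe.

```cpp
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child plays "echo", the parent plays "cat".
// Returns everything the parent read from the pipe ("" on setup failure).
std::string relay_via_pipe() {
    int pfd[2];
    if (pipe(pfd) != 0) {
        return "";
    }
    pid_t child = fork();
    if (child < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return "";
    }
    if (child == 0) {
        // Child: delay on purpose so the parent reaches read() first.
        close(pfd[0]);
        usleep(100000);
        const char msg[] = "Handoff\n";
        ssize_t w = write(pfd[1], msg, sizeof(msg) - 1);
        _exit(w == (ssize_t) (sizeof(msg) - 1) ? 0 : 1);
    }
    // Parent: close our copy of the write end so that read() reports
    // end-of-file once the child has exited.
    close(pfd[1]);
    std::string out;
    char buf[64];
    ssize_t n;
    while ((n = read(pfd[0], buf, sizeof(buf))) > 0) {  // blocks while the pipe is empty
        out.append(buf, n);
    }
    close(pfd[0]);
    waitpid(child, nullptr, 0);
    return out;
}
```

Even though the writer starts late, the reader should still see the whole message, because the blocking read() is what orders the two processes.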
Reasoning about race conditions
- Which actions can occur in parallel?
- Which actions block other actions?
- Waits-for graph
Signals
- A signal is the process control analogue for an operating system interrupt
- Used to represent events that might occur at unpredictable times and/or need to interrupt long-running computations
  - Control-C ⟶ SIGINT
  - kill -9 ⟶ SIGKILL
  - Null pointer dereference ⟶ SIGSEGV
  - A child process exited ⟶ SIGCHLD
Signal system calls
- sigaction establishes a signal handler
void handle_signal(int signal_number) {
// do something to handle the signal
}
...
struct sigaction sa;
sa.sa_handler = handle_signal; // or SIG_IGN or SIG_DFL
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sigaction(SIGINT, &sa, nullptr);
- kill(pid_t pid, int sig) sends a signal
  - Some signals are generated automatically
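To see sigaction and kill working together, here is a small sketch in which a process signals itself. The names got_signal, note_signal, and signal_self are hypothetical, chosen for illustration. The sketch assumes a single-threaded process with SIGUSR1 unblocked; in that case POSIX requires the signal to be delivered before kill returns.

```cpp
#include <signal.h>
#include <unistd.h>

// Set by the handler; sig_atomic_t can be written safely from a signal handler.
static volatile sig_atomic_t got_signal = 0;

static void note_signal(int) {
    got_signal = 1;
}

// Hypothetical helper: install a handler for SIGUSR1, send that signal to
// this process, and report whether the handler ran.
bool signal_self() {
    struct sigaction sa = {}, old_sa;
    sa.sa_handler = note_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old_sa) != 0) {
        return false;
    }
    got_signal = 0;
    kill(getpid(), SIGUSR1);               // assumption: single thread, SIGUSR1 unblocked
    bool ran = got_signal != 0;
    sigaction(SIGUSR1, &old_sa, nullptr);  // restore the previous disposition
    return ran;
}
```

The handler only sets a flag. Keeping handlers this small is a common design choice, since they can run between any two instructions of the main program.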
When can a signal be delivered?
- In between any two instructions in the program
- Also interrupts certain system calls
  - System call may return early (e.g., a short read)
  - System call may return having done no work: errno == EINTR
- man 7 signal for more
  - “Interruption of system calls…” for list of system calls that can return EINTR
  - Also documented on system call manual pages
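When an interruption carries no meaning for the caller, a common response to EINTR is to retry the call. The wrapper below is a minimal sketch of that idiom; read_retry is a hypothetical name, not a standard function.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical wrapper: behaves like read(), but tries again when a signal
// handler interrupted the call before any data was transferred.
ssize_t read_retry(int fd, void* buf, size_t count) {
    while (true) {
        ssize_t n = read(fd, buf, count);
        if (n >= 0 || errno != EINTR) {
            return n;  // data (possibly a short read), end-of-file, or a real error
        }
        // errno == EINTR: read() did no work, so calling it again is safe
    }
}
```

A short read is returned to the caller as is; only the "no work done" case is retried. Code that relies on a signal to wake it up should not use a wrapper like this, because the retry hides the interruption.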
Timed wait
- Parent process starts a child
- Parent waits for the child to exit or for 0.75 sec to pass, whichever comes first
- Questions
  - Possible race conditions?
  - Solution?
timedwait-poll
- Reliable, but uses 100% CPU to wait
- Not a valuable use of CPU
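A polling timed wait might look like the following. This is a sketch under stated assumptions, not the course's timedwait-poll source: timed_wait_poll is a hypothetical helper, and both times are given in microseconds (kept below one second, since some systems limit the usleep argument).

```cpp
#include <chrono>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: start a child that exits after child_usec, then poll
// for its exit until timeout_usec has passed.
// Returns true if the child exited before the timeout.
bool timed_wait_poll(useconds_t child_usec, useconds_t timeout_usec) {
    pid_t child = fork();
    if (child < 0) {
        return false;
    }
    if (child == 0) {
        usleep(child_usec);
        _exit(0);
    }
    auto deadline = std::chrono::steady_clock::now()
        + std::chrono::microseconds(timeout_usec);
    while (std::chrono::steady_clock::now() < deadline) {
        // WNOHANG: waitpid returns 0 at once if the child is still running
        if (waitpid(child, nullptr, WNOHANG) == child) {
            return true;
        }
    }
    // Timed out: stop the child and reap it so no zombie is left behind
    kill(child, SIGKILL);
    waitpid(child, nullptr, 0);
    return false;
}
```

The loop never blocks, so there is no moment at which an event can be missed, but it spins on the CPU for the entire wait.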
Timed wait arguments
- -e TIME: child exits after TIME (default 0.5 sec)
- -t TIME: parent timeout is TIME (default 0.75 sec)
- -q: quiet output
Polling and blocking
- Blocking: Process waits for communication
- Polling: Process checks repeatedly for communication
- Advantage of polling: Fewer race conditions
- Advantage of blocking: Lower CPU usage
timedwait-block
- Unreliable!
- If child exits immediately, signal is delivered before usleep, and therefore does not interrupt usleep
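The race described above can be made concrete with a sketch of a blocking timed wait. This is an assumed reconstruction, not the course's timedwait-block source; timed_wait_block and on_sigchld are hypothetical names. The function returns how long the parent waited, in microseconds, so that a wait that runs too long is visible.

```cpp
#include <chrono>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

// Empty handler: its only job is to make SIGCHLD interrupt usleep().
static void on_sigchld(int) {}

// Hypothetical helper: returns the time the parent spent waiting, in
// microseconds, or -1 if fork failed.
long long timed_wait_block(useconds_t child_usec, useconds_t timeout_usec) {
    struct sigaction sa = {}, old_sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGCHLD, &sa, &old_sa);

    auto start = std::chrono::steady_clock::now();
    pid_t child = fork();
    if (child < 0) {
        sigaction(SIGCHLD, &old_sa, nullptr);
        return -1;
    }
    if (child == 0) {
        usleep(child_usec);
        _exit(0);
    }
    // RACE WINDOW: if SIGCHLD is delivered here, before usleep() begins,
    // the handler runs and returns, and usleep() sleeps for the full timeout.
    usleep(timeout_usec);  // returns early only if a signal arrives during the sleep
    auto elapsed = std::chrono::steady_clock::now() - start;

    sigaction(SIGCHLD, &old_sa, nullptr);
    if (waitpid(child, nullptr, WNOHANG) != child) {
        kill(child, SIGKILL);  // child outlived the timeout: stop and reap it
        waitpid(child, nullptr, 0);
    }
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}
```

With a child that runs for a noticeable time, the parent is almost always inside usleep when SIGCHLD arrives and wakes early. With a child that exits at once, the signal can land in the marked window and the parent waits for the whole timeout.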
timedwait-blockvar
- Still unreliable! (though less unreliable)
- Signal might be delivered between any two instructions!
Race conditions!
- Can you use synchronous IPC to solve this race condition?
timedwait-selfpipe
- Process opens pipe to itself
- Signal handlers, for either SIGALRM (timeout) or SIGCHLD (child exit), write to pipe
- read will either succeed right away or be interrupted by a signal
  - Reliable timeout!
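The self-pipe approach can be sketched as follows. This is a minimal sketch under stated assumptions, not the course's timedwait-selfpipe source: selfpipe, wake_up, and timed_wait_selfpipe are hypothetical names, times are in microseconds, and setitimer is used here as one way to request a sub-second SIGALRM.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

static int selfpipe[2];  // [0] is the read end, [1] is the write end

// Shared handler for SIGCHLD and SIGALRM: record the event as a byte in the pipe.
static void wake_up(int) {
    int saved_errno = errno;
    char c = 1;
    ssize_t r = write(selfpipe[1], &c, 1);  // write() is async-signal-safe
    (void) r;
    errno = saved_errno;
}

// Hypothetical helper: returns true if the child exited before the timeout.
bool timed_wait_selfpipe(useconds_t child_usec, useconds_t timeout_usec) {
    if (pipe(selfpipe) != 0) {
        return false;
    }
    struct sigaction sa = {}, old_chld, old_alrm;
    sa.sa_handler = wake_up;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // no SA_RESTART, so a handler interrupts read()
    sigaction(SIGCHLD, &sa, &old_chld);
    sigaction(SIGALRM, &sa, &old_alrm);

    pid_t child = fork();
    if (child == 0) {
        usleep(child_usec);
        _exit(0);
    }

    bool exited = false;
    if (child > 0) {
        struct itimerval timer = {}, off = {};
        timer.it_value.tv_sec = timeout_usec / 1000000;
        timer.it_value.tv_usec = timeout_usec % 1000000;
        setitimer(ITIMER_REAL, &timer, nullptr);  // SIGALRM after the timeout

        // If a signal already arrived, its byte is waiting and read() returns
        // at once. Otherwise read() blocks until a handler interrupts it.
        char c;
        ssize_t r = read(selfpipe[0], &c, 1);
        (void) r;
        setitimer(ITIMER_REAL, &off, nullptr);    // cancel the timer
        exited = waitpid(child, nullptr, WNOHANG) == child;
    }

    sigaction(SIGCHLD, &old_chld, nullptr);
    sigaction(SIGALRM, &old_alrm, nullptr);
    if (child > 0 && !exited) {
        kill(child, SIGKILL);  // timed out: stop and reap the child
        while (waitpid(child, nullptr, 0) < 0 && errno == EINTR) {
        }
    }
    close(selfpipe[0]);
    close(selfpipe[1]);
    return exited;
}
```

Unlike a bare sleep, an early signal is not lost here: the byte stays in the pipe until read collects it, so there is no window in which an event can slip past the parent.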