Overview
In this lecture, we discuss multithreading, low-level synchronization, and synchronization objects.
Multitasking, multiprocessing, multithreading
- Multitasking
- A computer’s resources are divided among multiple tasks
- The CPU runs one task at a time
- The operating system switches between tasks: saves one task (process) state, runs another
- Each process (task) has one logical thread of control
- Each process memory image is accessed by one thread at a time (simple!)
- Software handles switching between tasks
- Multiprocessing
- Computer has two or more CPUs
- (Or a CPU and a GPU or another kind of accelerator)
- Each CPU runs one task at a time, but multiple tasks run simultaneously
- Still, each process memory image accessed by one thread at a time
- Software may need to coordinate the CPUs
- Multithreading
- A process can have two or more logical threads of control
- Those threads access a process memory image simultaneously
- Goal: Use CPU resources for a single task
- Goal: Use more natural APIs for multitasking (e.g., timeouts)
- Problem: Synchronization
Creating and joining threads in C++
A toy task
- Increment a variable 40,000,000 times
incr-basic.cc
What?!
- Look at
incr-basic.cc
- Use
objdump -d incr-basic.o
to look at the assembly
incr-basic-noopt
Data races
- A data race is a kind of race condition
- It occurs when two or more threads in a single process access the same
memory location concurrently
- Without explicit synchronization (explained soon)
- At least one access is a write (safe to read the same location)
- Result: undefined behavior
The Fundamental Law of Synchronization
THOU SHALT NOT HAVE DATA RACES!!!
- If two threads access the same memory location concurrently and without synchronization, both accesses must be reads
Atomic operations
- Some computer operations are atomic operations
- These operations have atomic effect
- An atomic operation occurs at a single indivisible instant
- Every other operation in the system appears to occur either entirely before or entirely after the atomic operation
- Atomic effects are possible at many interfaces
- Primary memory
- System calls (e.g.,
pipe
,open
,fork
)
addl $0x1,(%rdi)
- Doesn’t this operation have atomic effect?
- No
- The CPU implements this operation as three “micro-ops”
- First, load
(%rdi)
into a secret register - Second, compute on that secret register
- Third, store the result to
(%rdi)
- Example
Atomic instructions
- Simple loads and stores have atomic effect
- If they are aligned
- Read-modify-write instructions do not typically have atomic effect
- CPUs have special, more-expensive read-modify-write instructions that do have atomic effect
incr-atomic.cc
std::atomic<T>
types in C++ support atomic read-modify-write operationsstd::atomic<T> x; ... ++x ...
: the++x
has atomic effect- Compiles to
lock addl $0x1, (%rdi)
- The
lock
prefix makes this an atomic instruction
- The
- Concurrent accesses to
std::atomic
s do not cause data races
Mutual exclusion
- Atomic effect can be modeled as exclusion
- One thread of control (CPU or process thread) gains temporary exclusive access to some data
- Atomic instructions and atomic data types provide inexpensive exclusion, but to tiny data chunks (e.g., single integers)
- Synchronization objects are higher-level concepts that help implement exclusion on larger or more complex data types
std::mutex
- A
std::mutex
synchronization object provides interface of a lock m.lock()
: Acquire exclusive access to the mutex- Block until no other thread has exclusive access to the mutex
- Atomically mark mutex as acquired
m.unlock()
: Release access to the mutex- Mark mutex as released
- All code between
m.lock()
andm.unlock()
is subject to mutual exclusion
incr-mutex.cc
std::scoped_lock
- Acquire a mutex (or several mutexes) for a block of code
- Automatically releases the mutex when the block exits
incr-scopedlock.cc
Question
- How can we implement a mutex?
- Polling? Blocking?
incr-spinlock.cc
Spinning vs. blocking
- The
spinlock
version polls - The
mutex
version blocks - Implementing a blocking mutex is not trivial!
- Sleep–wakeup race conditions
Minipipe
read
,write
,close_read
,close_write
read
: Read one character from pipe- Block until character is present or write end closed
write
: Write one character to pipe- Block until space is present or read end closed
- How to synchronize?