Overview
In this lecture, we discuss multithreading, low-level synchronization, and synchronization objects.
Multitasking, multiprocessing, multithreading
- Multitasking
- A computer’s resources are divided among multiple tasks
- The CPU runs one task at a time
- The operating system switches between tasks: saves one task (process) state, runs another
- Each process (task) has one logical thread of control
- Each process's memory image is accessed by one thread at a time (simple!)
- Software handles switching between tasks
- Multiprocessing
- Computer has two or more CPUs
- (Or a CPU and a GPU or another kind of accelerator)
- Each CPU runs one task at a time, but multiple tasks run simultaneously
- Still, each process's memory image is accessed by one thread at a time
- Software may need to coordinate the CPUs
- Multithreading
- A process can have two or more logical threads of control
- Those threads access the process's memory image simultaneously
- Goal: Use multiple CPUs' resources for a single task
- Goal: Use more natural APIs for multitasking (e.g., timeouts)
- Problem: Synchronization
Example threads
#include <cstdio>
#include <thread>
void print1() {
    printf("Hello from thread 1\n");
}

void print2() {
    printf("Hello from thread 2\n");
}

int main() {
    std::thread t1(print1);
    std::thread t2(print2);
    t1.join();
    t2.join();
}
- std::thread th(f...) creates a thread th running f(...)
- th.join() waits for th to complete
Thread functions can take arguments
#include <cstdio>
#include <thread>
void printn(int n) {
    printf("Hello from thread %d\n", n);
}

int main() {
    std::thread t1(printn, 1);
    std::thread t2(printn, 2);
    t1.join();
    t2.join();
}
- Passing reference arguments is a PITA (need std::ref; see threadex3.cc)
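For instance, here is a minimal sketch of the idea behind threadex3.cc (the function and variable names are made up, not the actual file contents). std::thread copies its arguments by default, so a thread function that takes an int& needs std::ref; passing the variable directly will not compile.

#include <cstdio>
#include <thread>
#include <functional>

// Hypothetical example: the thread function takes its first argument by reference.
void add_to(int& x, int delta) {
    x += delta;
}

int main() {
    int value = 0;
    // std::ref wraps `value` so the thread operates on the caller's variable
    // instead of a private copy.
    std::thread t(add_to, std::ref(value), 10);
    t.join();
    printf("value = %d\n", value);   // prints 10
}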
Detached threads
- A thread can be joinable or detached
- A joinable thread must be joined (via th.join()) before the program exits
- th.detach() detaches the thread
  - It cannot be joined
  - It automatically dies when the program exits
synch1/threadex4.cc
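A minimal sketch in the spirit of that file (contents assumed, not the actual synch1/threadex4.cc): the thread is detached, so it is never joined; the sleep only gives it a chance to run before main returns.

#include <cstdio>
#include <thread>
#include <chrono>

void background() {
    // This thread is never joined; it dies when the program exits.
    printf("Hello from a detached thread\n");
}

int main() {
    std::thread t(background);
    t.detach();              // t is no longer joinable
    // Give the detached thread a chance to run before main returns.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
}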
Processes vs. threads
fork()
- Cloned program image
- New identity
- Cloned environment view
- Same underlying environment
std::thread
- Same program image! (Threads share the same primary memory)
- Same identity!
- Same environment view!
- Same underlying environment
- What’s different?
  - Different local variables
  - New set of registers (held in kernel)
  - New stack (allocated in primary memory)
Same program image, different stacks
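A small illustration of this point (hypothetical, not one of the course files): both threads see the same global variable at the same address, but each thread's local variable lives on its own stack, so the local addresses differ.

#include <cstdio>
#include <thread>

int global = 42;                 // one copy in the shared program image

void worker(int id) {
    int local = id;              // a separate copy on each thread's stack
    printf("thread %d: global=%d at %p, local=%d at %p\n",
           id, global, (void*) &global, local, (void*) &local);
}

int main() {
    std::thread t1(worker, 1);
    std::thread t2(worker, 2);
    t1.join();
    t2.join();
}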
Advantages of threads for synchronization
- Coexisting in the same, non-isolated memory image makes some forms of synchronization easier to express and understand
- It also opens disgusting avenues for bugs
A toy task
- Increment a variable 40,000,000 times
incr-basic.cc
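A sketch of what incr-basic.cc might contain (the thread count and loop bound are assumptions): several threads increment an unsynchronized shared counter, for 40,000,000 increments in total.

#include <cstdio>
#include <thread>

unsigned long n = 0;

void threadfunc() {
    // Unsynchronized increments of a shared variable: a data race,
    // which is exactly what this demo is meant to expose.
    for (int i = 0; i != 10'000'000; ++i) {
        ++n;
    }
}

int main() {
    std::thread th[4];
    for (int i = 0; i != 4; ++i) {
        th[i] = std::thread(threadfunc);
    }
    for (int i = 0; i != 4; ++i) {
        th[i].join();
    }
    printf("%lu\n", n);   // expect 40000000; the actual output is usually far smaller
}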
What?!
- Look at incr-basic.cc
- Use objdump -d incr-basic.o to look at the assembly
incr-basic-noopt
Data races
- A data race is a kind of race condition
- It occurs when two or more threads in a single process access the same memory location concurrently
- Without explicit synchronization (explained soon)
- At least one access is a write (it is safe for multiple threads to read the same location)
- Result: undefined behavior
The Fundamental Law of Synchronization
THOU SHALT NOT HAVE DATA RACES!!!
- If two threads access the same memory location concurrently and without synchronization, both accesses must be reads
Atomic operations
- Some computer operations are atomic operations
- These operations have atomic effect
- An atomic operation occurs at a single indivisible instant
- Every other operation in the system appears to occur either entirely before or entirely after the atomic operation
- Atomic effects are possible at many interfaces
- Primary memory
- System calls (e.g., pipe, open, fork)
addl $0x1,(%rdi)
- Doesn’t this operation have atomic effect?
- No
- The CPU implements this operation as three “micro-ops”
  - First, load (%rdi) into a secret register
  - Second, compute on that secret register
  - Third, store the result to (%rdi)
- Example
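A hypothetical sketch (not one of the course files) that writes the three micro-ops out as separate C++ statements; the comment shows an interleaving that loses an update. This version still has a data race, so it is for illustration only.

#include <cstdio>
#include <thread>

unsigned long n = 0;

// The increment with its three micro-ops made explicit.
// A bad interleaving loses an update, e.g. when n == 5:
//   T1: tmp = n (5)       T2: tmp = n (5)
//   T1: tmp + 1 (6)       T2: tmp + 1 (6)
//   T1: n = 6             T2: n = 6     <- one increment vanished
void threadfunc() {
    for (int i = 0; i != 10'000'000; ++i) {
        unsigned long tmp = n;   // load (%rdi) into a "secret register"
        tmp = tmp + 1;           // compute on that register
        n = tmp;                 // store the result back to (%rdi)
    }
}

int main() {
    std::thread t1(threadfunc), t2(threadfunc);
    t1.join();
    t2.join();
    printf("%lu\n", n);   // expect 20000000; usually much less
}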
Atomic instructions
- Simple loads and stores have atomic effect
- If they are aligned
- Read-modify-write instructions do not typically have atomic effect
- CPUs have special, more-expensive read-modify-write instructions that do have atomic effect
incr-atomic.cc
- std::atomic<T> types in C++ support atomic read-modify-write operations
- std::atomic<T> x; ... ++x ...: the ++x has atomic effect
  - Compiles to lock addl $0x1, (%rdi)
  - The lock prefix makes this an atomic instruction
- Concurrent accesses to std::atomics do not cause data races
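A minimal sketch in the spirit of incr-atomic.cc (details assumed): the only change from the racy version is the type of n.

#include <cstdio>
#include <thread>
#include <atomic>

std::atomic<unsigned long> n{0};

void threadfunc() {
    for (int i = 0; i != 10'000'000; ++i) {
        ++n;   // atomic read-modify-write: compiles to lock addl $0x1,(%rdi)
    }
}

int main() {
    std::thread th[4];
    for (int i = 0; i != 4; ++i) {
        th[i] = std::thread(threadfunc);
    }
    for (int i = 0; i != 4; ++i) {
        th[i].join();
    }
    printf("%lu\n", n.load());   // always prints 40000000
}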
Mutual exclusion
- Atomic effect can be modeled as exclusion
- One thread of control (CPU or process thread) gains temporary exclusive access to some data
- Atomic instructions and atomic data types provide inexpensive exclusion, but only for tiny chunks of data (e.g., single integers)
- Synchronization objects are higher-level concepts that help implement exclusion on larger or more complex data types
std::mutex
- A std::mutex synchronization object provides the interface of a lock
- m.lock(): Acquire exclusive access to the mutex
  - Block until no other thread has exclusive access to the mutex
  - Atomically mark the mutex as acquired
- m.unlock(): Release access to the mutex
  - Mark the mutex as released
- All code between m.lock() and m.unlock() is subject to mutual exclusion
incr-mutex.cc
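A minimal sketch in the spirit of incr-mutex.cc (details assumed): the mutex serializes access to n, so no increments are lost.

#include <cstdio>
#include <thread>
#include <mutex>

unsigned long n = 0;
std::mutex m;        // protects n

void threadfunc() {
    for (int i = 0; i != 10'000'000; ++i) {
        m.lock();    // gain exclusive access to n
        ++n;
        m.unlock();  // release it so other threads can proceed
    }
}

int main() {
    std::thread th[4];
    for (int i = 0; i != 4; ++i) {
        th[i] = std::thread(threadfunc);
    }
    for (int i = 0; i != 4; ++i) {
        th[i].join();
    }
    printf("%lu\n", n);   // always prints 40000000
}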
std::scoped_lock
- Acquires a mutex (or several mutexes) for a block of code
- Automatically releases the mutex(es) when the block exits
incr-scopedlock.cc
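A minimal sketch in the spirit of incr-scopedlock.cc (details assumed); std::scoped_lock needs C++17.

#include <cstdio>
#include <thread>
#include <mutex>

unsigned long n = 0;
std::mutex m;        // protects n

void threadfunc() {
    for (int i = 0; i != 10'000'000; ++i) {
        // The constructor locks m; the destructor unlocks it when the
        // block exits, even on early return or exception.
        std::scoped_lock guard(m);
        ++n;
    }
}

int main() {
    std::thread th[4];
    for (int i = 0; i != 4; ++i) {
        th[i] = std::thread(threadfunc);
    }
    for (int i = 0; i != 4; ++i) {
        th[i].join();
    }
    printf("%lu\n", n);   // always prints 40000000
}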
Question
- How can we implement a mutex?
- Polling? Blocking?
incr-spinlock.cc
- The spinlock version polls
- The mutex version blocks
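A minimal spinlock sketch in the spirit of incr-spinlock.cc (details assumed), built on std::atomic_flag: lock() polls, burning CPU until it wins the flag, whereas std::mutex::lock() can put the waiting thread to sleep.

#include <cstdio>
#include <thread>
#include <atomic>

struct spinlock {
    std::atomic_flag f = ATOMIC_FLAG_INIT;
    void lock() {
        // Poll: atomically set the flag; retry as long as it was already
        // set (i.e., some other thread holds the lock).
        while (f.test_and_set(std::memory_order_acquire)) {
            // spin until the flag is cleared
        }
    }
    void unlock() {
        f.clear(std::memory_order_release);
    }
};

unsigned long n = 0;
spinlock m;          // protects n

void threadfunc() {
    for (int i = 0; i != 10'000'000; ++i) {
        m.lock();
        ++n;
        m.unlock();
    }
}

int main() {
    std::thread th[4];
    for (int i = 0; i != 4; ++i) {
        th[i] = std::thread(threadfunc);
    }
    for (int i = 0; i != 4; ++i) {
        th[i].join();
    }
    printf("%lu\n", n);   // always prints 40000000, but waiters burn CPU
}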