Threads in C++
std::thread
's constructor takes the name of the function to run on the new
thread and any arguments, like std::thread t(func, param1, ...)
.
Given a thread t
, t.join()
blocks the current thread until the thread
being joined exits. (This is roughly analogous to waitpid
.) t.detach()
sets
the t
thread free. Each thread object must be detached or joined before it
goes out of scope (is destroyed).
An example of how we might combine these is:
#include <thread>
void foo(...) {
...
}
void bar(...) {
...
}
int main() {
{
std::thread t1(foo, ...);
t1.detach();
}
{
std::thread t2(bar, ...);
t2.join();
}
}
Exercise: What can this code print? (sample01.cc
)
int main() {
for (int i = 0; i != 3; ++i) {
std::thread t(printf, "%d\n", i + 1);
t.join();
}
}
1\n2\n3\n
Exercise: What can this code print? (sample02.cc
)
int main() {
for (int i = 0; i != 3; ++i) {
std::thread t(printf, "%d\n", i + 1);
t.detach();
}
}
1
2
3
, in any order, and with any number of those lines left out—because when the main thread exits (by returning frommain
) all other threads are killed too!
Exercise: What can this code print? (sample03.cc
)
int main() {
for (int i = 0; i != 3; ++i) {
std::thread t(printf, "%d\n", i + 1);
}
}
Literally anything. This is not C++ code: when the
t
thread goes out of scope without having been joined or detached, it invokes undefined behavior.
Threads in C
C doesn't have built-in threads or synchronization, but a common library is pthread
(POSIX
threads). Since we're using C++ threads in this class we won't talk about
details here, but be aware that it exists in case you're ever working in C.
(On Linux, standard C++ threads are implemented using the pthread
library.)
Mutexes in C++
Recall the high-level definition of a mutex: we want some way to synchronize access to data that is accessed simultaneously by multiple threads to avoid race conditions. Mutexes (also known as locks) achieve this by providing mutual exclusion, meaning that only one thread can be running in a particular critical section at any time.
The abstract interface for a mutex or lock is:
lock
: takes hold of the lock in a blocking fashiontry lock
: takes hold of the lock in a polling fashion; returns a boolean indicating whether the lock attempt succeededunlock
: instantly lets go of the lock
Note that it is disallowed to lock a lock that you already hold or unlock a lock that you do not hold. Depending on the implementation or language you're using, these actions might explicitly fail or—even worse—silently fail, and cause deadlock or incorrect synchronization.
If you've ever had the pleasure of participating in an icebreaker, you can think of a single lock as the object that is passed around to allow people to speak. You have to acquire the object before you can speak, and you release it so that another person can use it once you're done. Because there's one object and only one person can hold it at once, only one person can speak at a time, achieving mutual exclusion.
For multiple mutexes, a better analogy is the locks on a set of bathroom stalls.
When you enter a stall (the critical section) you lock the lock associated with
that stall. Then, you perform whatever operation needs to be synchronized inside
the stall. Finally, once you're done, you unlock the stall and go on your merry
way. This system ensures that each stall can only have one person in it at a
time. Here the difference between lock
and try lock
is better motivated: if
one particular stall is full, we might want to check to see if other stalls are
available instead of waiting for someone to leave that one.
std::mutex
Here are some of the tools you can use to synchronize multithreaded C++ code (and here is some even more comprehensive documentation).
C++ has a few different kinds of locks built in. std::mutex
is the basic
mutex. Once we declare and construct a std::mutex m
, we can call:
m.lock()
m.try_lock()
m.unlock()
which correspond exactly to the abstract lock operations defined above. A typical use might look like this, though later we're going to talk about an easier way to get the same behavior:
#include <mutex>
int i;
std::mutex m; // protects write access to i
void increase() {
m.lock();
i++;
m.unlock();
}
...
Fancier mutex semantics
std::recursive_mutex
is like a normal mutex, but a thread that holds a lock
may recursively acquire it again (as long as they recursively release it the
same number of times). This might be useful if you need to acquire several
locks, and sometimes some of those locks might actually be the same lock: with
recursive locks, you don't need to keep track of which ones you already hold,
to avoid acquiring the same lock twice.
std::shared_lock
is an instantiation of a different kind of lock altogether:
a readers-writer lock, or a shared-exclusive lock. The idea behind this kind of
lock is that rather than just have one level of access--mutual exclusion--there
are two levels of access: shared access and exclusive access. When a thread
holds the lock with shared access, that guarantees that all other threads that
hold the lock have shared access. When a thread holds the lock with exclusive
access, that guarantees that it is the only thread to have any kind of access.
We call this a readers-writer lock because the two levels of access often
correspond to multiple reader threads, which require the shared data not to
change while they're accessing it, but don't modify it themselves; and a single
writer thread, which needs to be the only thread with access to the data it's
modifying. You do not need the functionality of std::shared_lock
for the
problem set, and in fact it is not particularly helpful for this particular
application.
The only mutex you need for the problem set is std::mutex
, though
std::recursive_mutex
may simplify some logic for simpong61
.
std::lock
avoids deadlock
C++ also defines some handy tools to acquire locks while avoiding deadlock.
Instead of calling .lock()
on the mutex itself, you can pass several mutexes
as arguments to the function std::lock
, which magically acquires the locks
in an order that avoids deadlock. For example:
#include <mutex>
std::mutex a;
std::mutex b;
void worker1() {
std::lock(a, b);
...
a.unlock(); // the order of these doesn't matter,
b.unlock(); // since unlocking can't cause deadlock
}
void worker2() {
std::lock(b, a); // look ma, no deadlock!
...
b.unlock();
a.unlock();
}
...
std::unique_lock
and std::scoped_lock
unlock for you
The std::scoped_lock
and std::unique_lock
objects automate some aspects of
locking, because they are capable of automatically unlocking. This is really
convenient—it’s very easy to forget a call to unlock
if a function has
multiple return
statements in it.
To use a scoped lock, you declare an object at the point where you would
normally call lock()
. This object is typically called a guard, since it
exists to “guard” the lock. You must pass in a reference to the std::mutex
that you intend to lock. For example, here’s a version of the increase
function from above:
#include <mutex>
int i;
std::mutex m; // protects write access to i
void increase() {
std::unique_lock guard(m);
i++;
}
The declaration of guard
will block until m
can be locked, and then m
is
automatically unlocked when the function returns and guard
goes out of
scope (the variable stops existing). This is an instance of an idiom called
Resource acquisition is initialization (RAII), which is the idea that an
object's resources are automatically acquired and released by its constructor
and destructor, which are triggered by scope.
Here’s how std::scoped_lock
(which is the multi-lock version of
std::unique_lock
) would simplify the code above:
#include <mutex>
std::mutex a;
std::mutex b;
void worker1() {
std::scoped_lock guard(a, b);
...
}
void worker2() {
std::scoped_lock guard(b, a);
...
}
...
The std::unique_lock
has .lock()
, .try_lock()
, and .unlock()
member
functions, in addition to the initialization/deinitialization style, and these
can be useful sometimes. (The std::scoped_lock
has no member functions at
all: you can only use it by declaring it and then letting it go out of scope
later.)
Condition variables in C++
Suppose you have several threads waiting for some event to happen, which doesn't neatly correspond to a single unlock event. (For example, to go back to the bathroom example, we might want people lining up to wait for any stall to open, not some particular stall door to be unlocked.) We could achieve this behavior naively by polling, but we've seen that this is wasteful. Condition variables are a synchronization primitive built on top of mutexes that allows us to form a queue of threads, which wait in line for a wake up call. (Note that condition variables usually do not actually guarantee that they function as FIFO queues; an arbitrary thread may be the head of the queue.)
The abstract interface for a condition variable is:
wait
: go to sleep on the condition variable and a held lock until signalednotify one
orsignal
: wake up one thread waiting on the CVnotify all
orbroadcast
: wake up all threads waiting on the CV
Once we create a condition variable std::condition_variable cv
in C++, we can
call:
cv.wait(std::unique_lock<std::mutex>& lock)
cv.notify_one()
cv.notify_all()
Notice that cv.wait
requires that you use std::unique_lock
to lock your
mutex. The lock must be locked when calling wait()
. wait()
unlocks
the associated mutex while waiting, and re-locks the mutex before it
returns.
Condition variables are almost always used in loops that look like this:
std::unique_lock<std::mutex> lock;
...
while (!CONDITION) {
cv.wait(lock);
}
In this code:
CONDITION
is some expression, such asphase != 1
.- The
CONDITION
can safely be evaluated whenlock
is locked. cv
is associated withCONDITION
in the following way: Any code that might change the truth value ofCONDITION
will callcv.notify_all()
.
The loop is important because in most cases, a condition variable wait
can
wake up spuriously—in other words, by the time the wait
operation returns,
the condition has gone back to false.
If you want to use a condition variable with something other than a
std::mutex
, you'll have to usestd::condition_variable_any
instead.
Synchronization: Counting to three
This program is called onetwothree.cc
:
#include <thread>
#include <mutex>
#include <condition_variable>
#include <cstdio>
/* G1 */
void a() {
/* A1 */
printf("1\n");
/* A2 */
}
void b() {
/* B1 */
printf("2\n");
/* B2 */
}
void c() {
/* C1 */
printf("3\n");
/* C2 */
}
int main() {
/* M1 */
std::thread at(a);
std::thread bt(b);
std::thread ct(c);
/* M2 */
ct.join();
bt.join();
at.join();
}
Exercise: What are the possible outputs of this program as is?
1
,2
, and3
, on separate lines, in any order. (printf
is thread-safe, so no two calls toprintf
will interleave characters.)
Exercise: The program will never exit with the output 1\n2\n
. Why not?
The
ct.join()
line means that threadct
, which is runningc()
, must exit before the main program exits.
Exercise: Add code to this program so that the program only prints
1\n2\n3\n
. Your code must only use atomics, and you can only change the
locations marked with comments.
G1: std::atomic<int> phase; A2: phase.store(1); B1: while (phase.load() != 1) {} B2: phase.store(2); C1: while (phase.load() != 2) {}
Exercise: Does your atomics-based code wait using blocking or polling?
Polling
Exercise: Add code to this program so that the program only prints
1\n2\n3\n
. Your code must only use std::mutex
and/or
std::condition_variable
, and you must not ever unlock a mutex in a thread
different from the thread that locked the mutex. Again, only change the
locations marked with comments.
G1: int phase = 0; std::mutex mutex; std::condition_variable cv; A2: std::unique_lock<std::mutex> guard(mutex); phase = 1; cv.notify_all(); B1: std::unique_lock<std::mutex> guard(mutex); while (phase != 1) { cv.wait(lock); } B2: phase = 2; cv.notify_all(); C1: std::unique_lock<std::mutex> guard(mutex); while (phase != 2) { cv.wait(lock); }
Deadlock
A deadlock is a situation in which two or more threads block forever, because each thread is waiting for a resource that’s held by another thread.
Deadlock in multithreaded programming commonly involves mutual exclusion
locks. For instance, thread 1 might have mutex m1
locked while it tries to
acquire mutex m2
, while thread 2 has m2
locked while it tries to acquire
m1
. Neither thread will ever make progress because each is waiting for the
other.
There are simple ways to avoid deadlock in practice. One is to lock at most
one mutex at a time: no deadlock! Another is to have a fixed order in which
mutexes are locked, called a lock order. For instance, given lock order
m1
, m2
, then thread 2 above cannot happen (it’s acquiring locks against
the lock order).
Deadlocks can also occur with other resources, such as space in pipe buffers (remember the extra credit in pset 3?) and even pavement.
Bathroom synchronization
Ah, the bathroom. Does anything deserve more careful consideration? Is anything more suitable for poop jokes?
Bathroom 1: Stall privacy
In this exercise, the setting is a bathroom with K stalls numbered 0 through K–1. A number of users enter the bathroom; each has a preferred stall. Each user can be modeled as an independent thread that repeatedly calls the following code:
void user(int preferred_stall) {
poop_into(preferred_stall);
}
Each user has a preferred stall that never changes, and two users should never
poop_into
the same stall at the same time.
You probably can intuitively smell that this bathroom model has a synchronization problem. Answer the following questions without referring to the solutions first.
Question: What synchronization property is missing?
Mutual exclusion (two users can poop into the same stall at the same time).
Question: Provide synchronization using a single mutex, i.e., using
coarse-grained locking. Your solution should be correct (it should avoid
gross conditions [which are race conditions in the context of a
bathroom-themed synchronization problem]), but it need not be efficient.
Your solution will consist of (1) a definition for the mutex, which must be a
global variable, and (2) new code for user()
. You don’t need to sweat the
syntax.
std::mutex bmutex; void user(int preferred_stall) { std::unique_lock guard(bmutex); poop_into(preferred_stall); }
Question: Provide synchronization using fine-grained locking. Your
solution should be more efficient than the previous solution.
Specifically, it should allow the bathroom to achieve full
utilization: it should be possible for all K stalls to be in use at the
same time. You will need to change both the global mutex—there will be
more than one global mutex—and the code for user()
.
std::mutex stall[K]; void user(int preferred_stall) { std::unique_lock guard(stall[preferred_stall]); poop_into(preferred_stall); }
Bathroom 2: Stall indifference
The setting is the same, except that now users have no predetermined preferred stall:
void user() {
int preferred_stall = random() % K;
poop_into(preferred_stall);
}
Question: Adapt your solution from Bathroom 1 to work for this variant.
std::mutex stall[K]; void user() { int preferred_stall = random() % K; std::unique_lock guard(stall[preferred_stall]); poop_into(preferred_stall); }
Question: As written, the user code picks a preferred stall and then waits for that stall to become available. This isn’t as efficient as possible, since multiple users might pick the same stall by chance. Write a synchronization solution where each user chooses any available stall. First, use a single mutex coupled with a global data structure representing stall occupancy.
std::mutex bmutex; bool occupied[K]; void user() { int preferred_stall = -1; while (preferred_stall == -1) { std::unique_lock<std::mutex> guard(bmutex); for (int stall = 0; stall != K; ++stall) { if (!occupied[stall]) { preferred_stall = stall; occupied[stall] = true; break; } } } poop_into(preferred_stall); std::unique_lock<std::mutex> guard(bmutex); occupied[preferred_stall] = false; }
Question: Repeat this function, but this time use multiple mutexes and no other global data structure.
std::mutex smutex[K]; void user() { while (true) { int preferred_stall = random() % K; if (smutex[preferred_stall].try_lock()) { poop_into(preferred_stall); smutex[preferred_stall].unlock(); return; } } }
Question: Compare these two implementations in terms of the number of
std::mutex
lock attempts it might take to poop.
The second solution may perform more synchronization calls, because it may make a synchronization call on many locked stalls before finding an unlocked stall. But the first solution performs a minimum of 4 synchronization calls, while the second solution performs a minimum of 2.
Question: These solutions implement a polling style. Why?
They repeatedly call
try_lock
, or they repeatedly callunlock
+lock
, while waiting for a stall to vacate.
Question: Implement a blocking implementation. You will need a
condition_variable
.
std::mutex bmutex; std::condition_variable cv; bool occupied[K]; int find_unoccupied_stall() { for (int i = 0; i != K; ++i) { if (!occupied[i]) { return i; } } return -1; } void user() { std::unique_lock<std::mutex> lock(bmutex); int preferred_stall; while (true) { preferred_stall = find_unoccupied_stall(); if (preferred_stall != -1) { occupied[preferred_stall] = true; break; } cv.wait(lock); } lock.unlock(); poop_into(preferred_stall); lock.lock(); occupied[preferred_stall] = false; cv.notify_all(); lock.unlock(); }
Bathroom 3: Pooping philosophers
To facilitate the communal philosophical life, Plato decreed that philosophy should be performed in a bathroom with K thrones and K curtains, arranged like so:
Each philosopher sits on a throne with two curtains, one on each side. The philosophers normally argue with one another. But sometimes a philosopher’s got to poop, and to poop one needs privacy. A pooping philosopher must first draw the curtains on either side; philosopher X will need curtains X and (X+1) mod K. For instance, here, philosophers 0 and 2 can poop simultaneously, but philosopher 4 cannot poop until philosopher 0 is done (because philosopher 4 needs curtain 0), and philosopher 1 cannot poop until philosophers 0 and 2 are both done:
Question: In high-level terms, what is the mutual exclusion problem here—that is, what resources are subject to mutual exclusion?
Curtains. The mutual exclusion problem here is that each philosopher must have 2 curtains in order to poop, and a curtain can only be held by one philosopher at a time.
A philosopher could be described like this:
void philosopher(int x) {
while (true) {
argue();
draw curtains x and (x + 1) % K;
poop();
release curtains x and (x + 1) % K;
}
}
Question: Write a correctly-synchronized version of this philosopher. Your
solution should not suffer from deadlock, and philosophers should block (not
poll) while waiting to poop. First, use std::scoped_lock
—which can (magically) lock
more than one mutex at a time while avoiding deadlock.
std::mutex curtains[K]; void philosopher(int x) { while (true) { argue(); std::scoped_lock guard(curtains[x], curtains[(x + 1) % K]); poop(); } }
Question: Can you do it without std::scoped_lock
?
Sure; we just enforce our own lock order.
std::mutex curtains[K]; void philosopher(int x) { while (true) { argue(); int l1 = std::min(x, (x + 1) % K); int l2 = std::max(x, (x + 1) % K); std::unique_lock<std::mutex> guard1(curtains[l1]); std::unique_lock<std::mutex> guard2(curtains[l2]); poop(); } }