Synchronization 2: Critical sections, serializability, lock granularity

Overview

In this lecture, we discuss multithreading, low-level synchronization, and synchronization objects.

Full notes on synchronization

Increment recap

Four threads, each executing ++n 10,000,000 times
Results like 30,000,000 (!?) or 12,583,593 (!?!?!?)
A data race is a race condition involving memory access
Two or more threads access the same memory location concurrently;
- Without explicit synchronization;
- And at least one access is a write (safe to read the same location).
Data races cause undefined behavior
Thou shalt not data race

Atomic data types address data races

Accesses involving normal data types, such as int and long, are vulnerable to data races
Atomic data types, such as std::atomic<int>, are not
- “Atomos” means “indivisible”
Operations on atomic data types are explicitly synchronized
- Concurrent accesses do not cause data races
- Support operations like ++ and .exchange()
- The result is always “correct”
- …What does “correct” mean?

Serializability

Concurrent operations should often obey a correctness condition called serializability
An implementation is serializable when:
- The result of running operations concurrently
- Always equals the result of running the same operations one at a time, in some order
  - That is, serially

Serializable increment

What are the possible outcomes of running the operations in incr-atomic.cc one at a time, in some order?

`incr-atomic.cc`

std::atomic<T> types in C++ support atomic read-modify-write operations
std::atomic<T> x; ... ++x ...: the ++x has atomic effect and is an explicit synchronization operation
Compiles to lock addl $0x1, (%rdi)
- The lock prefix makes this an atomic instruction

Atomic data types are necessary, but not sufficient

Processor caches mean that normal memory accesses are not serializable
Atomic operations force processor caches to serialize memory accesses
- At the granularity of single instructions
- Foundation of all synchronization
But atomic data types are limited in scope
- Simple 1–8 byte values
Many synchronization problems don’t map directly onto atomic operations
We need synchronization objects that leverage atomic data types, and operating system functions, to build useful semantics

CS 61 Bank

Each account has a balance
No account balance may go negative

Basic account operations

deposit(amt)
- Add amt to account balance
try_withdraw(amt)
- If account balance is less than amt, return false
- Otherwise, deduct amt from balance and return true

C++ sketch

class account {
    long bal;

    void deposit(long amt) {
        bal += amt;
    }

    bool try_withdraw(long amt) {
        if (bal < amt) {
            return false;
        }
        bal -= amt;
        return true;
    }
};

Concurrent access

Data races!
- Violates the Fundamental Law of Synchronization
- Undefined behavior

Atomics eliminate data races

class account {
    std::atomic<long> bal;

    // ... same `deposit` and `try_withdraw`
};

No data races!

Atomics don’t eliminate all race conditions

Despite absence of data races, this code allows account balances to go negative
This is because when code executes concurrently and there are no data races, the overall effect of a set of threads is the same as some interleaving of the C++-level expressions (reads and writes)
Concurrently-executing try_withdraw calls on the same account can cause trouble

Race condition example

account a with initial balance 10
- Thread T1 executes a.try_withdraw(8)
- Concurrently, thread T2 executes a.try_withdraw(7)

Possible order:

T1: a.try_withdraw(8)        a.bal         T2: a.try_withdraw(7)
    if (a.bal < 8) ...        10
                              10               if (a.bal < 7) ...
    a.bal -= 8;                2
                              -5               a.bal -= 7;
                              -5               return true;
    return true;              -5

Critical sections

The try_withdraw code implicitly assumes that bal is unmodified during its execution
That is, the if and bal -= amt statements of try_withdraw form a critical section: no other thread should modify bal during those statements
To avoid race conditions, we must enforce this critical section
Mutexes (locks)

How to delimit a critical section

Start from an access to a shared object (an object that is subject to reads and writes from different threads)
Expand to include all nearby code that refers to that object

Global mutex

std::mutex bm;

class account {
    std::atomic<long> bal;

    void deposit(long amt) {
        bm.lock();
        bal += amt;
        bm.unlock();
    }

    bool try_withdraw(long amt) {
        bm.lock();
        if (bal < amt) {
            bm.unlock();
            return false;
        }
        bal -= amt;
        bm.unlock();
        return true;
    }
};

bm protects access to any account’s balance
No two account balances may be modified simultaneously

Functions and lock state

Irritatingly, try_withdraw must release the lock in two places:

bool try_withdraw(long amt) {
    bm.lock();
    if (bal < amt) {
        bm.unlock();
        return false;
    }
    bal -= amt;
    bm.unlock();
    return true;
}

Why?
- In general, functions should leave the state of all system mutexes the same as upon entry
- (Functions should not modify system state unnecessarily)
- So if a function acquires a lock (bm.lock()), it should generally release it before returning (bm.unlock())
- On all paths

Guard pattern

Commonly, a function will acquire a lock on entry, and should release that lock just before returning

If your locks work that way, a C++ coding pattern called lock guards will improve your code

bool try_withdraw(long amt) {
    std::unique_lock guard(bm);
    if (bal < amt) {
        return false;
    }
    bal -= amt;
    return true;
}

The lock bm is acquired when the guard object is initialized, and released when guard is destroyed (just before function return)
No need for explicit unlock(), no worries about multiple exit paths

Mutexes and atomics

Now that a mutex bm protects the account balance, we no longer need atomics

std::mutex bm;

class account {
    long bal;

    void deposit(long amt) {
        std::unique_lock guard(bm);
        bal += amt;
    }

    bool try_withdraw(long amt) {
        std::unique_lock guard(bm);
        if (bal < amt) {
            return false;
        }
        bal -= amt;
        return true;
    }
};

When synchronization for on object is provided by mutexes, it is generally better to leave that object non-atomic
- Atomics tell the reader “this object might be accessed at any time”
- Correctness is harder to reason about, induces a feeling of dread
- Sanitizers can’t catch “data races” on atomics (because there are no data races on atomics)

But in this case, we need a separate method to read the balance

long balance() const {
    std::unique_lock guard(bm);
    return bal;
}

Lock granularity

The global mutex is an instance of coarse-grained locking
- One mutex covers many different objects
- Simpler to code (acquire one mutex to protect most state)
- Less space required for synchronization objects
- Less parallelism
If we want more parallelism, implement fine-grained locking
- One mutex covers few objects
- More difficult to code (programmer must know how synchronization objects match state; may need to acquire several locks to protect several objects)
- More space required for synchronization objects
- More parallelism

Fine-grained account locking

class account {
    long bal;
    std::mutex am;

    void deposit(long amt) {
        std::unique_lock guard(am);
        bal += amt;
    }

    bool try_withdraw(long amt) {
        std::unique_lock guard(am);
        if (bal < amt) {
            return false;
        }
        bal -= amt;
        return true;
    }
};