Problem set 6: Flockchain

In this problem set, you’ll implement synchronization for your buffered I/O library.

Get the code

Get our code with

$ git pull; git pull handout main

or, alternatively

$ git pull; git pull https://github.com/cs61/cs61-f24-psets.git main

This will merge our Problem Set 6 code with your previous work. If you have any “conflicts” from prior problem sets, resolve them before continuing further. Run git push to save your work back to your personal repository.

You may also create a new cs61-psets repository for this assignment. Don’t forget to enter your repository URL on the grading server.

Goal

Your IO61 library from Problem Set 4 works great (hopefully) for single-threaded processes—congratulations! Unfortunately, it does not work well for multi-threaded processes. When two threads in the same process try to read from or write to the same io61_file, those threads will cause data races and undefined behavior.

In this problem set, we provide you with a new IO61 library—a single-slot cache—that you will modify to support file range locks (Phases 1–3) and to be safe for use by multiple threads (Phase 4).

Crypto crypto crypto

Famous billionaire Scam Bilkman-Fleed runs a high-frequency crypto trading exchange. Our handout code contains accounts.fdb, an example of the kind of account database used in his exchange:

SBF     1002034
Alameda  291573
FTX       52180
Aaliyah    3193
Aaron       650
Abel       3351
Abigail    2510
Abraham    5635
Ace        8555
...

Each line contains the account owner’s name, one or more spaces, and then the balance, and must be exactly 16 characters long (including the terminating newline).

Scam’s trading platform runs in a single multithreaded process. (See ftxunlocked.cc.) To make a trade, a worker thread:

  1. Picks two accounts to trade using a proprietary Liquidity Engine (i.e., a random number generator).
  2. Reads the accounts’ current balances using positioned I/O calls in the IO61 library.
  3. Transfers money from one account to the other.
  4. Writes the accounts’ new balances back to the underlying file, again using positioned I/O.

Then it repeats. All of these trades just move money between accounts, so the total amount of money in the exchange should remain constant.
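
As a rough illustration (this is not the handout code), steps 2 and 4 might look like the sketch below. The io61_pread and io61_pwrite prototypes in the comments, the format string, and the io61.hh header name are assumptions based on the sample records above; check io61.hh and ftxunlocked.cc for the real interfaces and the real transfer loop.

#include <cassert>
#include <cstdio>
#include <cstdlib>
#include "io61.hh"

// Sketch only: reading and rewriting one 16-byte account record with
// positioned I/O. Assumed prototypes (check io61.hh for the real ones):
//   ssize_t io61_pread(io61_file* f, unsigned char* buf, size_t sz, off_t off);
//   ssize_t io61_pwrite(io61_file* f, const unsigned char* buf, size_t sz, off_t off);

static constexpr size_t account_size = 16;     // one record = 16 bytes, newline included

// Read the record at offset `off` into `rec` (NUL-terminated) and return its balance.
long read_balance(io61_file* f, off_t off, char rec[account_size + 1]) {
    ssize_t n = io61_pread(f, (unsigned char*) rec, account_size, off);
    assert(n == (ssize_t) account_size);
    rec[account_size] = '\0';
    return strtol(rec + 7, nullptr, 10);       // balance is right-aligned after the name
}

// Write a record for `name` with a new balance back to offset `off`.
void write_balance(io61_file* f, off_t off, const char* name, long balance) {
    char rec[account_size + 1];
    int n = snprintf(rec, sizeof(rec), "%-7s%8ld\n", name, balance);
    assert(n == (int) account_size);
    ssize_t nw = io61_pwrite(f, (const unsigned char*) rec, account_size, off);
    assert(nw == (ssize_t) account_size);
}

A transfer reads both balances, subtracts the trade amount from one and adds it to the other, then writes both records back. Nothing in this sketch synchronizes those steps, which is exactly why concurrent workers can corrupt the totals.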

Unfortunately, ftxunlocked.cc is unsynchronized and doesn’t work. You can see this by running ./ftxunlocked; ./diff-ftxdb.pl (shown below). The ./diff-ftxdb.pl script checks that two account files (by default accounts.fdb, the original database, and /tmp/newaccounts.fdb, the modified database) have the same accounts and the same total assets. ./ftxunlocked violates this assumption:

$ make TSAN=0
$ ./ftxunlocked; ./diff-ftxdb.pl
/tmp/newaccounts.fdb: Incorrect exchange total 4017050
accounts.fdb: Expected 4000000

Note that we disabled the thread sanitizer to show this result. By default, make has the thread sanitizer enabled. If you make and run the above programs, you will see that the thread sanitizer detects this data race, which is pretty helpful for debugging!

Your work

All your code for this pset will go in io61.cc, though you should look at some of the other files to see what’s going on—especially ftxdb.hh and ftxxfer.cc.

In Phases 1–3, you will change the struct io61_file definition and the io61_lock, io61_try_lock, and io61_unlock functions. In Phase 4, you will change many of the IO61 functions.

We provide make check support for checking your work after you have completed all phases. The phase descriptions will instruct you to check your work by running certain commands directly at a command shell.

By default, make compiles with the thread sanitizer and undefined behavior sanitizer enabled. These are incredibly helpful for debugging, but compiling with sanitizers does slow down your program. As you work through each phase of the problem set, we recommend that you first compile with sanitizers enabled (make). Once your code is correct (correct output, no data races, and no undefined behavior), you can disable the sanitizers (make SAN=0) and run again for a better sense of your program's runtime.

Docker can slow down your program, too. You might try running your code locally, or on the grading server (ideally at a time when few other students are also submitting).

You should commit your code after each phase to save your work!

Phase 1: Coarse-grained file range locks

ftxxfer.cc attempts to transfer money safely, even in multithreaded situations, by locking accounts before making transfers. These locks are implemented as file range locks, which are like mutexes for a specific range of offsets in a file. Before performing a transfer, ftxxfer.cc locks the file ranges corresponding to the two relevant accounts; this should prevent concurrent access to the accounts involved in the transfer. Unfortunately, the library calls that implement file range locks are currently unimplemented! In this phase you will implement them.

IO61 file range locks use these function calls:

int io61_try_lock(io61_file* f, off_t off, off_t len, int locktype)

Try to obtain a range lock for offsets [off, off + len) of file f. Return 0 if the range lock was obtained, and -1 if it could not be obtained. This function does not block; if a range lock is already held overlapping that range, it returns -1 right away. In the pset, the locktype argument is always LOCK_EX to signify an exclusive lock.

int io61_lock(io61_file* f, off_t off, off_t len, int locktype)

Like io61_try_lock, but retries the range lock until it is obtained. Always returns 0.

int io61_unlock(io61_file* f, off_t off, off_t len)

Unlock the current thread’s range lock for offsets [off, off + len) of file f and return 0. It is undefined behavior if this thread doesn’t hold a lock on that specific range.

In Phase 1, implement these functions using a single std::recursive_mutex per file. A std::recursive_mutex is required because ftxxfer.cc locks both accounts involved in a transfer from the same thread; with one per-file lock, that thread acquires the same mutex twice, which would deadlock on a non-recursive std::mutex.

You will also need to fix one data race in io61_pwrite, on the line that sets f->dirty = true; you can use the same std::recursive_mutex for this, or, even simpler, give the dirty member an atomic type.
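
A minimal Phase 1 sketch might look like the following. The member names are illustrative, and the off and locktype parameters are deliberately ignored because one per-file lock covers every range.

#include <atomic>
#include <mutex>
#include <sys/types.h>

struct io61_file {
    // ... your existing cache members ...
    std::recursive_mutex rangelock;    // illustrative name: one lock for all ranges
    std::atomic<bool> dirty{false};    // existing member, now atomic (fixes the io61_pwrite race)
};

int io61_try_lock(io61_file* f, off_t off, off_t len, int locktype) {
    (void) off, (void) locktype;
    if (len == 0) {
        return 0;                      // empty range: succeeds without changes
    }
    return f->rangelock.try_lock() ? 0 : -1;
}

int io61_lock(io61_file* f, off_t off, off_t len, int locktype) {
    (void) off, (void) locktype;
    if (len != 0) {
        f->rangelock.lock();           // blocks until the per-file lock is free
    }
    return 0;
}

int io61_unlock(io61_file* f, off_t off, off_t len) {
    (void) off;
    if (len != 0) {
        f->rangelock.unlock();
    }
    return 0;
}

Because the mutex is recursive, the same worker thread can hold range locks on both accounts of a transfer at once, and each io61_unlock releases one of its acquisitions.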

This should make ./ftxxfer; ./diff-ftxdb.pl report no errors—though it may take a long time, especially in Docker. You will improve the runtime in the next few phases. Test your work with sanitizers (make) to check for data races. When you have correct results and no data races, move on to the next phase.

$ make clean
$ make
$ ./ftxxfer && ./diff-ftxdb.pl
4 threads, 400000 operations, 0.681209s CPU time, 2.484973s real time
/tmp/newaccounts.fdb OK
$ ../cs61-run-docker
cs61-user@aea282d44262:~/cs61-psets/pset6$ make clean
cs61-user@aea282d44262:~/cs61-psets/pset6$ make && ./ftxxfer && ./diff-ftxdb.pl
...
4 threads, 400000 operations, 6.199521s CPU time, 44.704464s real time
/tmp/newaccounts.fdb OK

Notes on file range locks

File range locks are advisory, as opposed to mandatory, with respect to the underlying file data. This means that holding a lock on a range of offsets does not prevent other threads from reading or writing that range! For instance, a rogue thread can io61_write a locked range without acquiring the lock. File range locks provide correct synchronization only when all threads use them. This resembles the way that std::mutex works. Locking a mutex does not prevent other threads from accessing the protected data—synchronization is correct only when all threads use the mutex correctly.

The off and len arguments must be ≥0. If len == 0, the requested range is empty, and any attempt to lock or unlock the range succeeds (returns 0) without making any changes.

Actual operating systems support process-based file locks and file range locks with a similar API. That is, at most one process can acquire a lock on a particular range. On Unix variants, flock manages file locks and fcntl’s F_SETLK/F_SETLKW command manages file range locks; on Windows, LockFile manages file and file range locks. Unix’s file locks are purely advisory (normal read and write system calls ignore locks). In Windows, file locks are semi-advisory: they prevent normal reads and writes, but not mmap.
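
For comparison, here is roughly what the Unix analogue looks like (a sketch using fcntl's record locks; note that these synchronize processes, not threads within one process, so they are not a substitute for your IO61 range locks):

#include <fcntl.h>
#include <sys/types.h>

// Acquire an exclusive, blocking lock on offsets [off, off + len) of `fd`.
// Returns 0 on success, -1 on error.
int posix_lock_range(int fd, off_t off, off_t len) {
    struct flock fl = {};
    fl.l_type = F_WRLCK;        // exclusive lock (F_RDLCK would be shared)
    fl.l_whence = SEEK_SET;     // l_start is an absolute offset
    fl.l_start = off;
    fl.l_len = len;
    return fcntl(fd, F_SETLKW, &fl);    // F_SETLK fails immediately instead of blocking
}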

Phase 2: Fine-grained file range locks

Your coarse-grained file range lock implementation provides correctness, but it reduces performance by restricting parallelism: only one transfer can ever proceed at a time. Since accounts are independent, transactions that access disjoint sets of accounts should be allowed to execute in parallel: one thread transferring funds between Alice and Adonis should be allowed to execute in parallel with another thread transferring funds between Hendrix and Malakai.

In Phase 2, replace your coarse-grained file range lock implementation with finer-grained locks. This should speed up the exchange; for example:

$ ./ftxxfer  # coarse-grained lock
4 threads, 400000 operations, 0.681209s CPU time, 2.484973s real time
$ ./ftxxfer  # finer-grained locks
4 threads, 400000 operations, 0.519218s CPU time, 0.753223s real time

(In Docker, all these times will be much higher.) Remember that make compiles with sanitizers, so make SAN=0 will yield better performance.

Improvements are visible for ftxxfer, which trades among randomly-selected accounts. However, fine-grained synchronization becomes even more important when some accounts or objects are busier than others. The ftxrocket.cc program trades more often with certain important accounts. [Run ./ftxrocket and look at /tmp/newaccounts.fdb to see how big crypto exchanges work in reality.] With ./ftxrocket, fine-grained synchronization matters even more:

$ ./ftxrocket  # coarse-grained lock
4 threads, 400000 operations, 0.674707s CPU time, 2.465647s real time
$ ./ftxrocket  # finer-grained locks
4 threads, 400000 operations, 0.538521s CPU time, 1.288776s real time
$ ./ftxrocket  # finest-grained locks
4 threads, 400000 operations, 0.629610s CPU time, 0.799237s real time

When your fine-grained locking implementation is correct for ./ftxxfer and ./ftxrocket, move on to the next phase. You won't pass all of the tests in make check until you've completed all phases.

Implementation notes

There are several ways to implement finer-grained locking. For example, you can divide the file into multiple regions (say, 128 regions per file) and lock just the regions that overlap the lock request. Or you could add an auxiliary data structure to track currently-locked ranges. You could use a separate mutex per region, or instead a single mutex in io61_file plus one or more condition variables.

As you implement your design, be aware that mutexes and condition variables are special objects and cannot be used in some ways. Your io61_file may contain an array of mutexes (as std::mutex ms[1024];), or a dynamically-allocated array of mutexes (as std::mutex* ms; ... f->ms = new std::mutex[nmutexes];), or a list or map of mutexes or data structures containing mutexes (as std::list<std::mutex> mlist;). It may not, however, have a std::vector of mutexes or of any structures containing one or more mutexes or condition variables. This is because mutexes and condition variables cannot be copied or moved to different addresses once they are created, and vectors inherently copy their contents when they grow.

The ftx test programs always obtain locks on 16-byte-aligned units of 16 bytes. Your locking design may work fastest for such lock requests, but it must provide correct mutual exclusion for any lock requests—including overlapping lock requests. [For example, if a thread has an exclusive range lock on offsets [12,27) in file f, then io61_try_lock(f, 25, 40, LOCK_EX) must return -1 because the requested range [25,65) overlaps with the locked range [12,27).]
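
If it helps to see the shape of one option, here is roughly what per-region locks could look like. This is only a sketch with illustrative names and an arbitrary region granularity; the region locks are recursive so that one thread's two disjoint range locks may safely share a region.

#include <algorithm>
#include <mutex>
#include <vector>

static constexpr int NREGIONS = 128;
static constexpr off_t REGION_SIZE = 16;     // granularity is a tuning choice

struct io61_file {
    // ... existing members ...
    std::recursive_mutex regions[NREGIONS];  // an array of mutexes is allowed; a vector is not
};

// Return the sorted, distinct region indices that [off, off + len) touches.
static std::vector<int> overlapping_regions(off_t off, off_t len) {
    std::vector<int> rs;
    off_t first = off / REGION_SIZE, last = (off + len - 1) / REGION_SIZE;
    if (last - first + 1 >= NREGIONS) {
        for (int i = 0; i < NREGIONS; ++i) rs.push_back(i);
        return rs;
    }
    for (off_t c = first; c <= last; ++c) rs.push_back(int(c % NREGIONS));
    std::sort(rs.begin(), rs.end());
    rs.erase(std::unique(rs.begin(), rs.end()), rs.end());
    return rs;
}

int io61_lock(io61_file* f, off_t off, off_t len, int locktype) {
    (void) locktype;
    if (len == 0) return 0;
    for (int r : overlapping_regions(off, len)) {
        f->regions[r].lock();    // fixed (increasing) acquisition order within a call
    }
    return 0;
}

int io61_unlock(io61_file* f, off_t off, off_t len) {
    if (len == 0) return 0;
    for (int r : overlapping_regions(off, len)) {
        f->regions[r].unlock();  // releases exactly what the matching io61_lock acquired
    }
    return 0;
}

io61_try_lock would try_lock the same sorted set of regions, and on the first failure release whatever it had already acquired and return -1.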

If you’re confused about how to get started, check out the hints on fine-grained locking—but try to design your own solution first!

Phase 3: Blocking

Your solution should block, not poll, if a lock cannot be acquired. You may have already implemented blocking if you took ideas from the hints on fine-grained locking.

Look at CPU time on ./ftxrocket -J2 to check your work. If CPU time is far below real time, you’ve got a blocking solution.

$ ./ftxrocket -J2  # coarse-grained lock, polling
4 threads, 400000 operations, 6.487076s CPU time, 2.028060s real time
$ ./ftxrocket -J2  # coarse-grained lock, blocking
4 threads, 400000 operations, 0.684149s CPU time, 2.450396s real time

$ ./ftxrocket -J2  # finest-grained locks, polling
4 threads, 400000 operations, 1.720757s CPU time, 1.819443s real time
$ ./ftxrocket -J2  # finest-grained locks, blocking
4 threads, 400000 operations, 1.066815s CPU time, 1.828140s real time

(In Docker, all these times will be much higher.) Our finest-grained solution used a std::condition_variable_any to implement blocking.
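
For example, a blocking, range-granularity design can keep a list of currently-held ranges under one mutex and wait on a condition variable whenever a requested range overlaps a held one. This is a sketch with illustrative names; a plain std::condition_variable works here because the list is guarded by a std::mutex.

#include <condition_variable>
#include <list>
#include <mutex>
#include <sys/types.h>

struct lockrange {
    off_t off;
    off_t len;
};

struct io61_file {
    // ... existing members ...
    std::mutex rangemutex;              // protects `held`
    std::condition_variable rangecv;    // signaled whenever a range lock is released
    std::list<lockrange> held;          // currently-held range locks
};

static bool any_overlap(io61_file* f, off_t off, off_t len) {
    for (const lockrange& r : f->held) {
        if (r.off < off + len && off < r.off + r.len) {
            return true;
        }
    }
    return false;
}

int io61_try_lock(io61_file* f, off_t off, off_t len, int locktype) {
    (void) locktype;
    if (len == 0) return 0;
    std::unique_lock guard(f->rangemutex);
    if (any_overlap(f, off, len)) return -1;    // don't wait: fail immediately
    f->held.push_back({off, len});
    return 0;
}

int io61_lock(io61_file* f, off_t off, off_t len, int locktype) {
    (void) locktype;
    if (len == 0) return 0;
    std::unique_lock guard(f->rangemutex);
    while (any_overlap(f, off, len)) {
        f->rangecv.wait(guard);                 // block (no CPU) until a range is released
    }
    f->held.push_back({off, len});
    return 0;
}

int io61_unlock(io61_file* f, off_t off, off_t len) {
    if (len == 0) return 0;
    std::unique_lock guard(f->rangemutex);
    for (auto it = f->held.begin(); it != f->held.end(); ++it) {
        if (it->off == off && it->len == len) {
            f->held.erase(it);
            break;
        }
    }
    f->rangecv.notify_all();                    // wake waiters so they can recheck
    return 0;
}

Because waiting threads sleep in rangecv.wait rather than spinning, CPU time stays close to the useful work done, as in the blocking numbers above.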

When your fine-grained locking implementation uses blocking and is correct for ./ftxrocket -J2, move on to the next phase.

Phase 4: File operation locks

You’ve now fixed the operation of the exchange—nice job! Unfortunately, your IO61 library is still not thread-safe for normal use, as opposed to the exchange’s special pattern of positioned I/O on a small file.

You can see this by running ./ftxblockchain. This version of the exchange modifies the accounts database, but as it runs, it also creates an immutable public ledger /tmp/ledger.fdb of all transactions as they happen. For instance:

Rose        -87
Camila      +87
Crew        -89
Jasmine     +89
Adrian      -93
Elena       +93
Madelyn    -107
Marley     +107
David       -85
Malia       +85
Journee     -95
Hazel       +95
Eli        -103
Aurora     +103
...

This ledger is implemented with normal io61_write calls:

char report[128];
size_t n = snprintf(report, sizeof(report),
                    "%-7s %+7ld\n%-7s %+7ld\n",
                    name1, -delta, name2, +delta);
assert(n == db.asize * 2 && n < sizeof(report));
ssize_t np = io61_write(ledgerf, report, n);
assert(np == ssize_t(n));

But since io61_write is not properly synchronized, running ./ftxblockchain with multiple threads will cause data races and output garbage to /tmp/ledger.fdb, such as:

...
Zoe        2623
Zoey       9946
Zuri       6531
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^A^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@p^K^^@<EC><DF>^?^@^@<E0>^K^@<EC><DF>^?^@^@<F0>^K^@<EC><DF>^?^@^@ ^@^@^@^@^@^@^@^^^@^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@^@^@^@^@^D^@^@^@^H^@^@^@^@^@^@^@^B^@^@^@^@<B0>^B<86><D0>U^@^@<F0><AF>^B<86><D0>U^@^@^@^@^@^@^@^@^@^@A^@^@^@^@^@^@^@<F0>:؄<D0>U^@^@^B^@^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@1^@^@^@^@^@^@^@0<8F>^B<86><D0>U^@^@^@^B^@^@^@^@^@^@^P^@^@^@^@^@^@^@^H^@^@^@^@^@^@^@^G^@^@^@^@^@^@^@<C1> ^@^@^@^@^@^@^D^@^@^@^A^@^@^@^AMax         -94
Gemma       +94
Israel      -97
Jude        +97
...

In Phase 4, you should add synchronization to all IO61 methods that read and write data, so that ./ftxblockchain runs safely and without data races.

We recommend coarse-grained synchronization: all methods that read or write data should acquire exclusive access to the whole io61_file structure. Since the normal reading and writing methods (io61_read, io61_write, io61_flush, io61_seek) all read and write the position pos_tag, there is no point in implementing fine-grained synchronization. The positioned methods io61_pread and io61_pwrite do not access the shared pos_tag; nevertheless, in our experience, implementing fine-grained synchronization for io61_pread and io61_pwrite is not worthwhile for the benchmarks in this problem set.

Note that the io61_fdopen and io61_close functions do not require synchronization.
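
The coarse-grained pattern is mechanical: add one mutex to io61_file and hold it for the duration of each data-touching call. A sketch, assuming a hypothetical io61_read_locked helper holding your existing Problem Set 4 read logic, and the Problem Set 4 prototype for io61_read (check io61.hh for the real one):

#include <mutex>

struct io61_file {
    // ... existing members, plus range-lock state from Phases 1–3 ...
    std::mutex opmutex;    // illustrative name: serializes all reads and writes on this file
};

// Hypothetical helper: your existing single-threaded read logic, unchanged.
static ssize_t io61_read_locked(io61_file* f, unsigned char* buf, size_t sz);

ssize_t io61_read(io61_file* f, unsigned char* buf, size_t sz) {
    std::unique_lock guard(f->opmutex);    // released automatically on every return path
    return io61_read_locked(f, buf, sz);
}

io61_write, io61_flush, io61_seek, io61_pread, and io61_pwrite get the same treatment.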

When you have completed this phase, ./ftxblockchain should complete without error, even under sanitizers, and ./diff-ftxdb.pl -l should report no errors.

$ ./ftxblockchain && ./diff-ftxdb.pl -l
4 threads, 400000 operations, 19.116283s CPU time, 16.751274s real time
/tmp/newaccounts.fdb and /tmp/ledger.fdb OK

Furthermore, ./ftxxfer should be able to handle account databases that don’t fit entirely in the io61_file cache—though it may be slow.

$ ./ftxxfer bigaccounts.fdb && ./diff-ftxdb.pl bigaccounts.fdb
4 threads, 400000 operations, 5.065556s CPU time, 10.230461s real time
/tmp/newaccounts.fdb OK

Run make check to check your work.

Troubleshooting. If, after adding synchronization, the test programs appear to hang forever, it’s likely you’ve got a deadlock!

To debug a deadlock, run the program under gdb. When you think the deadlock has happened, press Ctrl-C to interrupt the program. Then the gdb command info threads will report how many threads there are. Use thread 1; bt to look at thread 1’s backtrace, thread 2; bt to look at thread 2’s backtrace, and so forth. The deadlock can be found by looking at the blocked threads’ backtraces, which show which functions are blocked on which mutexes.

The io61_flush function is a common source of deadlock, because it can be called internally, from other IO61 functions, as well as externally, from user code. You may want to introduce an internal function, with a name like __io61_flush or locked_io61_flush, that performs a flush under the assumption that its caller has already locked the io61_file.
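
For instance, the split might look like this sketch (names illustrative, continuing the opmutex member assumed above):

// Performs the actual flush; the caller must already hold f->opmutex.
static int io61_flush_locked(io61_file* f) {
    // ... your existing flush logic: write out the dirty cache slot, etc. ...
    return 0;
}

// Public entry point: takes the lock, then delegates.
int io61_flush(io61_file* f) {
    std::unique_lock guard(f->opmutex);
    return io61_flush_locked(f);
}

// Internal callers that already hold the lock (say, io61_write when the cache
// slot must be recycled) call io61_flush_locked directly—calling io61_flush
// from there would try to lock opmutex twice and deadlock.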

Positioned I/O

Normal file descriptor system calls like read and write use and modify the kernel-maintained file position. Other system calls, however, act on specific offsets and ignore the file position. These positioned I/O system calls are especially useful when accessing a complex shared file, like a database, since one system call can do the work of two—and since different threads (or processes) can access different parts of the file independently, without getting in one another’s way.

The positioned I/O system calls are:

ssize_t pread(int fd, void* buf, size_t sz, off_t off)

Read up to sz bytes from file descriptor fd into buffer buf, starting at file offset off. Does not affect the file position. Returns -1 on error and 0 on end of file; otherwise returns the number of bytes read, which (as with read) might be less than sz.

ssize_t pwrite(int fd, const void* buf, size_t sz, off_t off)

Write up to sz bytes to file descriptor fd from buffer buf, starting at file offset off. Does not affect the file position. Returns -1 on error, or the number of bytes written.
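
For example, here is a minimal sketch that reads one 16-byte account record with pread, leaving the shared file position untouched for other threads:

#include <sys/types.h>
#include <unistd.h>

// Read account number `i`'s 16-byte record from `fd` into `buf` (17 bytes,
// NUL-terminated). Returns true on success.
bool read_record(int fd, size_t i, char buf[17]) {
    ssize_t n = pread(fd, buf, 16, (off_t) (i * 16));
    if (n != 16) {
        return false;        // error, short read, or end of file
    }
    buf[16] = '\0';
    return true;
}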

Extra credit

Implement shared locks (LOCK_SH) in your IO61 library, and write a test demonstrating that they work correctly.

Turnin

You will turn in your code by pushing your git repository to github.com and updating the grading server with your repository.

Don’t forget to fill out README.md and AUTHORS.md.


This pset was originally created for CS61.