Problem set 6

In this problem set, you’ll implement synchronization for your buffered I/O library.

Due Saturday 12/10 at 11:59pm for college students.
No submissions beyond Tuesday 12/13 at 11:59pm will be accepted. This is the DCE deadline.

Get the code

Get our code with

$ git pull; git pull handout main

or, alternately

$ git pull; git pull https://github.com/cs61/cs61-f22-psets.git main

This will merge our Problem Set 6 code with your previous work. If you have any “conflicts” from prior problem sets, resolve them before continuing further. Run git push to save your work back to your personal repository.

You may also create a new cs61-psets repository for this assignment. Don’t forget to enter your repository URL on the grading server.

Goal

Your IO61 library from Problem Set 4 works great (hopefully) for single-threaded processes, congratulations! Unfortunately, it does not work great for multi-threaded processes.

The normal stdio library is thread-safe: threads can access the same file simultaneously without causing undefined behavior or other weird errors. But if two threads in the same process try to read from or write to the same io61_file, those threads will cause data races and undefined behavior.

In this problem set, we provide you with a new IO61 library—a single-slot cache—that you will modify to:

Add support for file range locks, or “flocks” for short. A flock allows a thread to lock a specific range of bytes within a file.
Implement thread safety for all operations.

Crypto crypto crypto

Famous billionaire Scam Bilkman-Fleed runs a high-frequency crypto trading exchange. Our handout code contains accounts.fdb, an example of the kind of account database used in his exchange:

SBF     1002034
Alameda  291573
FTX       52180
Aaliyah    3193
Aaron       650
Abel       3351
Abigail    2510
Abraham    5635
Ace        8555
...

Each line contains the account owner’s name, one or more spaces, and then the balance, and must be exactly 16 characters long (including the terminating newline).

Scam’s trading platform runs in a single multithreaded process. (See ftxunlocked.cc.) To make a trade, a worker thread:

Picks two accounts to trade using a proprietary Liquidity Engine (i.e., a random number generator).
Reads the accounts’ current balances using positioned I/O calls in the IO61 library.
Transfers money from one account to the other.
Writes the accounts’ new balances back to the underlying file, again using positioned I/O.

Then it repeats. All of these trades just move money between accounts, so the total amount of money in the exchange should remain constant.

Unfortunately, ftxunlocked.cc is unsynchronized and doesn’t work. You can see this by running ./ftxunlocked; ./diff-ftxdb.pl. The ./diff-ftxdb.pl script checks that two account files (by default accounts.fdb, the original database, and /tmp/newaccounts.fdb, the modified database) have the same accounts and the same total assets. ./ftxunlocked violates this assumption:

$ ./diff-ftxdb.pl
[0;31m/tmp/newaccounts.fdb:[01;31m Incorrect exchange total 4017050[m
[0;31maccounts.fdb: Expected 4000000[m

Your work

All your code for this pset will go in io61.cc, though you should look at some of the other files to see what’s going on—especially ftxdb.hh and ftxxfer.cc.

In Phases 1–2, you will change the struct io61_file definition and the io61_lock, io61_try_lock, and io61_unlock functions. In Phase 3, you will change many of the IO61 functions.

Phase 1: Coarse-grained file range locks

ftxxfer.cc attempts to solve this problem by locking accounts before making transfers. These locks are implemented as file range locks, which are like mutexes for a specific range of offsets in a file. File range locks use these function calls:

int io61_try_lock(io61_file* f, off_t start, off_t len, int locktype): Try to obtain a range lock for offsets [start, start + len) of file f. Returns 0 if the range lock was obtained, and -1 if it could not be obtained. This function does not block; if a range lock is already held overlapping that range, it returns -1 right away. The locktype argument should be LOCK_EX to obtain an exclusive lock.
int io61_lock(io61_file* f, off_t start, off_t len, int locktype): Like io61_try_lock, but blocks until the range lock is obtained and then returns 0.
int io61_unlock(io61_file* f, off_t start, off_t len): Unlock the current thread’s range lock for offsets [start, start + len) of file f and returns 0. It is undefined behavior if this thread doesn’t hold a lock on that specific range.

But these functions aren’t implemented—they don’t actually lock anything!

In Phase 1, implement these functions using a single std::recursive_mutex per file. This should make ./ftxxfer; ./diff-ftxdb.pl report no errors—though it may take a long time, especially in Docker.

$ ./ftxxfer && ./diff-ftxdb.pl
4 threads, 400000 operations, 0.681209s CPU time, 2.484973s real time
[01;32m/tmp/newaccounts.fdb OK[m
$ ../cs61-run-docker
cs61-user@aea282d44262:~/cs61-psets/pset6$ make clean
cs61-user@aea282d44262:~/cs61-psets/pset6$ make && ./ftxxfer && ./diff-ftxdb.pl
...
4 threads, 400000 operations, 6.199521s CPU time, 44.704464s real time
[01;32m/tmp/newaccounts.fdb OK[m

A std::recursive_mutex is required because ftxxfer.cc locks both accounts involved in the transfer.

Test your work with sanitizers (make SAN=1) to check for data races. You will also need to fix one data race in io61_pwrite, on the line that sets f->dirty = true; you can use the same std::recursive_mutex for this, or, even simpler, give the dirty member an atomic type.

Notes on file range locks

File range locks are advisory, as opposed to mandatory, with respect to the underlying file data. This means that holding a lock on a range of offsets does not prevent other threads from reading or writing that range! For instance, a rogue thread can io61_write a locked range without acquiring the lock. File range locks provide correct synchronization only when all threads use them. This resembles the way that std::mutex works. Locking a mutex does not prevent other threads from accessing the protected data—synchronization is correct only when all threads use the mutex correctly.

The start and len arguments must be ≥0. If len == 0, the requested range is empty, and any attempt to lock or unlock the range succeeds (returns 0) without making any changes.

Actual operating systems support process-based file locks and file range locks with a similar API. That is, at most one process can acquire a lock on a particular range. On Unix variants, flock manages file locks and fcntl’s F_SETLK/F_SETLKW command manages file range locks; on Windows, LockFile manages file and file range locks. Unix’s file locks are purely advisory (normal read and write system calls ignore locks). In Windows, file locks are semi-advisory: they prevent normal reads and writes, but not mmap.

Phase 2a: Fine-grained file range locks

Your coarse-grained file range lock implementation provides correctness, but not performance. In logical terms, it’s fine for a thread to transfer funds between Alice and Adonis in parallel with another thread transferring funds between Hendrix and Malakai: the relevant accounts are disjoint.

In Phase 2, replace your coarse-grained file range lock implementation with finer-grained locks. This should speed up the exchange; for example:

$ ./ftxxfer  # coarse-grained lock
4 threads, 400000 operations, 0.681209s CPU time, 2.484973s real time
$ ./ftxxfer  # finer-grained locks
4 threads, 400000 operations, 0.519218s CPU time, 0.753223s real time

Improvements are visible for ftxxfer, which trades among randomly-selected accounts. Fine-grained synchronization becomes even more important, though, when some accounts or objects are busier than others. The ftxrocket.cc program trades more often with certain important accounts. (Run ./ftxrocket and look at /tmp/newaccounts.fdb to see how big crypto exchanges work in reality.) With ./ftxrocket, fine grained synchronization matters even more:

$ ./ftxrocket  # coarse-grained lock
4 threads, 400000 operations, 0.674707s CPU time, 2.465647s real time
$ ./ftxrocket  # finer-grained locks
4 threads, 400000 operations, 0.538521s CPU time, 1.288776s real time
$ ./ftxrocket  # finest-grained locks
4 threads, 400000 operations, 0.629610s CPU time, 0.799237s real time

Implementation notes

There are several ways to implement finer-grained locking. For example, you can divide the file into multiple regions (say, 128 regions per file), and lock just the regions that overlap the lock request. Or you could add an auxiliary data structure to track currently-locked ranges. You could have one mutex per region as different mutexes, or instead use a single mutex in io61_file plus one or more condition variables.

As you implement your design, be aware that mutexes and condition variables are special objects and cannot be used in some ways. Your io61_file may contain an array of mutexes (as std::mutex ms[1024];), or a dynamically-allocated array of mutexes (as std::mutex* ms; ... f->ms = new std::mutex[nmutexes];), or a list or map of mutexes or data structures containing mutexes (as std::list<std::mutex> mlist;). It may not, however, have a std::vector of mutexes or of any structures containing one or more mutexes or condition variables. This is because mutexes and condition variables cannot be copied or moved to different addresses once they are created, and vectors inherently copy their contents when they grow.

The ftx test programs always obtain locks on 16-byte-aligned units of 16 bytes. Your locking design may work fastest for such lock requests, but it must provide correct mutual exclusion for any lock requests—including overlapping lock requests. (That is, if a thread has an exclusive range lock on offsets [12,27) in file f, then io61_try_lock(f, 25, 40) must return -1 because the requested range [25,85) overlaps with the locked range [12,27).)

If you’re confused about how to get started, check out the hints on fine-grained locking—but try to design your own solution first!

Phase 2b: Blocking

Your solution should block, not poll, if a lock cannot be acquired. Look at CPU time on ./ftxrocket -J2 to check. If CPU time is far below real time, you’ve got a blocking solution.

$ ./ftxrocket -J2  # coarse-grained lock, polling
4 threads, 400000 operations, 6.487076s CPU time, 2.028060s real time
$ ./ftxrocket -J2  # coarse-grained lock, blocking
4 threads, 400000 operations, 0.684149s CPU time, 2.450396s real time

$ ./ftxrocket -J2  # finest-grained locks, polling
4 threads, 400000 operations, 1.720757s CPU time, 1.819443s real time
$ ./ftxrocket -J2  # finest-grained locks, blocking
4 threads, 400000 operations, 1.066815s CPU time, 1.828140s real time

(In Docker, all these times will be much higher.) Our finest-grained solution used a std::condition_variable_any to implement blocking.

Phase 3: File operation locks

You’ve now fixed the operation of the exchange—congratulations! Unfortunately, your IO61 library is still not thread-safe for normal use (rather than special use with small-file positioned I/O).

You can see this by running ./ftxblockchain. This version of the exchange modifies the accounts database, but as it runs, it also creates an immutable public ledger /tmp/ledger.fdb of all transactions as they happen. For instance:

Rose        -87
Camila      +87
Crew        -89
Jasmine     +89
Adrian      -93
Elena       +93
Madelyn    -107
Marley     +107
David       -85
Malia       +85
Journee     -95
Hazel       +95
Eli        -103
Aurora     +103
...

This ledger is implemented with normal io61_write system calls:

pset6/ftxblockchain.cc

char report[128];
size_t n = snprintf(report, sizeof(report),
                    "%-7s %+7ld\n%-7s %+7ld\n",
                    name1, -delta, name2, +delta);
assert(n == db.asize * 2 && n < sizeof(report));
ssize_t np = io61_write(ledgerf, report, n);
assert(np == ssize_t(n));

But since io61_write is not properly synchronized, running ./ftxblockchain with multiple threads will cause data races and output garbage to /tmp/ledger.fdb, such as:

...
Zoe        2623
Zoey       9946
Zuri       6531
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^A^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@p^K^^@<EC><DF>^?^@^@<E0>^K^@<EC><DF>^?^@^@<F0>^K^@<EC><DF>^?^@^@ ^@^@^@^@^@^@^@^^^@^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@^@^@^@^@^D^@^@^@^H^@^@^@^@^@^@^@^B^@^@^@^@<B0>^B<86><D0>U^@^@<F0><AF>^B<86><D0>U^@^@^@^@^@^@^@^@^@^@A^@^@^@^@^@^@^@<F0>:؄<D0>U^@^@^B^@^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@1^@^@^@^@^@^@^@0<8F>^B<86><D0>U^@^@^@^B^@^@^@^@^@^@^P^@^@^@^@^@^@^@^H^@^@^@^@^@^@^@^G^@^@^@^@^@^@^@<C1> ^@^@^@^@^@^@^D^@^@^@^A^@^@^@^AMax         -94
Gemma       +94
Israel      -97
Jude        +97
...

In this phase, you should add synchronization to all IO61 methods that read and write data, so that ./ftxblockchain runs safely and without data races.

We recommend coarse-grained synchronization for this phase, where methods that read and write data acquire exclusive access to the whole io61_file structure. The normal reading and writing methods (io61_read, io61_write, io61_flush, io61_seek) modify the cache contents and/or the position pos_tag, thus requiring mutual exclusion. It is possible to design a finer-grained locking plan for the positioned methods (io61_pread, io61_pwrite), but in our experience it is not worthwhile on these benchmarks. (It could be worthwhile on other benchmarks.)

Note that the io61_fdopen and io61_close functions do not require synchronization.

When you have completed this phase, ./ftxblockchain should complete without error, even under sanitizers, and ./diff-ftxdb.pl -l should report no errors.

$ ./ftxblockchain && ./diff-ftxdb.pl -l
4 threads, 400000 operations, 19.116283s CPU time, 16.751274s real time
[01;32m/tmp/newaccounts.fdb and /tmp/ledger.fdb OK[m

Furthermore, ./ftxxfer should be able to handle account databases that don’t fit entirely in the io61_file cache—though it may be slow.

$ ./ftxxfer bigaccounts.fdb && ./diff-ftxdb.pl bigaccounts.fdb
4 threads, 400000 operations, 5.065556s CPU time, 10.230461s real time
[01;32m/tmp/newaccounts.fdb OK[m

Troubleshooting. If, after adding synchronization, the test programs appear to hang forever, it’s likely you’ve got a deadlock!

To debug a deadlock, run the program under gdb. When you think the deadlock has happened, press Ctrl-C to interrupt the program. Then the gdb command info threads will report how many threads there are. Use thread 1; bt to look at thread 1’s backtrace, thread 2; bt to look at thread 2’s backtrace, and so forth. The deadlock can be found by looking at the blocked threads’ backtraces, which show which functions are blocked on which mutexes.

The io61_flush function is a common source of deadlock, because it can be called internally, from other IO61 functions, as well as externally, from user code. You may want to introduce an internal function, with a name like __io61_flush or locked_io61_flush, that performs a flush under the assumption that its caller has already locked the io61_file.

Extra Credit Phase: Your Pset 4

For extra credit, integrate your Pset 4 IO61 library with Pset 6. This will require adding io61_pread, io61_pwrite, and io61_lock and friends to your library!

Positioned I/O

Normal file descriptor system calls like read and write use and modify the kernel-maintained file position. Other system calls, however, act on specific offsets and ignore the file position. These positioned I/O system calls are especially useful when accessing a complex shared file, like a database, since one system call can do the work of two—and since different threads (or processes) can access different parts of the file independently, without getting in one another’s way.

The positioned I/O system calls are:

ssize_t pread(int fd, void* buf, size_t sz, off_t off): Read up to sz bytes from file descriptor fd into buffer buf, starting at file offset off. Does not affect the file position. Returns -1 on error and 0 on end of file; otherwise returns the number of bytes read, which (as with read) might be less than sz.
ssize_t pwrite(int fd, const void* buf, size_t sz, off_t off): Write up to sz bytes to file descriptor fd from buffer buf, starting at file offset off. Does not affect the file position. Returns -1 on error, or the number of bytes written.

Turnin

You will turn in your code by pushing your git repository to github.com and updating the grading server with your repository.

Don’t forget to fill out README.md and AUTHORS.md.

This pset was originally created for CS61.

Problem set 6: Flockchain