Overview
In this last lecture, we build a correctly-synchronized networked database.
Key-value database
- A map that’s connected to the network
 - Operations: 
get,set,deleteget(K): Return the value stored for keyKset(K, V): Change the value stored for keyKtoVdelete(K): Delete any value stored for keyK
 - Example: 
memcached- At Facebook in 2013, “loading one of our popular pages results in an average of 521 distinct items fetched from 
memcache. … We provision hundreds ofmemcachedservers in a cluster” [ref] 
 - At Facebook in 2013, “loading one of our popular pages results in an average of 521 distinct items fetched from 
 
Network API
- A network connection is a streaming connection, like a pipe, between two different computers
 - Because the “pipe” is between unrelated computers, there’s no common ancestor—no “parent process” to inherit the pipe from
 
Clients and servers
- Server
- A program that opens a listening port that can accept new connections
 - Passive
 
 - Client
- A program that opens a new connection to a preexisting listening port on some server
 - Active
 
 
WeensyDB server loop
int fd = open_listen_socket(port);
// `fd` doesn’t receive *data*—it receives *new connections*
while (true) {
    struct sockaddr addr;
    socklen_t addrlen = sizeof(addr);
    // Accept connection on listening socket
    int cfd = accept(fd, &addr, &addrlen);
    // `cfd` is a new connection from the computer at `addr`
    // Handle connection
    handle_connection(cfd, unparse_sockaddr(&addr, addrlen));
    // `handle_connection` closes `cfd` when done
}
WeensyDB internals
- Hash table
 - Hash function: bucket number = first character of key
 
WeensyDB protocol
./weensydb01starts servertelnet localhost 6162(ortelnet cs61.seas.harvard.edu 6162)
kohler@unshare$ telnet cs61 6162
Trying 34.193.34.22...
Connected to cs61.seas.harvard.edu.
Escape character is '^]'.
$get hello
END
$set hello 3          [3 = size of value]
$Hi
STORED #1
$get hello
VALUE hello 3 #1
Hi
END
$delete hello
DELETED #1
$set hello 9
$Hi again
STORED #2
$quit
Connection closed by foreign host.
Synchronization issues
- Parallelism
- Only one client can access the database at a time
 
 - Data races
- Multiple threads can conflict on concurrent memory access
 
 - Lock granularity
- Unnecessarily strict locking reduces parallelism
 
 - Denial of service
- A misbehaving client can block out all other clients indefinitely
 
 - Deadlock
- Circular locking: a client block itself (and all other clients)
 - Lock ordering
 
 - Resource exhaustion
- Too many clients can crash the server
 
 
dbmap API
db.bucket(KEY)- Return bucket number containing 
KEY 
- Return bucket number containing 
 db.bbegin(B),db.bend(B)- Iterators to beginning and end of bucket 
B 
- Iterators to beginning and end of bucket 
 db.bfind(B, KEY)- Return iterator to item with key 
KEY 
- Return iterator to item with key 
 db.binsert(B, KEY)- Insert new item with key 
KEY, return iterator to it 
- Insert new item with key 
 db.berase(B, IT)- Erase item 
*IT 
- Erase item 
 
Summary
- weensydb01
can handle only one connection at a time
- Put each connection in its own thread
 
 - weensydb02
has data races
- Add a coarse-grained mutex protecting the database
 
 - weensydb03
has coarse-grained locking and therefore little parallelism between connections
- Implement fine-grained synchronization: one mutex per hash bucket
 
 - weensydb04
is vulnerable to a (perhaps unintentional) denial-of-service attack: if a connection doesn’t
supply a value in 
set, then that thread will block forever with the corresponding mutex locked- Don’t block with a mutex acquired if possible
 - Read the value before acquiring the mutex
 
 - weensydb05
is vulnerable to a (perhaps unintentional) denial-of-service attack: if a connection doesn’t
supply a value in 
set, then that thread will block forever with the corresponding mutex locked- Don’t block with a mutex acquired if possible
 - Read the value before acquiring the mutex
 - Also applies to writing data back to the user
 
 - weensydb06
introduces the 
exchcommand, which swaps two values: it’s vulnerable to deadlock!- Don’t acquire the same mutex twice
 
 - weensydb07 is perfect…?
 
Principles of synchronization
- Atomic operations
- Modern synchronization rests on atomic operations that read and write shared state in one atomic step
 - Example: Atomic increment, decrement
 
 - Data races
- Concurrent unsynchronized access to the same memory is undefined behavior unless all accesses are reads
 
 - Mutual exclusion
- Avoid data races with mutex synchronization objects
 - A mutex protects access to shared state
 - Guard synchronization objects ensure mutual exclusion throughout a scope
 - Relationship between mutex and state is implicit (marked by code)
 
 - Condition variables
- Threads may need to wait for a condition to become true
 - This can cause polling, which is expensive in CPU time
 - Condition variables allow safe blocking on a condition (no lost wakeups)
 - Code must notify cv when condition changes
 
 - Granularity
- Coarse-grained synchronization object covers large state/broad condition
 - Fine-grained synchronization object covers small state/narrow condition
 
 
Fairness
- Who wins the race to acquire a mutex? Is it fair? Can a thread starve?
 - Ticket lock
 
Atomicity and security
- Time-of-check-to-time-of-use (TOCTTOU) vulnerabilities (link)
 
Research!
Next
- 
CS 161: Operating Systems
 - 
CS 165: Data Systems (fall)
 - 
CS 260r: Topics in Computer Systems (this year: parallel and distributed execution models)
 - 
CS 263: Systems Security (fall 2023)
 - 
CS 265: Big Data Systems (spring 2023)
 - 
CS 153: Compilers (fall)
 - 
CS 141: Computing Hardware
 - 
CS 148/248: Design of VLSI Circuits and Systems
 - 
CS 145/245: Cloud Networking and Computing (2022/2023)
 - 
CS 96: System Design Projects: Machine Learning for Social Impact
 - 
CS 124: Data Structures and Algorithms
 - 
CS 143: Networks
 - 
CS 175: Computer Graphics
 - 
CS 181: Machine Learning
 - 
CS 191: Classics of Computer Science
 - 
CS 222: Algorithms at the Ends of the Wire (future)
 - 
CS 286: Multi-Robot Systems: Control, Communication, and Security
 
THANK YOU!
