Systems Programming and Machine Organization
This is the 2012 version of the course. Main site
Assignment 5: Adversarial Network Pong
This assignment will teach you some useful and common strategies for handling problems common in networking, including loss, delay, and low utilization, through a “game” called network pong.
- Assigned Thu 11/29
- Due Wed 12/12 at 11:59pm for college students (1 day later for extension)
- This assignment may be completed in pairs.
Phase 0: Easy
Merge our Assignment 5 code into your repository with
git pull handout master. This will merge our Assignment 4 code with your previous work. If you have any “conflicts” from Assignment 4, resolve them before continuing further. Run
git push to save your work back to code.seas.
Please use an explicit merge to create your repository. If you copy code by hand, our automated scripts will have trouble analyzing your code, and it’ll be harder for you to incorporate our updates.
You may also create a new
cs61-psets repository for this assignment. Tell us if you do.
Once you have merged, edit the
pset5/serverinfo.h file so that
PONG_USER is defined to a unique string that only your team knows.
Note. We currently do not plan to implement authentication on the pong server. This means that, if you learn another team’s string, you can mess with their heads. Please do not abuse this system. Aggressive abuse will be reported to The Authorities.
make, and then run the
It will print out a URL. Visit that URL in your browser. You should see
something like this:
pong61 program draws a bouncing ball in a rectangular
pong61 works by making HTTP RPC requests to a Web
server we run. HTTP is the application protocol on which the Web is
built. Read more about HTTP before you continue.
When it starts up,
pong61 makes a single request to a URL
This tells the server to reset the pong board’s state. The server returns a simple response like this:
These two numbers are the board’s width and height, respectively. It also communicates with your browser to tell it to clear the board.
pong61 makes many requests to URLs like this:
This request causes a new ball to appear at position
YPOS). The server responds with a numeric
code and explanation. If everything goes well, it says:
If there’s a problem, the numeric code is negative, as in:
-1 x and y parameters must be numbers
After each request,
pong61 waits 0.1 seconds before making the next
Our handout code should work just fine on Phase 0. To do the problem set, you must change the code so it works on the other phases too. Use the web page's phase buttons to change phases. But before you go further, take some time to read and understand the handout code.
Phase 1: Loss
In Phase 1, the server starts to lose messages. The server will occasionally go offline for a short period. During that time, every
move request is rejected by closing down the connection. The
http_receive_response function in your
pong61 program sets
HTTP_BROKEN when this happens (and sets
-1). But currently,
pong61 ignores this state and continues as if nothing had happened. The result is blank spaces in the pong trail. Our server indicates your mistake by drawing black marks in these spaces.
Your job in this phase is to detect lost messages and retry. When the server drops a connection, your code should repeat the same connection attempt.
However, you must be careful not to kick the server while it’s down. Do not make an infinite stream of connection attempts. The server will notice this and explode. Instead, you must implement a form of exponential backoff. This is a simple, powerful concept.
- The first time a connection attempt fails, wait for K seconds before trying again. K can be very small (1/1000, even).
- If the retry also fails, wait 2K seconds before trying again.
- If that retry also fails, wait 4K seconds before trying again.
- In general, after N failed retries, wait 2NK seconds before trying again. (Optionally you may want to introduce a maximum backoff; perhaps you would wait min(2NK, 128) seconds before trying again.)
Exponential backoff is awesome because it responds to short outages quickly, but imposes only logarithmic overhead. It’s truly ubiquitous: Ethernet is built on it. And the next time your Gmail goes offline, check out the numbers that appear after “Not connected. Trying again in...”.
Phase 2: Delay
In Phase 2, the server delays its responses. It will send you the full header for its response, but delay the body. Since in the handout code the pong loop waits for one request to complete before sending the next, the pong ball will move extremely slowly in Phase 2. Too slowly, in fact: the server enforces a minimum ball speed, and when your code is slower than that speed, you’ll see some black marks on the display.
You might think solving this problem would be easy: just close the connection after some timeout. But the server is too clever for this. If you close a connection prematurely—before the entire response is sent—the server explodes.
To support delay, your
pong61 must handle multiple concurrent connections to the server. There are several ways you can do this.
- Create new threads. (See §12 of CS:APP2e.)
timestampto ensure that no
readcall blocks for too long. Return prematurely from
http_receive_responseif the response is taking too long to complete. Close the connection later, after it completes.
- Fork child processes. (This is in some ways easiest in Phase 2, but it may complicate your Phase 3 code.)
But watch out. If you hold too many connections open in parallel, the server will explode.
Your Phase 2 code must also work in Phase 1. (A naive implementation of Phase 2 might fail when requests are lost.) We suggest you make Phase 2 work first on its own, then go back and make Phase 1 work again. Note that whenever the server drops a connection, it does so immediately, with no delay.
Troubleshooting and hints
- An error such as
connect: Interrupted system callindicates that
interrupt_afterinterrupted a system call other than
read. You can use
interrupt_cancelto cancel an outstanding interrupt.
pthread_cond_timedwaitlibrary function can be useful if you create new threads.
- If confused or lost, try modifying
http_receive_responseto print out the elapsed time of each
readand the data returned by each
read. Which calls to
readblock for a long time (in Phase 0, Phase 1, and Phase 2)? What distinguishes a delayed response (Phase 2) from a failed response (Phase 1)? Can you tell the difference without blocking for a long time?
Phase 3: Utilization
So far, your
pong61 client opens a new network connection for every ball. This is wasteful and slow and in Phase 3 the server will not allow it. You should instead reuse valid HTTP connections for new ball positions.
http_connection object is available for reuse if and only if
conn->state == HTTP_DONE. This means that the server sent a complete response and is waiting for another request.
Reusing connections would be really easy—except that in Phase 3 the server also drops some connections (as in Phase 1) and delays some connections (as in Phase 2). Your
pong61 client must handle it all.
The key function you’ll need to add is a connection table of available connections. This can be a linked list, an array, or whatever you’d like. When a connection reaches state
HTTP_DONE, add it to the table. When you need to call a new RPC, check the table first, and use that connection if one exists.
Your code for maintaining the connection table will depend on the strategy you used for Phase 2. If you use threads, you must protect the connection table from concurrent access, most likely using locks.
Phase 4: Congestion
In Phase 4, the server sometimes behaves as if it is congested. A congested server responds to a request not with
0 OK, but with a positive number, such as this:
This means that the server is overloaded.
pong61 is not allowed to send any more requests for (in this case) 1948 milliseconds. This should give the server enough time to cool down. The display will show a stop sign during the cool-down period, and if
pong61 ignores the stop request and sends a message anyway, the server will explode. But after the cool-down period, the client should go right back to sending requests.
Phases 1 through 3 are still active in Phase 4.
Phase 5: Evil
Phase 5 is a mystery, but not a very tough one. Run your code on Phase 5 and you’ll figure out the problem soon enough.
We may implement an extra credit phase, let us know if you’re interested. But you can do extra credit anyway by implementing something crazy with the server. For example, implement an actual pong game. Here are some RPCs the server implements that might be useful.
- You can turn off the server’s checking facilities with a
phase/-1RPC. (For instance,
- You can query the state of a cell with a
- You can set a cell to a new state for a given duration with a
move?x=XPOS&y=YPOS&style=STYLE&duration=MILLISECONDSRPC. This only works in phase -1.
- You can set
STYLEto a number of interesting possibilities. See if you can figure out what they are!
You will turn in your code by pushing your git repository to code.seas.harvard.edu. Inform us ASAP if you have changed partner or repository from pset 4.
Remember to fill out
- On the other hand, if two teams actually wanted to mess with one another’s heads....
This pset was created for CS61, Fall 2012.