Datarep Section 1: Motel Friend

AT-HOME PREWORK

Please complete the first part of Datarep Section 0 before section!

A very long motel

A very long motel

Modified ChatGPT generation

In this section, you’ll help build a reservations agent for the CS 61 Motel, the longest motel in the world.

How section works

College students are required to attend section live. TFs will record your attendance. For more on the attendance policy and what to do if you miss section refer to the syllabus here and here.

Extension students are welcome to attend section live (in-person or zoom) or watch the recording (Canvas > DCE Class Recordings). Both methods count towards your participation grade. If you attend a live section, TFs will record your attendance. If you watch the recording, fill out this reflection form.

We release section material via the CS 61 course site (where you’re reading this now) and the cs61-sections repository (obtain a local copy with a command like git clone git@github.com:cs61/cs61-sections). TFs will work through the section material at their own pace, stopping at exercises, questions, and discussions so students can work through the material together. It’s best if you bring your laptop. Solutions for most section problems are available here, but you will learn more if you try to solve the problems yourself first.

Section works well if students ask questions, and sometimes TFs will take the section in an unexpected direction based on student requests. Please speak up! If you have questions from lecture, now’s the time to ask.

Students will be expected to have digested all material from section handouts, including offline exercises. Offline exercises often contain hints for problem sets.

The Motel Friend problem

The CS 61 Motel has 18,446,744,073,709,551,615 rooms, numbered 1 through 18,446,744,073,709,551,615. (The office is room 0.) The rooms are so amazing that no one ever checks out. Nevertheless, the front desk attendant, Eddie, is having a hell of a time checking people in to unoccupied rooms. Eddie needs you to write Motel Friend, a computer program to help him.

Your Motel Friend program should have two functions, which should behave as follows:

bool check_in(unsigned long room);
    // Check in a guest to room number `room`.
    // Return false if `room` was previously occupied, true otherwise.

unsigned long empty_room();
    // Return the number of a currently-unoccupied room.
    // Since the hotel is very long, and the office is located in “room” 0,
    // it’s nicest to return the *minimum* unoccupied room number.

These functions will access a global data structure that you define.

(On 64-bit machines, the unsigned long data type is a 64-bit unsigned integer type, and can represent numbers from 0 to 18,446,744,073,709,551,615. More information.)

There is also a bool check_rep() function that is meant to validate that the global data structure is correct. You don’t need to implement it for section, but such functions are super useful to implement anyway.

Question. Think about how would you implement these functions. What data structures might you use? What factors might affect that choice?

Test harness

In the rest of section, you’ll work on different implementations for the Motel Friend functions. We’ve provided a simple test harness so you can run tests. Put your code in cs61-sections/s01-motelfriend/motelfriend.cc. Then you can run make in cs61-sections/s01-motelfriend to get a ./motelfriend program. This program reads a test file and runs the corresponding tests. You can run:

And you can make your own tests! Check out the test files and you should be able to figure out the format. For full test functionality, you’ll want to implement check_map_size and check_rep as well as check_in and empty_room.

A. Motel friend map

In the rest of this section, we’ll implement Motel Friend using std::map. Here’s our proposed type for the global data structure:

std::map<unsigned long, bool> room_map;
using room_iter = std::map<unsigned long, bool>::iterator; // less typing later

room_map.contains(r) is true if and only if room number r is occupied.

Exercise. Implement check_in(unsigned long room) in motelfriend.cc. Test your work (on test01.txt at least).

bool check_in(unsigned long room) {
    // check for double booking
    if (room_map.contains(room)) {
        return false;
    }
    // if no double booking, insert
    room_map.insert({room, true});
    return true;
}

Or, more concisely:

bool check_in(unsigned long room) {
    auto [iterator, inserted] = room_map.insert({room, true});
    return inserted;
}

Discussion. It’s often the case that some arguments to a function represent errors. We say that the function has preconditions—statements that must be true of the input and/or the system’s state for the function to work. Does check_in have any preconditions? Would you recommend changing check_in to validate these preconditions, if any?

The main precondition for check_in is that room should not be 0! Room number 0 corresponds to the office, not a real room, and it shouldn’t be possible to check in to it.

As for how to validate this precondition, you might:

  1. Assert the precondition: assert(room != 0);. This will crash the program if a caller ever provides an invalid argument.

    Assertions are an excellent programming practice and good code will often contain a lot of assertions. Among the advantages of assertions are (1) the program crashes immediately when the assertion is violated, rather than behaving in some unpredictable manner for a while—and immediate crashes facilitate debugging; (2) the assertion documents your understanding of what should be true, so if the assertion fails, you know you must rethink.

  2. Check the precondition and return something sensible if it is violated. For instance, here, the function could behave as if the room were already occupied: if (room == 0) return false;

  3. Do nothing! If the caller violates the precondition, that’s their fault, and they deserve whatever they get.

All of these approaches have pros and cons. Discuss them!

Exercise. Implement empty_room() and test your work.

unsigned long empty_room() {
    unsigned long room = 1;
    while (room_map.contains(room)) {
        ++room;
    }
    return room;
}

Discussion. What are some advantages and disadvantages of this implementation? Think about ease of programming, correctness, and efficiency in terms of time and space.

B. Room ranges

One idea for reducing room_map’s space requirement, and speeding up empty_room, is to store ranges of occupied rooms in the map, rather than a separate entry per room. Then, a single map element could store the fact that rooms 1–10,000 are all occupied. Assuming that rooms are generally occupied sequentially, or in groups, this will reduce both space and time.

Here’s the data structure we’re going to use:

struct room_block {
    unsigned long count; // number of occupied rooms in this block
};
std::map<unsigned long, room_block> room_map;
using room_iter = std::map<unsigned long, room_block>::iterator;

Question. Say that it is an iterator into room_map. Then it points at an occupied room block. How would you extract the first room number in the block from it? How about the last room number in the block (the last occupied room)? The number of booked rooms in the block?

  1. it->first is the first room number in the block.
  2. it->first + it->second.count - 1 is the last room number in the block.
  3. it->second.count is the number of booked rooms.

Unlike the simpler map in Part A, this data structure can represent the same motel state in many different ways. For example, any of these code sequences could represent a motel where rooms 1-10 and 12-15 are all booked:

// First sequence
room_map.insert({1, { 1 }});
room_map.insert({2, { 1 }});
room_map.insert({3, { 1 }});
room_map.insert({4, { 1 }});
room_map.insert({5, { 4 }});
room_map.insert({9, { 2 }});
room_map.insert({12, { 2 }});
room_map.insert({14, { 2 }});

// Second sequence
room_map.insert({1, { 10 }});
room_map.insert({12, { 4 }});

// Third sequence
room_map.insert({1, { 1 }});
room_map.insert({2, { 1 }});
room_map.insert({3, { 1 }});
room_map.insert({4, { 1 }});
room_map.insert({5, { 1 }});
room_map.insert({6, { 1 }});
room_map.insert({7, { 1 }});
room_map.insert({8, { 1 }});
room_map.insert({9, { 1 }});
room_map.insert({10, { 1 }});
room_map.insert({12, { 1 }});
room_map.insert({13, { 1 }});
room_map.insert({14, { 1 }});
room_map.insert({15, { 1 }});

The third of these sequences behaves just like the original code, and has the same space requirements and efficiency issues—but, just like the other two, it represents the booked hotel rooms correctly.

The room_map data structure can also represent room blocks in yucky ways. Here’s an example; it represents the same motel state:

// Fourth sequence
room_map.insert({1, { 8 }});   // covers 1-8
room_map.insert({3, { 8 }});   // covers 3-10: overlaps with previous block!
room_map.insert({12, { 4 }});  // covers 12-15
room_map.insert({13, { 2 }});  // covers 13-14: contained within previous block!
room_map.insert({14, { 1 }});  // covers 14-14: contained within previous block!
room_map.insert({100, { 0 }}); // books no rooms!

Overlapping and empty room blocks are counterintuitive (each room should be booked at most once, and every booking should have at least one room), and it’s hard to write correct code for counterintuitive data structures. We’ll therefore work with the class of non-yucky room_maps, which we’ll call normalized since that sounds more official. A normalized room_map has no overlapping room blocks and no empty room blocks. Our code will always ensure that room_map remains normalized.

Normalized maps obey a set of invariants, which are well-defined properties of a program’s state that must always hold. A function may break an invariant only if it restores the invariant before returning. Invariants differ from preconditions because a precondition must hold only before a function is called, whereas invariants are supposed to hold always. Invariants are a super useful way to think about data structures. We can write invariants in human language, but it’s super useful to write invariants using computer code, such as “it->first != 0 for any iterator it into the map”.

Exercise. Write the invariants that hold for normalized maps. Use computer code to ensure specificity.

We came up with four. The first two are pretty obvious from the definition of normalized map. The second two follow from the meaning of the room_map data structure, and should hold for any room_map, whether or not the map is normalized.

  1. No empty room blocks: it->second.count > 0 for all iterators into the map.

  2. No overlapping room blocks: if it1 and it2 are iterators into the map, and it1 < it2 (so it1->first < it2->first), then it1->first + it1->second.count <= it2->first.

  3. All room blocks are within the hotel: ULONG_MAX - it->first + 1 >= it->second.count for all iterators into the map.

  4. Room 0 is not booked: it->first != 0 for all iterators into the map.

Exercise. Write a check_in() function for normalized room_maps. Your function need not coalesce adjacent room blocks. It should, though, detect whether a requested room is in the middle of an already-booked room block.

First, don’t worry about efficiency—just use an iterator to visit every room block in the map, checking each for overlap with room. Our solution checks every block.

bool check_in(unsigned long room) {
    // check for double booking
    for (auto it = room_map.begin(); it != room_map.end(); ++it) {
        if (it->first <= room && room <= it->first + it->second.count - 1) {
            return false;
        }
    }
    // if no double booking, insert
    room_map.insert({ room, { 1 } });
    return true;
}

Offline Exercise. std::map is an ordered container: iterators into std::map are guaranteed to visit the elements in increasing order by key. Can you use that fact to visit fewer blocks?

Yes! We can ignore the room blocks that start after room.

bool check_in(unsigned long room) {
    // check for double booking
    for (auto it = room_map.begin();
         it != room_map.end() && it->first <= room;
         ++it) {
        if (room <= it->first + it->second.count - 1) {
            return false;
        }
    }
    // if no double booking, insert
    room_map.insert({ room, { 1 } });
    return true;
}

Offline Exercise. This is more efficient than our initial version, but we can do better still, using the convenient room_map.lower_bound and/or room_map.upper_bound functions. Given a key, these functions return the nearest iterator in the map to that key. lower_bound returns the minimum iterator that is greater than or equal to the key, while upper_bound returns the minimum iterator that is greater than the key. Both functions might return map.end(). Rewrite your check_in function to use one of these functions.

bool check_in(unsigned long room) {
    // check for double booking
    auto it = room_map.upper_bound(room); // the *next* room block
    if (it != room_map.begin()) {
        --it; // now, `it` points at the unique room block that might contain `room`
        if (it->first <= room && room <= it->first + it->second.count - 1) {
            return false;
        }
    }
    // if no double booking, insert
    room_map.insert({ room, { 1 } });
    return true;
}

C. Maximal coalescing

Some room_map states have an additional property we will call maximally coalesced. A map is maximally coalesced when it is normalized, but in addition, no occupied room blocks are adjacent (so if rooms 1–10 are occupied, they must be covered by a single room_block). Maximally coalesced maps take less space than other kinds of maps, and operations on them run faster.

Question. Which of the room block code sequences in Part B produces a maximally coalesced map?

The second.

Offline Exercise. Write an empty_room() function that works correctly given the precondition that room_map is maximally coalesced. Explain why it works.

unsigned long empty_room() {
    auto it = room_map.begin();
    if (it == room_map.end() || it->first > 1) {
        return 1;
    }
    return it->first + it->second.count;
}

This works because (1) the std::map::begin iterator points at the entry with the lowest key (for us, the lowest room number), and (2) the map is maximally coalesced. The if clause checks whether the first room block covers room 1; if it doesn’t, then 1 is the lowest-numbered available room. But if the first room block does cover room 1, then we know the room immediately following that block must be available. If it weren’t, the map would not be maximally coalesced!

Offline Exercise. Write an empty_room() function that works correctly for any room_map, normalized or not.

#include <algorithm>

unsigned long empty_room() {
    // `test_room` is the lowest-numbered room that might be unoccupied
    unsigned long test_room = 1;
    // traverse occupied room blocks in order
    auto it = room_map.begin();
    while (it != room_map.end() && test_room >= it->first) {
        test_room = std::max(test_room, it->first + it->second.count);
        ++it;
    }
    return test_room;
}

Now, let’s modify your check_in function so that room_map stays maximally coalesced.

Question. Assume that room_map is initially maximally coalesced, and then your check_in code above adds a single room block to it (without coalescing). Describe the ways that the new room_map might violate maximal coalescing.

The new room_map might contain up to two adjacent room blocks, when it should have zero. If check_in modifies the map, then we know that it inserts a single one-room block. There are exactly four ways this could happen, depending on whether the new block is adjacent to the block before it and/or the block after it:

Input state: ... [[ old block ]]                  [[ old block ]]
Neither:     ... [[ old block ]]     [[ 1 ]]      [[ old block ]] ...
Before:      ... [[    old block   ]][[ 1 ]]      [[ old block ]] ...  (one adjacent block)
After:       ... [[ old block ]]     [[ 1 ]][[    old block    ]] ...  (one adjacent block)
Both:        ... [[    old block   ]][[ 1 ]][[    old block    ]] ...  (two adjacent blocks)

Question. Keeping that in mind, discuss, in words, how your check_in function could re-establish the maximal coalescing property after inserting a new room block.

Seems not too bad? We look on either side of the newly introduced room block, and if there is an adjacent block, we combine them into one. That means we add the later block’s count to the earlier block and then remove the later block. (Maps let us change values, but not keys, so we can’t change a block’s starting room.) We have to be careful, though, to stay within the room_map! It’s illegal to query a “room block” that doesn’t exist, because it’s before the first room block or after the last.

Exercise. Modify your check_in function so that it preserves maximal coalescing.

Hint: We found it convenient to use the standard functions std::next(it) and std::prev(it), which return the iterator immediately after it and the iterator immediately before it, respectively. (So std::next(it) behaves like auto next = it; ++next; return next.)

Note: This is the most complex code in the online exercises, so if you run out of time in section, don’t bother creating the solution yourself: just look at the solution and then discuss it. The goal of section is that you understand this solution.

#include <iterator>

bool check_in(unsigned long room) {
    ... double booking check from Part B ...

    auto [ newit, inserted ] = room_map.insert({ room, { 1 } });
    // we definitely inserted
    assert(inserted);
    // we might need to coalesce after
    if (std::next(newit) != room_map.end()) {
        auto nextit = std::next(newit);
        if (nextit->first == room + 1) {
            newit->second.count += nextit->second.count;
            room_map.erase(nextit);
        }
    }
    // we might need to coalesce before
    if (newit != room_map.begin()) {
        auto previt = std::prev(newit);
        if (previt->first + previt->second.count == room) {
            previt->second.count += newit->second.count;
            room_map.erase(newit);
        }
    }
    return true;
}

Discussion. How does the Motel Friend problem relate to Problem Set 1?

Offline Exercise. Implement bool check_out(unsigned long room), which is the inverse of check_in. This will require splitting room blocks, which is the inverse of coalescing room blocks! 🙂 And that also relates to Problem Set 1…