CS106B Winter '20-21 Practice Midterm


Here’s a collection of practice problems you can work through to prepare for the upcoming midterm exam. These questions are taken from past CS106B midterms, which, it should be noted, were not given in the same format as the upcoming exam. However, we think that they’re nonetheless excellent practice for the upcoming exam.

1) Containers and Flightless Birds

Consider the following functions, dodo and kiwi, which differ only in the highlighted areas:

string dodo(const string& str, int n) {
  Queue<char> storage;

  /* Visit the characters in str in the
   * order in which they appear.
   */
  for (char ch: str) {
    storage.enqueue(ch);
  }

  for (int i = 0; i < n; i++) {
    char ch = storage.dequeue();
    storage.enqueue(ch);
  }

  /* If you don't initialize a string
   * in C++, it defaults to empty.
   */
  string result;
  while (!storage.isEmpty()) {
    result += storage.dequeue();
  }

  return result;
}
string kiwi(const string& str, int n) {
  Stack<char> storage;

  /* Visit the characters in str in the
   * order in which they appear.
   */
  for (char ch: str) {
    storage.push(ch);
  }

  for (int i = 0; i < n; i++) {
    char ch = storage.pop();
    storage.push(ch);
  }

  /* If you don't initialize a string
   * in C++, it defaults to empty.
   */
  string result;
  while (!storage.isEmpty()) {
    result += storage.pop();
  }

  return result;
}

Answer each of the following questions and fill in the appropriate blanks. No other justification is needed.

i. How many strings s are there where dodo(s, 3) returns "penguin"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "________".
  • There are two or more choices of s where this happens, such as "________" and "________".

ii. How many strings s are there where kiwi(s, 3) returns "penguin"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "________".
  • There are two or more choices of s where this happens, such as "________" and "________".

iii. How many choices of n are there for which dodo("kakapo", n) returns "kakapo"?

  • There are no choices of n for which this happens.
  • There is exactly one choice of n for which this happens, namely n = ________.
  • There are two or more choices of n for which this happens, such as ________ and ________.

iv. How many choices of n are there for which kiwi("kakapo", n) returns "kakapo"?

  • There are no choices of n for which this happens.
  • There is exactly one choice of n for which this happens, namely n = ________.
  • There are two or more choices of n for which this happens, such as ________ and ________.

i. How many strings s are there where dodo(s, 3) returns "penguin"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "uinpeng".
  • There are two or more choices of s where this happens, such as "________" and "________".

ii. How many strings s are there where kiwi(s, 3) returns "penguin"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "niugnep".
  • There are two or more choices of s where this happens, such as "________" and "________".

iii. How many choices of n are there for which dodo("kakapo", n) returns "kakapo"?

  • There are no choices of n for which this happens.
  • There is exactly one choice of n for which this happens, namely n = ________.
  • There are two or more choices of n for which this happens, such as n < 0 and any multiple of 6.

iv. How many choices of n are there for which kiwi("kakapo", n) returns "kakapo"?

  • There are no choices of n for which this happens.
  • There is exactly one choice of n for which this happens, namely n = ________.
  • There are two or more choices of n for which this happens, such as ________ and ________.

To explain this, let’s first look at the dodo function. This function enqueues all the characters of the input string, in order, into a queue. It then repeatedly dequeues from the queue and enqueues in the back. This is similar to the Looper example we did in class, which looped a piece of music by repeatedly playing a note, then moving it to the back of a queue. The net effect of this function is to “cycle” the characters of the input string n times by repeatedly moving the front character to the back. When the characters are transferred out of the queue and into the resulting string, they’re removed in the relative order that they appear in the queue, and so the overall function works by cycling the characters of the string n times.

Next, let’s look at kiwi. This function pushes the elements of the string onto a stack. It then repeatedly pops an element from the stack and then pushes it back on. But if you pop off the top of a stack and then push the result back onto the stack, the stack is left unchanged – the element is placed right back where it started! We saw this briefly at the end of the lecture with the Looper demo where we changed the queue to a stack and only heard the last note of the song repeated multiple times. When the elements of the stack are transferred back to the result string, the characters from the input string come back in reverse order, since the most-recently-added element of the stack is the last character of the string. The net effect of this function is that it ignores n entirely and just returns the reverse of the string.

Turning to the questions at hand:

i. How many strings s are there where dodo(s, 3) returns "penguin"?

Any string that does this would have to be something that, if you cycled the characters three times, gives back "penguin". There’s only one string with that property, and that’s "uinpeng".

ii. How many strings s are there where kiwi(s, 3) returns "penguin"?

This has to be a string that, if reversed, gives you "penguin". The only option is "niugnep".

iii. How many choices of n are there for which dodo("kakapo", n) returns "kakapo"?

In other words – for how many values of n can you cycle the characters of "kakapo" n times to get back where you started? One option would be 0, since not cycling anything at all will do the trick. Another would be 6, 12, 18, etc., since any multiple of six will rotate all the characters back where they started.

There’s another, sneakier answer here, and that’s to notice that if n < 0, then the inner for loop never runs and it’s as if we didn’t cycle the string at all. Therefore, any negative number will work here.

iv. How many choices of n are there for which kiwi("kakapo", n) returns "kakapo"?

Since kiwi ignores n and just returns the reverse of the input, the result of this function call will always be the string "opakak", so we’ll never get "kakapo" back.

Why we asked this question: This question was designed to see whether you could look at a piece of code, work through it mechanically, and build up a mental model for how it operates. We specifically wanted to see whether you were comfortable with stacks and queues, hence their appearance here.


2) Building an Index

Larger textbooks or reference books usually include an index, a series of pages in which different terms are listed along with the page numbers on which those terms occur.

How might we represent an index in software? Given the text of the book, our goal is to create an object that’s good at answering this question:

Given a word, return the page numbers of each page containing that word

This problem has two parts.

Part 1: Selecting a Container

First, using the C++ containers we’ve covered so far, tell us which type you think is best-suited for representing an index, given that the index needs to be able to easily answer the above question. (For example, you could say string, Set<double>, Vector<Map<Stack<int>, Queue<char>>>, etc.) Then, briefly justify your answer. Your justification should take at most fifty words; anything longer than that will receive no credit.

Some notes on this problem:

  • Remember that you want to give back the page numbers on which the word occurs, not the pages themselves.
  • If you can think of multiple answers that all sound good, choose only one as your answer to this question. If you list multiple answers, we will not award any points.

The type I would pick is: ____________________.

Here is my justification (at most fifty words):

Part 2: Creating the Index

Your next task is to write a function

YourType makeIndexFor(const Vector<string>& pages, int maxFrequency);

that takes as input the pages of a book, represented as a Vector<string>. That is, pages[0] represents the first page of the book, pages[1] represents the second page of the book, etc. This function then returns the index, represented as an object of the type you mentioned in your answer above. As a reminder, the type you chose should be good at solving this problem:

Given a word, return the page numbers of each page containing that word.

There’s one little detail we need to address: there are many words in a book that we wouldn’t want to include an in index. For example, the word “a” is so common that including it in an index would take up a ton of space without actually contributing anything useful. To account for this, the index you return should only include words that appear on at most maxFrequency pages.

As a reminder, the elements in the Vector are individual pages in the book, while your index should be working at the level of individual words in the book. Feel free to use the tokenize function that we gave you in Assignment 3:

Vector<string> tokenize(const string& str);

Just to make sure you didn’t miss it, your index should only contain words, not spaces, punctuation, numbers, etc. Similarly to Assignment 3, you can assume that a string is a word if its first character is a letter. You can check if a char is a letter by using the isalpha function.

Some notes on this problem:

  • Remember that your goal is to produce an index that tells you the page numbers of each page on which a given word occurs, not the contents of that page.
  • For simplicity’s sake, let’s assume that page numbering starts at Page 0, not Page 1. That makes it easier to use a Vector<string> to represent the pages of the book.
  • For full credit, please use our tokenize function rather than other approaches like the library function stringSplit or the helper type TokenScanner.

We’d recommend Map<string, Set<int>>, which maps a word to a set of all the pages it appears on. Using a Map<string, Vector<int>> is also reasonable. One advantage is that it lets you keep page numbers sorted, but a drawback is that you might record the same page many times.

Here’s a sampler of solutions. First, there’s this solution, which keeps a separate set of “banned” words that appeared too many times. We’ve included two versions, one with each container type.

Map<string, Set<int>> makeIndexFor(const Vector<string>& pages,
                                   int maxFrequency) {
    Map<string, Set<int>> result;
    Set<string> tooFrequent;

    for (int page = 0; page < pages.size(); page++) {
        /* Loop over the words in that page. */
        for (string word: tokenize(pages[page])) {
            /* If this is a word that isn't banned, log it. */
            if (isalpha(word[0]) && !tooFrequent.contains(word)) {
                result[word] += page;

                /* Oops! Too common. */
                if (result[word].size() > maxFrequency) {
                    tooFrequent += word;
                    result.remove(word);
                }
            }
        }
    }

    return result;
}

Map<string, Vector<int>> makeIndexFor(const Vector<string>& pages,
                                      int maxFrequency) {
    Map<string, Vector<int>> result;
    Set<string> tooFrequent;

    for (int page = 0; page < pages.size(); page++) {
        /* Loop over the words in that page. */
        for (string word: tokenize(pages[page])) {
            /* If this is a word that isn't banned, log it. */
            if (isalpha(word[0]) && !tooFrequent.contains(word)) {
                /* Confirm we haven’t already logged it. */
                if (result[word].isEmpty() || result[word].back() != page) {
                    result[word] += page;

                    /* Oops! Too common. */
                    if (result[word].size() > maxFrequency) {
                        tooFrequent += word;
                        result.remove(word);
                    }
                }
            }
        }
    }

    return result;
}

Here’s a second solution. This builds a full index, then copies over the infrequent elements.

Map<string, Set<int>> makeIndexFor(const Vector<string>& pages,
                                   int maxFrequency) {
    Map<string, Set<int>> index;

    /* Build the index. */
    for (int page = 0; page < pages.size(); page++) {
        for (string word: tokenize(pages[page])) {
            if (isalpha(word[0])) {
                index[word] += page;
            }
        }
    }

    /* Remove everything that's too frequent. */
    Map<string, Set<int>> result;
    for (string word: index) {
        if (index[word].size() <= maxFrequency) {
            result[word] = index[word];
        }
    }

    return result;
}

Map<string, Vector<int>> makeIndexFor(const Vector<string>& pages,
                                      int maxFrequency) {
    Map<string, Vector<int>> index;

    /* Build the index. */
    for (int page = 0; page < pages.size(); page++) {
        for (string word: tokenize(pages[page])) {
            if (isalpha(word[0])) {
                /* We visit pages in ascending order. */
                if (index[word].isEmpty() || index[word].back() < page) {
                    index[word] += page;
                }
            }
        }
    }

    /* Remove everything that's too frequent. */
    Map<string, Vector<int>> result;
    for (string word: index) {
        if (index[word].size() <= maxFrequency) {
            result[word] = index[word];
        }
    }

    return result;
}

Why we asked this question: This question was designed to let you show us what you’ve learned about container types. You’ve had the opportunity to explore containers on several of the assignments, and we wanted to see whether you were comfortable picking the right tools for the job and using those tools correctly.

Common mistakes: The most common C++ mistakes we saw on this problem were your vanilla, run-of-the-mill syntax and type errors. We saw a number of loops over containers that were structured incorrectly (for example, mixing up the indices of elements of a vector with the elements at those positions, incorrectly structuring a counting for loop to start from the wrong index, etc.), and many solutions tried applying the isalpha function to an entire string rather than to just the first character of that string.

A fairly subtle but important mistake we saw people make was writing code to this effect:

Map<string, Set<int>> index;      
/* … */                            
Set<int> entries = index[word];    
entries += pageNumber; // ⚠ Oops!

This code appears to update the index entry for a given word by pulling out the Set<int> of all places where that word appears, then adding the page number into that container. The problem is that what this code actually does is make a copy of the index entry for the given word, then update that copy to contain the new page number. Remember that in C++ variables represent actual, honest-to-goodness containers, not references to them, so declaring a new Set<int> variable really does create a new Set<int>, rather than a reference to one.

We also saw a number of logic errors that arose when trying to cap the number of times an item could appear in the index. Many solutions that used the Map<string, Vector<int>> type added multiple copies of matching page numbers to the Vector<int>, causing the calculation of how many pages a word appeared on to be skewed by the frequency of the word on those pages. Some solutions recognized when a word appeared the maximum number of times, but rather than removing the word instead stopped adding more matching page numbers to the index, essentially returning an index where only the first maxFrequency copies of each word appeared in the result.

One of the most common logic errors we encountered was removing a word from the index once it had reached the maximum permissible frequency, but then allowing that word to get added back in to the index. This usually happened when the logic to build the index was combined with the logic to remove words that appeared too frequently.


3) Mongolian Recursion

Consider the following recursive functions, selenge and zavkhan, which differ only in their last lines.

string selenge(const string& str) {
  if (str.length() <= 1) {
    return str;
  } else {
    string temp;

    /* Append these characters to the
     * temp string.
     */
    temp += str[1];
    temp += str[0];

    /* This next line means "get the
     * string formed by dropping the
     * first two characters from str."
     */
    string next = str.substr(2);

    /* The next line differentiates
     * selenge from zavkhan.
     */
    return temp + selenge(next);
  }
}
string zavkhan(const string& str) {
  if (str.length() <= 1) {
    return str;
  } else {
    string temp;

    /* Append these characters to the
     * temp string.
     */
    temp += str[1];
    temp += str[0];

    /* This next line means "get the
     * string formed by dropping the
     * first two characters from str."
     */
    string next = str.substr(2);

    /* The next line differentiates
     * zavkhan from selenge.
     */
    return zavkhan(next) + temp;
  }
}

Answer each of the following questions and fill in the appropriate blanks. No other justification is needed.

i. How many choices of string s are there where selenge(s) returns "bulgan"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "________".
  • There are two or more choices of s where this happens, such as "________" and "________".

ii. How many choices of string s are there where zavkhan(s) returns "bulgan"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "________".
  • There are two or more choices of s where this happens, such as "________" and "________".

iii. How many choices of string s are there where selenge(s) returns "khovd"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "________".
  • There are two or more choices of s where this happens, such as "________" and "________".

iv. How many choices of string s are there where zavkhan(s) returns "khovd"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "________".
  • There are two or more choices of s where this happens, such as "________" and "________".

Consider the following recursive functions, selenge and zavkhan, which differ only in their last lines.

string selenge(const string& str) {
  if (str.length() <= 1) {
    return str;
  } else {
string temp;
    /* Append these characters to the
     * temp string.
     */
    temp += str[1];
    temp += str[0];
    /* This next line means "get the
     * string formed by dropping the
     * first two characters from str."
     */
    string next = str.substr(2);
    /* The next line differentiates
     * selenge from zavkhan.
     */
    return temp + selenge(next);
} }
string zavkhan(const string& str) {
  if (str.length() <= 1) {
    return str;
  } else {
string temp;
    /* Append these characters to the
     * temp string.
     */
    temp += str[1];
    temp += str[0];
    /* This next line means "get the
     * string formed by dropping the
     * first two characters from str."
     */
    string next = str.substr(2);
    /* The next line differentiates
     * zavkhan from selenge.
     */
    return zavkhan(next) + temp;
} }

Answer each of the following questions and fill in the appropriate blanks. No other justification is needed.

i. How many choices of string s are there where selenge(s) returns "bulgan"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "ubglna".
  • There are two or more choices of s where this happens, such as "__" and "__".

ii. How many choices of string s are there where zavkhan(s) returns "bulgan"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "naglub".
  • There are two or more choices of s where this happens, such as "__" and "__".

iii. How many choices of string s are there where selenge(s) returns "khovd"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "hkvod".
  • There are two or more choices of s where this happens, such as "__" and "__".

iv. How many choices of string s are there where zavkhan(s) returns "khovd"?

  • There are no choices of s for which this happens.
  • There is exactly one choice of s where this happens, namely s = "dvohk".
  • There are two or more choices of s where this happens, such as "__" and "__".

First, let’s look at selenge. A quick scan through the code gives the idea that this works on the level of individual characters from a string, so let’s pick a sample string – say, "abcde" – and see what it does with it. We get that

  • selenge("abcde") = "ba" + selenge("cde")
  • selenge("cde") = "dc" + selenge("e")
  • selenge("e") = "e"

Putting all this together, we get back the string "badce". That’s our original string, with adjacent pairs of letters, except for the final letter, swapped. Now, was that a coincidence, or is that actually what the function does? Let’s take a second look at this function.

  • The base case leaves any string of lengths 0 and 1 unchanged. That’s consistent with twiddling adjacent pairs – there are no adjacent pairs in this case.
  • The recursive case twiddles the first two characters, then tacks on what you get by twiddling the remaining characters. That’s what we’d expect to find if we were to twiddle all adjacent pairs.

So that’s what the selenge function does – it swaps adjacent pairs of characters, leaving the last character untouched if it happens to be the odd one out.

Now, what’s zavkhan all about? This function looks pretty similar, so let’s do a similar analysis and try feeding in the string "abcde". That gives us the following:

  • zavkhan("abcde") = zavkhan("cde") + "ba"
  • zavkhan("cde") = zavkhan("e") + "dc"
  • zavkhan("e") = "e"

Putting all this together, we see that zavkhan("abcde") gives back "edcba" – the reverse of our original string! That’s too weird to be a coincidence, and we can dive deeper to see why this is.

  • The base case takes any string of length zero or length one and hands it back unmodified. But we can also think of this as returning the reverse of the original string.
  • The recursive case takes the first two characters, swaps their ordering, and tacks them onto the back of the reverse of the rest of the string. That reverses the characters from their original ordering!

So selenge swaps adjacent characters and that zavkhan reverses its input. With that in mind:

i. How many choices of string s are there where selenge(s) returns "bulgan"?

There’s just one – it’s the string you get by swapping the adjacent letters in "bulgan", since swapping adjacent pairs twice cancels out the effect. That’s the string "ubglna".

ii. How many choices of string s are there where zavkhan(s) returns "bulgan"?

Just one – the reverse of "bulgan", which is "naglub".

iii. How many choices of string s are there where selenge(s) returns "khovd"?

The only option is "hkvod", which untwiddles to "khovd".

iv. How many choices of string s are there where zavkhan(s) returns "khovd"?

As before, the only choice is the reverse of the string, "dvohk".

Why we asked this question: This question, along the lines of Problem One, was about reading code and building a mental model for what it does. Here, we chose to write two recursive functions that differ in a very small – but, as you saw above – very important way! We hoped that you could use your skills in tracing recursive functions to get a sense of how these pieces of code worked, and from there to answer these questions by thinking through what inputs would be needed to get the desired outputs.


4) Doing the Splits

Your ultimate goal in this problem is to write a function Set<Vector<string>> splitsOf(const string& str); that takes as input a string, then returns all ways of splitting that string into a sequence of nonempty strings called pieces. For example, given the string "RUBY", you’d return a Set containing these Vector<string>s; notice that all letters are in the same relative order as in the original string:

{"R", "U", "B", "Y"}
{"R", "U", "BY"}
{"R", "UB", "Y"}
{"R", "UBY"}
{"RU", "B", "Y"}
{"RU", "BY"}
{"RUB", "Y"}
{"RUBY"}

If you take any one of these Vectors and glue the pieces together from left to right, you’ll get back the original string "RUBY". Moreover, every possible way of splitting "RUBY" into pieces is included here. Each character from the original string will end up in exactly one piece, and no pieces are empty.

Given the string "TOPAZ", you’d return a Set containing all of these Vector<string>s:

{"T", "O", "P", "A", "Z"}
{"T", "O", "P", "AZ"}
{"T", "O", "PA", "Z"}
{"T", "O", "PAZ"}
{"T", "OP", "A", "Z"}
{"T", "OP", "AZ"}
{"T", "OPA", "Z"}
{"T", "OPAZ"}
{"TO", "P", "A", "Z"}
{"TO", "P", "AZ"}
{"TO", "PA", "Z"}
{"TO", "PAZ"}
{"TOP", "A", "Z"}
{"TOP", "AZ"}
{"TOPA", "Z"}
{"TOPAZ"}

As before, notice that picking one of these Vectors and gluing the pieces in that Vector back together will always give you "TOPAZ".

The decision tree for listing subsets is found by repeatedly considering answers to questions of the form “should I include or exclude this element?” The decision tree for listing permutations is found by repeatedly considering answers to questions of the form “which element should I choose next?” In this problem, the decision tree is found by repeatedly considering answers to this question: How many characters from the front of the string should I include in the next piece? Based on this insight, draw the decision tree for listing all splits of the string "JET" along the lines of the decision trees we drew in class. At a minimum, please be sure to do the following:

  • Label each entry in the decision tree with the arguments to the recursive call it corresponds to.
  • Label each arrow in the decision tree with what choice it corresponds to. Now, implement the splitsOf function. For full credit, your implementation must be recursive and match the decision tree that you drew in the first part of this problem.

For starters, here's the decision tree for all splits of "JET": Decision tree for "JET". The possible splits for JET are: J,E,T; J,ET; JE,T; JET

Here, each decision is of the form “how many characters are we taking off the front of the string?,” and we pass down through the recursion both the characters we haven’t used yet and the split we’ve built up so far.

Here’s one implementation of splitsOf based on the above decision tree:

Set<Vector<string>> splitsRec(const string& str,const Vector<string>& chosen) {
    /* Base Case: If we've already used up all the characters of the input string,
     * then we've already built up one option. The set of all possible options we
     * can make given that we're locked into this one solution is just the set of
     * that solution by itself.
     */
     if (str == "") {
       return { chosen };
     }
    /* Otherwise, there are characters that need to get put into a piece of the
     * split. Try all ways of doing this.
     */
      else {
        Set<Vector<string>> result;
        /* Munch off between 1 and all the characters, inclusive. */
        for (int i = 1; i <= str.length(); i++) { string piece = str.substr(0, i); string remaining = str.substr(i);
            /* Find all the splits we can make, assuming we commit to having this
             * piece in front.
             */
            result += splitsRec(remaining, chosen + piece);
        }
        return result;
      }
}
Set<Vector<string>> splitsOf(const string& str) {
  return splitsRec(str, {});
}

Why we asked this question: This question was designed to get you playing around with decision trees and recursive enumeration problems. We asked the first part of this question to see whether you were comfortable going from a high-level strategy (here, munching off some number of characters from the front of the string) to a decision tree, which essentially maps out what recursive calls need to be made and with what arguments.

The second part of this question was designed both to see if you were comfortable with turning a recursive insight / decision tree into a piece of recursive code and to see if you had built an understanding of how to produce all objects of some type. This particular recursive function works best with a for loop, not because the problem intrinsically screams “I need a loop here,” but rather because the shape of the recursion tree requires us to make decisions about where in a string to cut and the easiest way to do that happens to be with a loop.

Zooming out a bit, we liked this problem in particular because it’s fairly similar to the recursive structure of listing permutations! Notice that, as in permutations, you iterate over all splitting points and fire off recursive calls from each of those options. The difference here is that instead of pulling out a single character by using substr, here we really are trimming the original string down.

Common mistakes: On the decision tree front, the most common mistake we saw was to draw the decision tree in a way where each box in the tree was labeled with the result of the previously-made decision, rather than labeled with the next decision to make or the remaining string that we need to make the next decision on. Remember that the arrows in a decision tree, not the boxes, indicate what the choices are. Aside from this, we also saw a number of solutions that used an include/exclude pattern (more along the lines of what we’d do for subsets or combinations) than a “how many” pattern (which was needed here). We also saw some submission that were simply the decision trees for subsets or permutations, which isn’t what was asked for here.

On the coding side of things, the most common errors we encountered were simple off-by-one indexing errors when pulling some number of characters off the front of the string. For example, we saw many solutions that tried pulling between 0 and n – 1 characters off the front of an n-character string rather than between 1 and n characters from the front of an n-character string.

We also saw a number of solutions that mixed up the types of the relevant terms in the recursion. The return value from this function is a Set<Vector<string>>, where

  • the string type represents an actual string,
  • the Vector<string> type represents a split of a string into pieces, and
  • a Set<Vector<string>> represents a collection of splits of strings.

Many solutions attempted to add individual strings to the resulting Set<Vector<string>>, which doesn’t work because the types don’t agree (this would be treating a piece of a string as a full split of a string), or which returned a Vector<string> rather than a Set<Vector<string>> from the base case (mixing up a single split with a collection of possible splits), etc.

Another tricky spot in this problem was aggregating together all the results of the different recursive calls. To solve this problem, you need to make a series of recursive calls, each of which produces a single Set<Vector<string>>, and you need to assemble them into a single, overall Set<Vector<string>>. We encountered many solutions that would make the recursive calls but not do anything with the return values of those calls, meaning that the results wouldn’t be stored for later use. Some solutions immediately returned the value of the first recursive call, preventing other calls from being made later on. Others simply forgot to return a value at the end of the function.

Finally, we saw some solutions that used the wrong recursive strategy to solve the problem. Many solutions attempted to use an include/exclude recursive pattern along the lines of what you’d do for listing subsets or combinations. It’s entirely possible to solve this problem this way, though it’s fairly challenging to do this correctly; this could be done by passing down both a partially-built string representing the split currently under construction as well as a partially-built Vector<string> representing the collection of splits assembled at this point. However, in most cases where we saw this the recursive code written didn’t match the recursion tree drawn on the previous page and the code didn’t properly construct all possible splits.


5) Eating a Chocolate Bar

You have a chocolate bar that’s subdivided into individual squares. You decide to eat the bar according tothe following rule:

if you choose to eat one of the chocolate squares, you have to also eat every square below and/or to the right of that square.

For example, here’s one of the many ways you could eat a 3 x 5 chocolate bar while obeying the rule. The star at each step indicates the square chosen out of the chocolate bar, and the gray squares indicate which squares must also be eaten in order to comply with the above rule.

An image of a 3 x 5 grid with the 2 x 2 portion in the bottom right corner shaded in with a star in the top left corner of the shaded grid. Followed by an arrow pointing to a 3 x 5 grid missing the bottom right 2 x 2 square with the third column as well as the two squares in the top row to the right of the third column shaded. There is a star in the top left corner of this shaded area. Followed by a 3 x 2 grid where the bottom two rows are shaded and there is a star in the top left corner of this shaded square. Followed by a 1 x 2 which is completely shaded and there is a star in the left hand square.

The particular choice of the starred square at each step was completely arbitrary, but once a starred square is picked the choice of grayed-out squares is forced. You have to eat the starred square, plus each square that’s to the right of that square, below that square, or both.

The above route isn't only one way to eat the chocolate bar. Here’s another:

An image of a 3 x 5 grid where the two squares on the right of the bottom row are shaded with the left most shaded square containing a star. Followed by a 3 x 5 grid missing the two bottom right squares and the square in the fifth column in the middle row is shaded and contains a star. Followed by a 3 x 5 square where all remaining squares in columns 2, 3, 4, and 5 are shaded with a start in the square in the top row of column 2. Followed by a 3 x 1 grid where the bottom two squares are shaded with a star in the middle square. Followed by a single shaded square containing a star.

As before, there’s no particular pattern to how the starred squares were chosen, but once we know which square is starred the choice of gray squares is forced.

One more example. If you are so hungry you can’t wait, you could eat the chocolate bar this way:

A shaded 3 x 5 grid with a star in the square in the top left

This eats the entire bar in a single bite. Ah, gluttony.

Your task is to write a function

int waysToEat(int numRows, int numCols);

that returns the number of different ways you could eat a numRows × numCols chocolate bar while obeying the above rule. Your waysToEat function will almost certainly be a wrapper around a different recursive function that actually computes the result. Think about what information you need to keep track of and what type or types you might use to represent that information. Some notes on this problem:

  • You can assume that numRows and numCols cannot be negative and don’t need to worry about what happens in this case. However, numRows and/or numCols may be zero.
  • You are encouraged to write as many helper functions as you’d like. To save you time, you don’t need to write function prototypes.
  • The chocolate bar never flips or rotates; the square in the upper-left corner will stay there until you’ve eaten the whole bar.
  • You do not need to worry about efficiency.
  • Your solution must be recursive. That’s kinda what we’re testing here.

Here’s one possible solution. The recursive insight here is to represent the current state of the chocolate bar (we used a Grid<bool> for this – a square is true if it’s still there and false otherwise) and to consider, at each point, all the possible next bites we can take. So what should our base case be? This one is a bit subtle. If you have a fully eaten chocolate bar, how many ways are there to reduce it down to an empty chocolate bar? The answer is one – it’s already been reduced, so the action “do nothing” is a valid way to shrink it down to nothingness.

The remainder of the code here consists of helper functions that come up in the course of solving the problem. We need a way of checking if a Grid<bool> is entirely false, and we need a way to simulate taking a giant bite out of a Grid.

/* Is this chocolate bar fully eaten? */
bool isEmpty(const Grid<bool>& candyBar) {
    /* We could iterate over the whole grid to see if everything has been eaten,
     * but it’s easier to just check if the upper-left corner has been eaten.
     */
     return candyBar.isEmpty() || !candyBar[0][0];
}
/* Given a chocolate bar, give what would be left if you munched (row, col). */
Grid<bool> chompAt(const Grid<bool>& candyBar, int row, int col) {
  Grid<bool> result = candyBar;
  for (int r = row; r < candyBar.numRows(); r++) {
    for (int c = col; c < candyBar.numCols(); c++) { // Ooh! C++!
      result[r][c] = false;
    }
  }
  return result;
}
/* How many ways are there to eat this candy bar? */
int waysToEatRec(const Grid<bool>& candyBar) {
    /* Base case: All done? Then there's only one option, which is to do
     * nothing because there's nothing to eat.
     */
  if (isEmpty(candyBar)) {
    return 1;
  }
  /* Recursive case: try munching each spot. */
  int numWays = 0;
  for (int row = 0; row < candyBar.numRows(); row++) {
    for (int col = 0; col < candyBar.numCols(); col++) {
      if (candyBar[row][col]) {
        numWays += waysToEatRec(chompAt(candyBar, row, col));
      }
    }
  }
  return numWays;
}
int waysToEat(int numRows, int numCols) {
  Grid<bool> candyBar(numRows, numCols, true);
  return waysToEatRec(candyBar);
}

There are many other ways to represent the state of the chocolate bar. Some solutions used a Set<Vector<int>>, where each Vector<int> represents a single location (the first element is the row and the second is the column) and a Set<Vector<int>> represents all the uneaten locations. Other solutions used a single Vector<int>, where each entry represents a row with the value meaning how many squares are left in the row.

As a note, this function is painfully slow if you don’t use memoization. I encourage you to try this yourself! If you switch to a solution that involves memoization – say, by using a Map<Grid<bool>, int> – you’ll get answers back much, much faster!

Another note: the int type in C++ maxes out at around 2.7 billion on many systems, and the values this function produces get much larger than that very quickly. If you want to store much larger integers – ones that go up to approximately 1018, replace int with int64_t (it’s defined in <cstdint>).

Why we asked this question: We chose to include this question for several reasons. First, we wanted to give at least one “unguided” recursion problem, one where we gave the problem statement and not much else. We figured that would be a good capstone after the series of the previous problems (understand the recursive code given to you, then write a recursive function given the recursive insight, and finally write a recursive function without any specific instruction).

Second, we figured that this problem would be a good way to see how comfortable you were decomposing larger problems down into smaller pieces. There are several subtasks that need to be handled here: taking a bit out of the chocolate bar, checking if the bar is empty, and iterating over the bar to find all the places to bite next. Trying to do all of that in the context of a recursive function is tall order, but if you split the jobs apart into multiple smaller helper functions, it’s a lot more manageable.

Third, we wanted to give you a recursive problem that required you to aggregate information across multiple recursive calls in a way that wasn’t an exact match for what we’d done in class. Over the course of the assignments, you saw several recursive problems (Human Pyramids, Shift Scheduling) in which you needed to explore a space and return a single result that aggregates information together from multiple sources. One of the main tasks in this problem is figuring out how to combine the answers to the smaller recursive problems together into a larger answer. Our solution works by returning a single number, which we think is probably the easiest solution, though you could also solve this by passing an integer down by reference and incrementing it in the base case.

Finally, we thought that this problem was interesting in and of itself. It’s astonishing how many ways there are to eat candy bars given this rule. There’s 1,036,972 options! We actually don’t have a general mathematical formula for how many ways there are to eat an m × n chocolate bar, and it seems like finding a closed-form solution is fairly tricky. If you have a solid math background and know such a formula, please let us know!

Common mistakes: One of the most common mistakes we saw on this problem was incorrectly handling the case where the chocolate bar is empty. This is a subtle but important detail in the problem setup: how many ways are there to eat an empty chocolate bar?

Let’s reason by analogy. If we were writing a recursive function to list all permutations of a string, and we reached the base case of “there are no characters left in the string.” How many permutations should we return at this point? The answer would be one – it’s the permutation corresponding to the choices we’ve committed to so far. Similarly, imagine we wanted to list all subsets of a set, and we hit our base case of having an empty set of unconsidered elements in the set. How many sets would we print out? The answer would be one – the set corresponding to the decision we’ve made so far.

For the same reason, suppose you have an empty chocolate bar and you want to count all the ways to eat it. How many are there? The answer is one. You can see this in two ways:

  • The strategy of “do not eat anything” counts as one way to eat the chocolate bar.
  • If we’re keeping track of the sequence of steps we’ve taken to eat the chocolate bar and reach a point where no chocolate is left, then the sequence we’ve committed to forms one way to eat the chocolate bar.

Many solutions correctly identified the base case would be the empty chocolate bar but returned 0 rather than 1 in this case. Others had the logic completely correct in the recursive case but added special-case logic to a wrapper function to return 0 if the bar was already empty.

On the topic of the base case, some solutions correctly identified that they needed to stop when the chocolate bar was empty, but had in error in how they did so. Some solutions checked if the chocolate bar had dimensions 0 × 0, which wouldn’t account for the case where the chocolate bar was m × 0 or 0 × n. Others checked if the size of the chocolate bar was zero, but then represented the chocolate bar with a fixed-sized grid whose dimensions never changed across the recursion.

The next common class of mistakes we saw were errors in propagating information across the recursive calls. We saw many solutions that would make a recursive call to count how many ways there were to eat the chocolate bar after making some initial bite, but which then never looked at the return value from that function and therefore didn’t factor in that information into the overall total. Other solutions would return the result of the first recursive call made, meaning that the recursion would only explore a single branch and therefore not try out all the different possibilities. A few solutions used both an outparameter to store the overall result and used the return value to return the overall result – this introduces some redundancy into the code and increases the likelihood of making mistakes tracking those results.

We also encountered many solutions that didn’t properly track the shape of the chocolate bar. Many solutions purely represented the chocolate bar by tracking its dimensions, which would be fine if the chocolate bar was always a rectangle. However, the shape of the chocolate bar can get pretty complex pretty quickly, and just knowing the overall bounding rectangle dimensions doesn’t allow you to track which squares have and haven’t been eaten.


6) No More Counting Dollars, We'll Be…

You have a road network for a country represented as a Map<string, Set<string>>. Each key is the name of a city, and the associated value consists of all the cities that are adjacent to that city. You can assume that the road network is bidirectional: if A is adjacent to B, then B is adjacent to A. You can also assume no city is adjacent to itself.

A star is a cluster of four or more cities arranged as follows: there’s a single city in the center that’s connected to all the other cities in the cluster, and every other city in the cluster is only connected to the central city. Here are some stars shown below:

Image shows two node graphs. The first graph is made of 5 nodes, where there is one node in the center and nodes attached to the central node from all four cardinal directions. The second graph is made of 4 nodes. There is, one central node, one node attached to the central node from above, one node attached to the central node from the bottom right, and one node attached to the central node from the left

Stars commonly arise in road networks in smaller island nations: you have a main, central city (often acapital city) and a bunch of smaller, outlying towns.

Here are some examples of groups of cities that aren’t stars. The group on the left isn’t a star because two of the peripheral cities are connected to one another (and therefore not just the central city). The group in the center isn’t a star because one of the peripheral cities is connected to a city besides the central one. Finally, the group on the right isn’t a star because it doesn’t have the minimum required number of nodes.

The image contains three graphs. The first graph is made of 5 nodes, one central node with nodes attached to the central node from all for cardinal directions. The north and west nodes are also connected. The second graph has 6 nodes in total. 4 of the nodes are connected in a straight line. There is one node connected to the top of the node that is second from the right, and one node connected to the bottom of the node that is second from the right

If you have a country that consists of a large archipelago, you might find that its road network consists of multiple different independent stars. In fact, the number of stars in a road network is a rough proxy for how decentralized that country is.

Your task is to write a function

int countStarsIn(constMap<string, Set<string>>& network);

that takes as input the road network, then returns the number of stars in the network. You don’t need to worry about efficiency, but do be careful not to count the same star multiple times.

One way to solve this problem is to count the number of cities that are at the centers of stars – this ensures that each star gets counted once and exactly once. We can check whether a city is a star center by checking whether it has at least three neighbors and that each of those neighbors only has a single neighbor. Here’s one way to do this:

bool isStarCenter(const string& city,const Map<string, Set<string>>& network) {
    /* Stars need at least four cities, including the center. */
    if (network[city].size() < 3) {
      return false;
    }
    /* All cities aside from the center must just link to the center. */
    for (string neighbor: network[city]) {
      if (network[neighbor].size() != 1) {
        return false;
      }
    }
    return true;
}

int countStarsIn(const Map<string, Set<string>>& network) {
  int result = 0;
  for (string city: network) {
    if (isStarCenter(city, network)) {
      result++;
    }
  }
  return result;
}

Why we asked this question: This question was designed to get you playing around with container types (here, Map and Set) and nested containers while also touching on an interesting algorithmic problem. Specifically, we hoped you’d think about the problem abstractly (“how do I determine whether a cluster of items is a star when I’m working one element at a time?”), figure out a strategy that works, and then translate your solution into C++.


7) Proofreading a Report

You’ve been working on preparing a report as part of a larger team. Each person on the team has been tasked with writing a different section of the report. To make sure that the final product looks good and is ready to go, your team has decided to have each person in the team proofread a section that they didn’t themselves write.

There are a lot of ways to do this. For example, suppose your team has five members conveniently named A, B, C, D, and E. One option would be to have A read B’s section, B read C’s section, C read D’s section,D read E’s section, and E read A’s section, with everyone proofreading in a big ring. Another option would be to have A and B each proofread the other’s section, then have C proofread D’s section, D read E’s section, and E read C’s section. A third option would be to have A read E’s work, E read C’s work, and C read A’s work, then to have B and D proofread each other’s work. The only restrictions are that (1) each section needs to be proofread by exactly one person and (2) no person is allowed to proofread their ownwork.

Write a function

that lists off all ways that everyone can be assigned a person’s work to check so that no person is assigned to check their own work. For example, given the five people listed above, this function might print the following output:

A checks B, B checks A, C checks E, D checks C, E checks D
A checks B, B checks A, C checks D, D checks E, E checks C
A checks C, B checks A, C checks B, D checks E, E checks D
(... many, many lines skipped ...)
A checks C, B checks E, C checks D, D checks B, E checks A
A checks D, B checks C, C checks E, D checks B, E checks A
A checks E, B checks C, C checks D, D checks B, E checks A

Some notes on this problem:

  • You’re free to list the proofreading assignments in any order that you’d like. However, you should make sure that you don’t list the same assignment twice.
  • Your function should print all the arrangements it finds to cout. It shouldn’t return anything.
  • Your solution needs to be recursive – that’s kinda what we’re testing here.
  • While in general you don’t need to worry about efficiency, you should not implement this function by listing all possible permutations of the original group of people and then checking each one to see whether someone is assigned to themselves. That ends up being a bit too slow to be practical.
  • Your output doesn’t have to have the exact same format as ours. As long as you print out something that makes clear who’s supposed to proofread what, you should be good to go. In case it helps, youmay want to take advantage of the fact that you can use cout to directly print out a container class (Vector, Set, Map, Lexicon, etc.). For example, if you have a Map named myMap, the C++ code __cout « myMap « endl; __ prints out all the key/value pairs.
  • As a hint, focus on any one person in the group. You know that they’re going to have to proofread some section. Consider exploring each possible way they could do so.
void listAllProofreadingArrangements(const Set<string>& people);

There are many ways to approach this problem. Our solution works by keeping track of two groups – the group of people who haven’t proofread anything and the group of people who haven’t had their sections proofread – and at each point chooses a person and gives them a section to read. We try out all options except for those that directly assign a person their own section.

Here’s what this looks like in code:

void listAssignmentsRec(const Set<string>& unassignedReaders,
                        const Set<string>& unassignedSections,
                        const Map<string, string>& assignments) {
    /* Base Case: If everyone is assigned to read something, print out the
     * reading assignments.
     */
    if (unassignedReaders.isEmpty()) {
        cout << assignments << endl;
    }
    /* Recursive Case: Choose a reader and look at all ways to give them
     * something to read.
     */
    else {
        string reader = unassignedReaders.first();

        /* Loop over all sections that haven't yet been assigned. */
        for (string section: unassignedSections) {
            /* Don’t assign someone to read their own section. */
            if (section != reader) {
                /* Explore all options that include giving this person this
                 * section to read.
                 */
                auto newAssignments = assignments;
                newAssignments[reader] = section;

                listAssignmentsRec(unassignedReaders  - reader,
                                   unassignedSections - section,
                                   newAssignments);
            }
        }
    }
}

void listCheckingArrangements(const Set<string>& people) {
    listAssignmentsRec(people, people, {});
}

Why we asked this question: We chose this particular problem for a number of reasons. First, we included it because it’s closely related to the classic problem of listing permutations, which by this point we hoped you’d have a good handle on. Second, we liked how this problem forced you to think back to the intuition behind how to generate permutations (choose a position and think of all the things that could go into that position), since the problem structure was sufficiently different from the traditional permutations problem as a consequence of having the items stored in an inherently unordered structure. Third, we thought this problem would allow you to demonstrate what you’d learned about using extra arguments to recursive functions in order to keep track of the decisions made so far and deci-sions left to make; our solution requires two extra arguments to remember unassigned sections and unassigned people separately. Finally, we wanted to include this problem because it has nice mathematical properties; the assignments you’re asked to list here are called derangements and show up frequently in CS theory.

Common mistakes: We saw a number of mistakes on this problem that were focused on small details of the problem rather than the overall structure. A number of solutions correctly attempted to keep track of which sections weren’t read and who hadn’t read anything, but accidentally looped over the wrong one when trying to determine which section to assign or which person to assign something to. Some solutions did correctly list all the options, but did so by generating all permutations and then filtering out invalid ones at the last second (which was specifically prohibited by the problem statement). Other solutions modified a map or set while iterating over it, which leads to runtime errors.

If you had trouble with this problem: Our advice on how to proceed from here varies a bit based on what specific difficulty you were having. If you looked at this problem and didn’t recognize that you were looking at a permutations-type problem, we would recommend trying to get more practice looking at these sorts of problems and broadly categorizing them into subtypes like “permutation problems,” “subset problems,” etc. Section Handout 3 and Section Handout 4 have a number of good problems to work through that might be good spots to look for questions like those.

If you had trouble avoiding repeated assignments in the course of solving this problem, we’d recommend taking some time to look at the different recursive functions we’ve written in the past that enumerate options to see how their design prevents duplicates. Why, for example, does the “list all sub-sets” function we wrote when we first talked about recursion not list any subsets twice? Why does the “list all permutations” function list each option once? See if you can connect this back to recursion trees.

If you weren’t sure how to keep track of which people were assigned sections and which sections had been assigned readers, it might be worth practicing working with the decision tree model of recursive problem-solving. At each point in time, your goal is to make some sort of choice about what the next option will be. What information would you need to have access to in order to make that choice? What would the possible choices look like? A little extra practice with this skill can go a long way.


8) Avoiding Sampling Bias

One of the risks that comes up when conducting field surveys is sampling bias, that you accidentally survey a bunch of people with similar backgrounds, tastes, and life experiences and therefore end up with a highly skewed view of what people think, like, and feel. There are a number of ways to try to control for this. One option that’s viable given online social networks is to find a group of people of which no two are Facebook friends, then administer the survey to them. Since two people who are a part of some similar organization or group are likely to be Facebook friends, this way of sampling people ensures that you geta fairly wide distribution.

For example, in the social network shown below (with lines representing friendships), the folks shaded in gray would be an unbiased group, as no two of them are friends of one another.

There is a graph of 8 people, lets letter them A through H. Person A is unshaded and connected to Person B and Person C. Both Person B and Person C are shaded and both people are connected to Person D, but not to each other. Person D is unshaded. Person D is connected to Person E, Person F, and Person G. Person E and Person G are shaded, Person F is not shaded. BothPerson E and Person G are connected to Person F, but are not connected to each other. Person H is connected to Person F. Person H is shaded.

Your task is to write a function

Set<string> largestUnbiasedGroupIn(const Map<string, Set<string>>&network);

that takes as input a Map representing Facebook friendships (described later) and returns the largest group of people you can survey, subject to the restriction that you can’t survey any two people who are friends.

The network parameter represents the social network. Each key is a person, and each person’s associ-ated value is the set of all the people they’re Facebook friends with. You can assume that friendship issymmetric, so if person A is a friend of person B, then person B is a friend of person A. Similarly, you can assume that no one is friends with themselves.

Some other things to keep in mind:

  • You need to use recursion to solve this problem – that’s what we’re testing here.
  • Your solution must not work by simply generating all possible groups of people and then checking at the end which ones are valid (i.e. whether no two people in the group are Facebook friends). This approach is far too slow to be practical.

One way to solve this problem is to focus on a single person. If we choose this person, we’d want to pick, from the people who aren’t friends with that person, the greatest number of people that we can. If we exclude this person, we’d like to pick from the remaining people the greatest number of people that we can. One of those two options must be optimal, and it’s just a question of seeing which one’s better.

/**
 * Given a network and a group of permitted people to pick, returns the largest
 * group you can pick subject to the constraint that you only pick people that
 * are in the permitted group.
 *
 * @param network The social network.
 * @param permissible Which people you can pick.
 * @return The largest unbiased group that can be formed from those people.
 */
Set<string> largestGroupInRec(const Map<string, Set<string>>& network,
                              const Set<string>& permissible) {
    /* Base case: If you aren’t allowed to pick anyone, the biggest group you
     * can choose has no people in it.
     */
    if (permissible.isEmpty()) return {};

    /* Recursive case: Pick a person and consider what happens if we do or do not
     * include them.
     */
    string next = permissible.first();

    /* If we exclude this person, pick the biggest group we can from the set of
     * everyone except that person.
     */
    auto without = largestGroupInRec(network, permissible - next);

    /* If we include this person, pick the biggest group we can from the set of
     * people who aren’t them and aren’t one of their friends, then add the
     * initial person back in.
     */
    auto with = largestGroupInRec(network,
                                  permissible - next - network[next]) + next;

    /* Return the better of the two options. This uses the ternary conditional
     * operator ?:. The expression expr? x : y means "if expr is true, then
     * evaluate to x, and otherwise evaluate to y."
     */
    return with.size() > without.size()? with : without;
}

Set<string> largestUnbiasedGroupIn(const Map<string, Set<string>>& network) {
    Set<string> everyone;
    for (string person: network) {
        everyone += person;
    }
    return largestGroupInRec(network, everyone);
}

Why we asked this question: We included this problem for several different reasons. First, we wanted to include at least one “subset-type” recursion problem and figured that this particular problem would be a nice way to do this. Second, we liked how this problem connected back to the assignments: it’s along the lines of the Shift Scheduling problem from Assignment 3, in that you need to find a group of items that don’t conflict with one another. Third, we thought that the recursive optimization here was particularly elegant to code up once you had the right insights. You can solve this either by keep-ing track of what options are permissible (as we did here) or by which options are impermissible (track which items can’t be chosen again), and can code the solution up in a number of different ways. Finally, this problem is related to a famous problem called the maximum independent set problem, which has a rich history in computer science and shows up in all sorts of interesting practical and theoretical contexts.

Common mistakes: We saw a number of solutions that recognized that picking someone precluded any future call from choosing any of that person’s friends, but which forgot to also mark that the chosen person couldn’t be included in future recursive calls (oops!), which is fairly easy to correct but an easy mistake to make.

If you had trouble with this problem: As with Problem Three, our advice on what the next steps should be if you had trouble with this particular problem depend on which aspect of the problem you were having trouble with.

If, fundamentally, you didn’t recognize that this problem was a subset-type problem, then your first step should probably be to just get a little bit more practice working with recur-sion problems and trying to recognize their structure. You might benefit from revisiting some of the older section problems (Section Handout 3 and Section Handout 4 have a bunch of great recursion problems on them. Without looking at the solutions, try to see if you can predict which of them use a subsets-type recursion, which use a permutations-type recursion, and which use something altogether different. Or go back to Assignment 3 and make sure you have a good understanding about which problems fall into which types.

If you had some trouble keeping track of the decisions you’d made so far and how they interacted with the future (that is, making sure you didn’t pick the same thing twice), you may want to get a bit more practice converting between the different container types. This problem is a lot easier to solve if you have the right data structure to remember what’s al-ready been seen and what hasn’t yet been considered. As a general rule, try to find a way to solve the problems you’re working on that involves the minimal amount of “fighting with the code,” even if that means doing some sort of conversion step beforehand.


9) Predecessor Maps

Google Translate is powered, in large part, by a technique called word2vec that builds an understanding of a language by looking at the context in which each word occurs. ​

Imagine you have a word like “well.” There are certain words that might reasonably appear immediately before the word “well” in a sentence, like “feeling,” “going,” “reads,” etc., and some words that that are highly unlikely to appear before “well,” like “cake,” “circumspection,” and “election.” The idea behind word2vec is to find connections between words by looking for pairs of words that have similar sets of words preceding them. Those words likely have some kind of connection between them, and the rest of the logic in word2vec works by trying to discover what those connections are. ​

Your task is to write a function Map<string, Lexicon> predecessorMap(istream& input); that takes as input an istream& containing the contents of a file, then returns a Map<string, Lexicon> that associates each word in the file with all the words that appeared directly before that word. For example, given JFK’s quote “Ask not what your country can do for you; ask what you can do for your country,” your function should return a Map with these key/value pairs:

"not" : { "ask" }
"what" : { "not", "ask" }
"your" : { "what", "for" }
"country" : { "your" }
"can" : { "country", "you" }
"do" : { "can" }
"for" : { "do" }
"you" : { "for", "what" }
"ask" : { "you" }

Notice that although the word “ask” appears twice in the quote, the first time it appears it’s the first word in the file and so nothing precedes The second time it appears, it’s preceded by some whitespace and a semicolon, but before that is the word “you,” which is what ultimately appears in the Lexicon. ​ Some notes on this problem:

  • You can assume that a token counts as a word if its first character is a letter. You can use the isalpha function to check if a character is a letter.
  • You can assume you have access to the function Vector<string> tokenize(const string& input); that we provided you in Assignment 3.
  • Your code should be case-insensitive, so it should return the same result regardless of the capitalization of the words in the file. The capitalization of the keys in the map is completely up to you.
  • Your code should completely ignore non-word tokens (whitespace, punctuation, quotation marks, etc.) and just look at the words it encounters.
  • It is not guaranteed that the file has any words in it, and there’s no upper bound on how big the file can be.

One way to solve this problem is to keep track of the most-recently-seen word and to use that to help fill in the predecessor map. Here’s a solution that uses this strategy:

Map<string, Lexicon> predecessorMap(istream& input) {
    Map<string, Lexicon> result;
    string lastWord; // Most-recently-read string, initially empty as a sentinel.

    /* The canonical "loop over the lines of a file" loop. */
    for (string line; getline(input, line); ) {
        for (string token: tokenize(line)) {
            /* For consistency, we convert everything to lower-case. */
            token = toLowerCase(token);

            /* If this isn’t a word, we don’t care about it. */
            if (!isalpha(token[0])) {
                continue;
            }

            /* Update the predecessor map, unless this is the first word. */
            if (lastWord != "") {
                result[token].add(lastWord);
            }
            lastWord = token;
        }
    }
    return result;
}

Why we asked this question: This question was designed to explore basic C++ constructs (nested loops, iterating over the contents of a file, etc.), container types (specifically, Maps, Vectors, and Lexi-cons), and general problem-solving (how do you track the information needed to build the result up?). This problem is somewhat related to the Plotter from Assignment 1 – you have to loop over the con-tents of a file while keeping track of some state that persists across loop iterations.


10) The Core

Social networks – Facebook, LinkedIn, Instagram, etc. – are popular because they’re popular. The more people that are on a network, the more valuable it is to join that network. This means that, generally speaking, once a network starts growing, it will tend to keep growing. At the same time, this means that if many people start leaving a social network, it will tend to cause other people to leave as well.

Researchers who study social networks, many of whom work in our CS department, have methods to quantify how resilient social networks are as people begin exiting. One way to do this is to look at the k-core of the network. Given a number k, you can find the k-core of a social network as follows:

  • If everyone has at least k friends on the network, then the k-core consists of everyone on that network.
  • Otherwise, there’s some person who has fewer than k friends on the network. Delete them from the network and repeat this process.

For example, here’s how we’d find the 3-core of the social network shown to the left. At each step, we pick a person with fewer than three friends in the network (shown in red) and remove them. We stop once everyone has three or more friends, and the remaining people form the 3-core.

There are three graphs. The first graph has 6 people, who we will refer to as A-F. Person B is only friends with Person A and Person D and thus is highlighted in red. In the second graph, Person B has been removed. Person A is now highlighted in red as their only remaining friends are Person C and Person D. In the third graph, Person A has been removed so that now only C, D, E and F remain. Person C is friends with all three remaining people. Person D is friends with all 3 remaining people. Person E is friends with all 3 remaining people. Person F is friends with all three remaining people. Since they are all friends, no one is highlighted in red and thus noone is removed.

Intuitively, the k-core of a social network represents the people who are unlikely to leave – they’re tightly integrated into the network, and the people they’re friends with are tightly integrated as well.

Your task is to write a function:

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k);

that takes as input a social network (described below) and a number k, then returns the set of people in the k-core of that network.

Here, the social network is represented as a Map<string, Set<string>>, where each key represents a person and each value is the set of people they’re friends with. You can assume that each person listed as a friend actually exists in the network and that friendship is symmetric: if A is friends with B,then B is also friends with A.

Some notes on this problem:

  • You’re free to implement this one either iteratively or recursively. It’s up to you to decide.
  • Normally, we’d ask you to do error-checking, but for the purposes of this problem you can assumethat k ≥ 0 and you don’t need to validate this.
  • It’s possible that someone in the network has no friends. That person would be represented as a key in the map associated with an empty set.
  • You can assume no one is their own friend.
  • While you don’t need to come up with the most efficient solution possible, you should avoid solution routes that are unnecessarily slow. For example, don’t list off all possible sets of people, then check whether each of them is the k-core.
  • In case it helps, you can assume no person’s name is the empty string.

Here’s one possible way to solve this problem. This one works by repeatedly finding a person without at least k friends left in the core, then removing them from the core.

Set<string> allPeopleIn(const Map<string, Set<string>>& network) {
    Set<string> result;
    for (string person: network) {
        result += person;
    }
    return result;
}
Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    /* Initially, assume everyone is in the core. */
    auto core = allPeopleIn(network);

    while (true) {
        string lonelyPerson;
        bool isLonelyPerson = false;

        for (string person: core) {
            /* See how many people they're friends with who haven't yet been
             * filtered out.
             */
            if ((network[person] * core).size() < k) {
                lonelyPerson = person;
                isLonelyPerson = true;
                break;
            }
        }

        if (!isLonelyPerson) break;
        core -= lonelyPerson;
    }

    return core;
}

Here’s another solution. This one follows the same basic principle as the previous solution, except that instead of maintaining a set of people in the core and leaving the social network unchanged, this one actively modifies the social network whenever someone is removed.

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    auto coreNet = network;

    while (true) {
        string lonelyPerson;
        bool isLonelyPerson = false;

        for (string person: coreNet) {
            /* See how many people they're friends with who haven't yet been
             * filtered out.
             */
            if (coreNet[person].size() < k) {
                lonelyPerson = person;
                isLonelyPerson = true;
                break;
            }
        }

        if (!isLonelyPerson) break;

        /* Remove this person from the network, and make sure the remaining
         * people don't consider them a friend.
         */
        for (string acquaintance: coreNet[lonelyPerson]) {
            coreNet[acquaintance] -= lonelyPerson;
        }
        coreNet.remove(lonelyPerson);
    }

    /* Build a set of all the remaining people. */
    Set<string> result;
    for (string member: coreNet) {
        result += member;
    }

    return result;
}

Here’s another route that works recursively. It’s essentially the first solution, but written recursively rather than iteratively.

Set<string> kCoreOfRec(const Map<string, Set<string>>& network, int k,
                       const Set<string>& remainingFolks) {
    string lonelyPerson;
    bool isLonelyPerson = false;

    for (string person: remainingFolks) {
        /* See how many people they're friends with who haven't yet been
         * filtered out.
         */
        if ((network[person] * remainingFolks).size() < k) {
            lonelyPerson = person;
            isLonelyPerson = true;
            break;
        }
    }

    /* Base case: If everyone has at least k friends, the remaining people form
     * the core.
     */  
    if (!isLonelyPerson) return remainingFolks;

    /* Recursive case: Otherwise, the lonely person isn't in the core, and we
     * want the core of what's left if we remove them.
     */
    return kCoreOfRec(network, k, remainingFolks - lonelyPerson);
}

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    Set<string> everyone;
    for (string person: network) {
        everyone += person;
    }
    return kCoreOfRec(network, k, everyone);
}

This next solution uses a different perspective on the problem. For starters, note that no one in the network with fewer than k friends can possibly be in the k-core. So we could begin by making a candi-date k-core in the following way: build up a smaller network of people who purely have k or more friends in the original network. When doing this, we might find that some of those people no longer have k or more friends, since some of their friends might not have been copied into the network. We therefore repeat this process – copying over the people in the new network with k or more friends – until we converge on the k-core. We could do this either iteratively or recursively, and just for fun I’ve written it recursively here:

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    /* Build up a new social network consisting of everyone with k or more
     * friends. For simplicity later on, we're going to make both a new Map
     * representing the network and a new Set of the people in that network.
     */
    Map<string, Set<string>> coreNetwork;
    Set<string> core;

    /* Copy over the people with at least k friends. */
    for (string person: network) {
        if (network[person].size() >= k) {
            coreNetwork[person] = network[person];
            core += person;
        }
    }

    /* If we copied everyone over, great! The set of everyone in the network is
     * the k-core.
     */
    if (core.size() == network.size()) return core;

    /* Otherwise, someone didn't make it. We need to therefore take the new
     * social network and filter down the friend lists purely to the people in
     * the new network.
     */
    for (string person: coreNetwork) {
       coreNetwork[person] *= core; // Intersect friends with people in the core
    }

    return kCoreOf(coreNetwork, k);
}

Why we asked this question: We included this question as practice working with the fundamental container types that we’ve used over the quarter (here, Maps and Sets). You’ve used these types in a number of contexts, and we figured this would be a great place for you to demonstrate what you’d learned along the way. We also included this problem to give you practice breaking a larger task down into smaller, more manageable pieces and translating high-level intuitions into code.

Plus, we thought that this concept from social network analysis was sufficiently interesting that it was worth sharing with you!

Common mistakes: By far the most common mistake on this problem was to only consider each per-son once when deciding whether to remove them. To illustrate why this is an issue, suppose you find a person in the network who has fewer than k friends and therefore needs to be removed from the net-work. That in turn might mean that someone who previously had k or more friends no longer does, and therefore needs to get removed as well. If you make a single pass over the network and have al-ready checked that person, you’ll miss that they now need to be removed. In other words, it’s not enough to just find everyone with fewer than k friends and remove them; you have to then look at the resulting network and repeat this process until it stabilizes on the k-core.

Another issue we saw on this problem had to do with maintenance of the social network after removing a person. Suppose you remove a person from the social network. This requires two steps to exe-cute properly: first, you have to remove them as a key from the network; second, you have to remove them from each of their friends’ associated Sets. If you forget to do this – or don’t do something equivalent – then you can end up in a situation where there’s a person in the network who believes they’re friends with more people than they actually are. Think of it this way – people leave social networks because their friends leave; if you have a friend who leaves a network and you don’t realize they’ve left, their departure is unlikely to influence you.

Another issue we saw, which was somewhat subtle, had to do with modifying collections when iterating over them. If you are iterating over a Map or Set using a range-based for loop and you remove an element from the Map or Set, you will trigger an error and cause the loop to stop operating properly. We thought we’d point this out because modifying collections this way causes problems in most programming languages.

If you had trouble with this problem: This problem is mostly about syncing your intuition for what a piece of code should do with the code itself. If you missed out on the nuances above – either by forgetting to make multiple passes over the network or by forgetting to update the network – there are a couple strategies you can use to improve. The first would be to draw lots of pictures, one of the major themes from this quarter. Once you have a draft of the code, pick a sample network and see what happens when you try your code out on it. Since Maps and Sets are unordered containers, you might ask what happens if you visit the people in the network in different orders. Do you get back the answer you intended to get? Or does something else happen?

The second would be to just get more practice and reps with your coding skills. Work through some of the older section problems from Section Handout 2 or the chapter exercises from the textbook’s sections on collections. There are some great exercises in there, and blocking out the time to work on them can really make a difference.Here’s one possible way to solve this problem. This one works by repeatedly finding a person without at least k friends left in the core, then removing them from the core.

Set<string> allPeopleIn(const Map<string, Set<string>>& network) {
    Set<string> result;
    for (string person: network) {
        result += person;
    }
    return result;
}
Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    /* Initially, assume everyone is in the core. */
    auto core = allPeopleIn(network);

    while (true) {
        string lonelyPerson;
        bool isLonelyPerson = false;

        for (string person: core) {
            /* See how many people they're friends with who haven't yet been
             * filtered out.
             */
            if ((network[person] * core).size() < k) {
                lonelyPerson = person;
                isLonelyPerson = true;
                break;
            }
        }

        if (!isLonelyPerson) break;
        core -= lonelyPerson;
    }

    return core;
}

Here’s another solution. This one follows the same basic principle as the previous solution, except that instead of maintaining a set of people in the core and leaving the social network unchanged, this one actively modifies the social network whenever someone is removed.

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    auto coreNet = network;

    while (true) {
        string lonelyPerson;
        bool isLonelyPerson = false;

        for (string person: coreNet) {
            /* See how many people they're friends with who haven't yet been
             * filtered out.
             */
            if (coreNet[person].size() < k) {
                lonelyPerson = person;
                isLonelyPerson = true;
                break;
            }
        }

        if (!isLonelyPerson) break;

        /* Remove this person from the network, and make sure the remaining
         * people don't consider them a friend.
         */
        for (string acquaintance: coreNet[lonelyPerson]) {
            coreNet[acquaintance] -= lonelyPerson;
        }
        coreNet.remove(lonelyPerson);
    }

    /* Build a set of all the remaining people. */
    Set<string> result;
    for (string member: coreNet) {
        result += member;
    }

    return result;
}

Here’s another route that works recursively. It’s essentially the first solution, but written recursively rather than iteratively.

Set<string> kCoreOfRec(const Map<string, Set<string>>& network, int k,
                       const Set<string>& remainingFolks) {
    string lonelyPerson;
    bool isLonelyPerson = false;

    for (string person: remainingFolks) {
        /* See how many people they're friends with who haven't yet been
         * filtered out.
         */
        if ((network[person] * remainingFolks).size() < k) {
            lonelyPerson = person;
            isLonelyPerson = true;
            break;
        }
    }

    /* Base case: If everyone has at least k friends, the remaining people form
     * the core.
     */  
    if (!isLonelyPerson) return remainingFolks;

    /* Recursive case: Otherwise, the lonely person isn't in the core, and we
     * want the core of what's left if we remove them.
     */
    return kCoreOfRec(network, k, remainingFolks - lonelyPerson);
}

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    Set<string> everyone;
    for (string person: network) {
        everyone += person;
    }
    return kCoreOfRec(network, k, everyone);
}

This next solution uses a different perspective on the problem. For starters, note that no one in the network with fewer than k friends can possibly be in the k-core. So we could begin by making a candi-date k-core in the following way: build up a smaller network of people who purely have k or more friends in the original network. When doing this, we might find that some of those people no longer have k or more friends, since some of their friends might not have been copied into the network. We therefore repeat this process – copying over the people in the new network with k or more friends – until we converge on the k-core. We could do this either iteratively or recursively, and just for fun I’ve written it recursively here:

Set<string> kCoreOf(const Map<string, Set<string>>& network, int k) {
    /* Build up a new social network consisting of everyone with k or more
     * friends. For simplicity later on, we're going to make both a new Map
     * representing the network and a new Set of the people in that network.
     */
    Map<string, Set<string>> coreNetwork;
    Set<string> core;

    /* Copy over the people with at least k friends. */
    for (string person: network) {
        if (network[person].size() >= k) {
            coreNetwork[person] = network[person];
            core += person;
        }
    }

    /* If we copied everyone over, great! The set of everyone in the network is
     * the k-core.
     */
    if (core.size() == network.size()) return core;

    /* Otherwise, someone didn't make it. We need to therefore take the new
     * social network and filter down the friend lists purely to the people in
     * the new network.
     */
    for (string person: coreNetwork) {
       coreNetwork[person] *= core; // Intersect friends with people in the core
    }

    return kCoreOf(coreNetwork, k);
}

Why we asked this question: We included this question as practice working with the fundamental container types that we’ve used over the quarter (here, Maps and Sets). You’ve used these types in a number of contexts, and we figured this would be a great place for you to demonstrate what you’d learned along the way. We also included this problem to give you practice breaking a larger task down into smaller, more manageable pieces and translating high-level intuitions into code.

Plus, we thought that this concept from social network analysis was sufficiently interesting that it was worth sharing with you!

Common mistakes: By far the most common mistake on this problem was to only consider each per-son once when deciding whether to remove them. To illustrate why this is an issue, suppose you find a person in the network who has fewer than k friends and therefore needs to be removed from the net-work. That in turn might mean that someone who previously had k or more friends no longer does, and therefore needs to get removed as well. If you make a single pass over the network and have al-ready checked that person, you’ll miss that they now need to be removed. In other words, it’s not enough to just find everyone with fewer than k friends and remove them; you have to then look at the resulting network and repeat this process until it stabilizes on the k-core.

Another issue we saw on this problem had to do with maintenance of the social network after removing a person. Suppose you remove a person from the social network. This requires two steps to exe-cute properly: first, you have to remove them as a key from the network; second, you have to remove them from each of their friends’ associated Sets. If you forget to do this – or don’t do something equivalent – then you can end up in a situation where there’s a person in the network who believes they’re friends with more people than they actually are. Think of it this way – people leave social networks because their friends leave; if you have a friend who leaves a network and you don’t realize they’ve left, their departure is unlikely to influence you.

Another issue we saw, which was somewhat subtle, had to do with modifying collections when iterating over them. If you are iterating over a Map or Set using a range-based for loop and you remove an element from the Map or Set, you will trigger an error and cause the loop to stop operating properly. We thought we’d point this out because modifying collections this way causes problems in most programming languages.

If you had trouble with this problem: This problem is mostly about syncing your intuition for what a piece of code should do with the code itself. If you missed out on the nuances above – either by forgetting to make multiple passes over the network or by forgetting to update the network – there are a couple strategies you can use to improve. The first would be to draw lots of pictures, one of the major themes from this quarter. Once you have a draft of the code, pick a sample network and see what happens when you try your code out on it. Since Maps and Sets are unordered containers, you might ask what happens if you visit the people in the network in different orders. Do you get back the answer you intended to get? Or does something else happen?

The second would be to just get more practice and reps with your coding skills. Work through some of the older section problems from Section Handout 2 or the chapter exercises from the textbook’s sections on collections. There are some great exercises in there, and blocking out the time to work on them can really make a difference.


11) Cosmic Care Packaages

At present, it costs around $5,000 to place one kilogram of material into orbit. This means that if you’re floating on the International Space Station, you probably shouldn’t expect that much out of your meals. They’re calibrated to be nutritionally complete and lightweight, and while there’s some effort made to provide nice things like “flavor” and “texture,” that’s not always possible.

Let’s imagine that you want to be nice and send a cosmic care package up to the ISS with some sweets for the crew to share. Every penny matters at five grand a kilo, so you’ll want to be very sure that what you send up into the vast abyss will actually make the folks there happy. You have a list of the desserts that each individual ISS crew member would enjoy. What’s the smallest collection of treats you could send up that would ensure everyone gets something they like?

For example, suppose the ISS crew members have the following preferences, with each row representing one person’s cravings:

{ "Pumpkin pie", "Revani", "Sufganiyot" }
{ "Castella", "Kheer", "Melktert" }
{ "Po'e", "Tangyuan", "Tres leches" }
{ "Apfelstrudel", "Tangyuan" }
{ "Alfajores", "Brigadeiro", "Revani" }
{ "Baklava", "Revani" }
{ "Melktert" }
{ "Brownies", "Cannolis", "Castella", "Po'e", "Revani" }
{ "Cendol", "Kashata", "Tangyuan" }
{ "Baklava", "Castella", "Kheer", "Revani" }

The smallest care package that would cheer up the whole crew would have three items: revani, melktert,and tangyuan. Anything smaller than this will leave someone unhappy.

Write a function:

Set<string> smallestCarePackageFor(const Vector<Set<string>>& prefer-ences);

that takes as input a list containing sets of what treats each crew member would like, then returns the smallest care package that would make everyone on the ISS happy.

While in principle you could solve this problem by listing off all possible subsets of treats and seeing which ones satisfy everyone’s sweet tooths, this approach is probably not going to be very fast if there area lot of different choices for sweets. Instead, use the following approach: pick a person who hasn’t yet had anything sent up that would make them happy, then consider each way you could send them something that would fill them with joy.

Some notes on this problem:

  • Yep, you guessed it! You have to do this one recursively.
  • To clarify, if you send up some item to space, you can assume there’s enough of it for everyone on the space station to share. Sending up revani, for example, means sending up enough revani for each crew member to have a piece.
  • Each person’s preferences will contain at least one item. There is no fixed upper size to how many preferences each person can have.
  • If there are multiple care packages that are tied for the smallest, you can return any of them.
  • In case it helps, you can assume no treat’s name is the empty string.

Here’s one possibility. This solution works by trying all ways of providing the first person something they like, removing from the list of people everyone who now has something they like, then trying to cover the remaining people as efficiently as possible.

Set<string> smallestCarePackageFor(const Vector<Set<string>>& preferences) {
    /* Base case: If there are no people, we need no treats. */
    if (preferences.isEmpty()) {
        return {};
    }

    /* Track the best option so far. */
    Set<string> result; // Empty set is a sentinel.

    /* Otherwise, pick a person, then try all ways of sending them something. */
    for (string option: preferences[0]) {
        /* Try sending this option. That will cover everyone who also likes this
         * option.
         */
        Vector<Set<string>> remaining;
        for (auto person: preferences) {
            if (!person.contains(option)) {
                remaining += person;
            }
        }

        /* See what this would require us to send. That’s what’s needed to cover
         * everyone else, plus this option.
         */
        auto bestWithThis = smallestCarePackageFor(remaining) + option;

        /* See if this is better than the best option so far. */
        if (result.isEmpty() || result.size() > bestWithThis.size()) {
            result = bestWithThis;
        }
    }

    return result;
}

Here’s another option. This works by explicitly keeping track of what treats we’ve sent up. We then look at the first person and either (1) do nothing, because they’re already covered, or (2) send something for them, because they aren’t.

/* Given we've already picked set of treats chosen in soFar, what is the smallest
 * care package we can send that makes everyone from position nextPerson and
 * onward happy?
 */
Set<string> smallestCarePackageRec(const Vector<Set<string>>& preferences,
                                   int nextPerson,
                                   const Set<string>& soFar) {
    /* Base case: If we've satisfied everyone, return our decisions so far. */
    if (nextPerson == preferences.size()) return soFar;

    /* Base case: If this person is already happy, we don't need to do anything
     * for them.
     */
    for (string treat: soFar) {
        if (preferences[nextPerson].contains(treat)) {
            return smallestCarePackageRec(preferences, nextPerson + 1, soFar);
        }
    }

    /* Recursive case: They're not happy. Try all ways of making them happy and
     * take the best one.
     */
    Set<string> best; // Initially empty; this is a sentinel.

    for (string treat: preferences[nextPerson]) {
        auto bestWithThis = smallestCarePackageRec(preferences, nextPerson + 1,
                                                   soFar + treat);
        if (best.isEmpty() || best.size() > bestWithThis.size()) {
            best = bestWithThis;
        }
    }

    return best;
}
Set<string> smallestCarePackageFor(const Vector<Set<string>>& preferences) {
    return smallestCarePackageRec(preferences, 0, {});
}

Why we asked this question: We included this question for a number of reasons. First, we wanted to give you a chance to show us what you’d learned about recursive optimization. Although you haven’t seen this exact recursive approach on the assignments, you have seen an optimization problem with a similar setup (Shift Scheduling, where the goal was to find the best set of some type), and we hoped this would be a venue for you to demonstrate what you’d learned in the course of solving that problem.

Second, we wanted to give you an opportunity to demonstrate your skill in translating an abstract description of a recursive solution – here, “pick an unsatisfied person and try all ways of sending them treats” – into an actual recursive tree exploration.

Common mistakes: We saw three general categories of mistakes on this problem. First, there were issues associated with the recursion tree. For example, many solutions worked by picking a person and then trying every way of sending something up for them, but didn’t account for the fact that the person might already be covered. This doesn’t produce wrong answers, but it does dramatically increase the amount of work that needs to be done. Specifically, the decision tree gets much bigger, both because there are more decisions to make (we have to consider how to cover people who were already covered) and because the branching factor is higher (we explore many options that might not have been explored had we recognized that the person was covered).

Similarly, we saw some solutions that did a double-for loop inside the recursive step, once over people and once over treats, with the idea of trying all ways of picking a person and then all ways of sending something for them. This approach, again, isn’t incorrect, but it’s highly inefficient. Think of the shape of the recursion tree in this case – this says that, at each point in the tree, you have to pick both a per-son and a treat, so the branching factor gets much, much higher, dramatically increasing the amount of work that needs to be done.

We also saw some solutions that basically followed the outline of the code above, but which added an extra branch inside the for loop by trying out two options – both giving the person the specific treat and not giving the person that treat. This can potentially introduce some issues downstream. For ex-ample, what happens if you always choose not to give treats? This risks not covering everyone and requires extra logic for validation at the end of the recursion. But, more importantly, it introduces a large number of unnecessary branches into the recursion tree. Think of it this way – the question is not “do I give this treat to this person?,” but rather “which treat do I give this person?” By looking at things the first way, you dramatically increase the number of cases to check, since it turns what would normally be a choice-based recursion (“which one do I pick?”) into a subsets-based recursion (“which combination of these do I pick?”)

The next category of errors we saw were issues with implementing recursive optimization. Many solutions included a return statement inside of the loop over all treats, along the lines of what’s shown here:

for (string treat: preferences[person]) {
    /* … mirth and whimsy … */           

    return someSolution;                 
}                                        

The problem here is that this return statement prevents the for loop from running multiple iterations. It’ll stop as soon as the first one finishes, returning whatever solution was produced there. This is a common error in recursive problem-solving, and it’s important one to keep an eye out for down the line.

Many of you realized, correctly, that you need to have some logic to keep track of what the best care package is. Our solution uses the empty set as a sentinel, but other strategies included having a separate variable for the size of the set and initially giving it some large value (say, INT_MAX). While many of these approaches work, some solutions attempted to address this issue in a way that didn’t work correctly. Some solutions left the set uninitialized and forgot that this would make the set appear to be extremely small. Other solutions tried to give the Set an illegal value (say, -1), which wouldn’t compile.

The last major class of error we saw was, unfortunately, not using the strategy required by the problem. Many solutions approached this problem as a pure subsets problem, which we’d said in the problem statement was too slow to be an effective solution. It was difficult for us to award partial credit to those solutions because by going down that route, the solution missed out on several of the specific details we were looking for (finding an uncovered person, tracking the best set across all iterations of the loop, etc.).

If you had trouble with this problem: The steps to take to improve if this problem tripped you up will depend on what specific areas were giving you difficulty.

If you had trouble figuring out the shape of the recursion tree, start off by identifying how you approach these sorts of recursion problems. If you’re primarily approaching these problems by asking “is this subsets, combinations, or permutations?,” then you’ve made quite a bit of progress from where we’ve started (great!), but may need a bit more practice to handle questions that generalize beyond these patterns. Start off by looking back over the decision trees for those patterns. What do those trees look like? How do we explore those trees in code? Can you see how the specific code patterns you’re used to for those patterns follow from the tree? Once you’ve done that, take a second stab at this problem. Draw the recursion tree – or at least, a small piece of the tree. If you’re able to do that, great! If not, come talk to us in office hours or in the CLaIR. Once you have the tree, try writing the code for this one a second time.

If you had an issue with returning too early, or you accidentally mishandled the sentinel value that occurs inside the loop, we recommend the following. As you’re writing out a recursive function, there’s a good deal of work to do to figure out what the recursive calls should be simply to enumerate all the options. But, fundamentally, these optimization-type problems boil down to “call some function lots of times and return the best one.” Pretend, for a minute, that you’re not writing a recursive function and that the call you’re making is to some totally unrelated function. Then ask: if I wanted to capture the best value returned by this function, what would I do? Decoupling the recursion bit from the optimization bit might make things a lot easier to solve.


12) Mmmmm… Pancakes!

Bill Gates is famous for a number of things – for his work at Microsoft, for his wide-scale philanthropy through the Bill and Melinda Gates Foundation, and, oddly enough, for a paper he co-authored about flip-ping pancakes.

Imagine you have a stack of pancakes of different diameters. In an ideal pancake stack, the kind of pancake stack you’d see someone photograph and put on Instagram, the pancakes would be stacked in decreasing order of size from bottom to top. That way, all the delicious syrup and melted butter you put ontop of the pancakes can drip down onto the rest of the stack. Otherwise, you’ll have a bigger pancake ontop of a smaller one, serving as an edible umbrella that shields the lower pancakes from all the deliciousness you’re trying to spread onto them.

The problem with trying to get a stack of pancakes in sorted order is that it’s really, really hard to pull a single pancake out of a stack and move it somewhere else in the stack. On the other hand, it’s actually not that difficult to slip a spatula under one of the pancakes, pick up that pancake and all the pancakes above it, then flip that part of the stack over. For example, in the illustration below, we can flip the top three pancakes in the stack over, then flip the top five pancakes of the resulting stack over:

Three stacks of pancakes. The first stack has the largest pancakes on the bottom, the smallest pancakes in the middle, and the rest of the pancakes on the top with the largest of the medium sized pancakes at the bottom of the top group of pancakes and the smallest of the medium sized pancakes at the very top of the stack. The middle stack of pancakes shows what happens after flipping the medium sized pancakes upside down. The bottom two pancakes are still the largest, and now the rest of the stack increasingly gets larger as you go up. The last stack is perfectly ordered with the largest pancake on the bottom, and the smallest pancake on top.

Your task is to write a function

bool canSortStack(Stack<double> pancakes, int numFlips, Vector<int>& flipsMade);

that accepts as input a Stack<double> representing the stack of pancakes (with each pancake represented by its width in cm) and a number of flips, then returns whether it’s possible to get the stack of pancakes into sorted order making at most the specified number of flips. If so, your function should fill in the flipsMade out parameter with the sequence of flips that you made, with each flip specified as how many pancakes off the top of the stack you flipped. For example, the above illustration would correspond to the flip sequence {3, 5}, since we first flipped the top three pancakes, then the top five pancakes.

Here are some notes on this problem:

  • Treat this as a backtracking problem – try all possible series of flips you can make and see whether you can find one that gets everything into sorted order. As a result, don’t worry about efficiency.
  • Your solution must be recursive – again, that’s what we’re trying to test.
  • You can assume the pancakes all have different widths and that those widths are positive.
  • The Stack here is ordered the way the pancakes are – the topmost pancake is at the top of the stack, and the bottommost pancake is at the bottom.

You can assume flipsMade is empty when the function is first called, and its contents can be whatever you’d like them to be if you can’t sort the pancake stack in the given number of flips.

The key idea behind this problem is to follow the hint suggested and to try all possible flips at each point in time. If we ever get the stack sorted, great! We’re done. If we’re out of moves, then we report failure.

/**
 * Given a stack of pancakes, reports whether it’s sorted.
 *
 * @param pancakes The stack of pancakes.
 * @return Whether it’s sorted.
 */
bool isSorted(Stack<double> pancakes) {
    double last = -1; // No pancake has negative size, fortunately!
    while (!pancakes.isEmpty()) {
        if (last > pancakes.peek()) return false;
        last = pancakes.pop();
    }
    return true;
}

/**
 * Given a stack of pancakes and a number of pancakes to flip, returns the stack
 * you’d get by flipping that many pancakes off the top of the stack.
 *
 * @param pancakes The initial stack of pancakes.
 * @param toFlip How many pancakes to flip.
 * @return The resulting stack.
 */
Stack<double> flipTopOf(Stack<double> stack, int toFlip) {
    Queue<double> stash;
    for (int i = 0; i < toFlip; i++) {
        stash.enqueue(stack.pop());
    }
    while (!stash.isEmpty()) {
        stack.push(stash.dequeue());
    }
    return stack;
}

bool canSortStack(Stack<double> pancakes, int numFlips, Vector<int>& flipsMade) {
    /* Base case 1: If the stack is already sorted, we’re done! */
    if (isSorted(pancakes)) return true;

    /* Base case 2: If we’re out of flips, we fail! */
    if (numFlips == 0) return false;

    /* Otherwise, try all possible flips. We skip the option of flipping just the
     * top pancake, since that doesn’t actually accomplish anything.
     */
    for (int i = 2; i <= pancakes.size(); i++) {
        if (canSortStack(flipTopOf(pancakes, i), numFlips  1, flipsMade)) {
            flipsMade.insert(0, i);
            return true;
        }
    }
    return false;
}    

Why we asked this question: We included this question as an example of a recursive backtracking problem that didn’t easily fit into any of the molds we’d covered so far (subsets, permutations, combinations, and partitions). Our hope was that you’d use the general decision tree framework to deter-mine what options were available at each step, then combine that with your knowledge of recursive backtracking to put together a solution that tried all options in search of one that would get the stack sorted. We also included this question to make sure you were familiar with container types that weren’t exercised in the previous parts of the exam. This question requires you to use the Stack type, and while it wasn’t necessary to do so, we hoped you’d recognize the connection to Queues as well.

Common mistakes: One of the most common mistakes we saw people make on this problem was writing a for loop like this one:

for (int i = 0; i < pancakes.size(); i++) {
        // do something that pops an element off the pancakes stack
    }

This loop won’t actually loop over all the pancakes in the stack. If you’re not sure why this is, check out Section Handout 2, which explores why this is. We also saw a number of solutions that had the right base cases, but checked them in the wrong order. This can lead the code to accidentally return false in cases where there is a valid series of flips. Finally, we saw a large number of solutions that produced the final Vector in the reverse of the order in which it should have been produced.

If you had trouble with this problem: There are essentially four parts to this problem: identifying the recursive insight (that at each step you have some number of pancakes that you need to flip), finding the right base cases (the stack being sorted and no flips being left), structuring the backtracking properly (trying out each option and returning failure only if none of them work), and properly manipulating the stack (finding a way to flip the top without messing up the stack from option to option.)

If you had trouble developing the recursive insight – namely, to flip each possible number of pancakes – we recommend getting more practice with the decision tree framework. Although many of the recursive problems we’ve talked about this quarter nicely fall into clean categories like subsets or permutations, the most general pattern we’ve explored for these sorts of problems is the decision tree model. If you’re still a little shaky on this idea, we’d recommend revisiting some of the recursive functions from earlier in the quarter to make sure that you have a clear sense of how decision trees factor in. This is something you can absolutely pick up with practice. Once everything clicks, problems like these become a lot easier to solve.

The base cases in this problem are definitely trickier than in the other recursive problems on this exam. One technique we’d recommend choosing a good base case is to look at the function you’re writing and make sure you understand, very specifically, what the function is supposed to return given its arguments. This function tries to see if it’s possible to get something into sorted order under a certain resource constraint (a number of moves). So what happens if you’re already at the goal (it’s sorted)? Or you’re out of resources (you have no flips left?) Look back at some of the other backtracking problems you’ve worked through this quarter and see if you can ask similar questions like these about them.

Finally, if you were a bit shaky on the stack manipulation logic, we’d recommend working through some extra practice problems on stacks and queues. For example, could you write a function to use stacks and queues to reverse a stack or reverse a queue? How about to turn the bottom half of a stack upside down?


13) PetsMatch

Suppose that you own a pet store and want to help customers find their ideal pet. To do so, you create a list of all the pets in the store, along with all of the adjectives which best describe them. For example, you might have this list of pets and their adjectives:

Ada           Happy, Loving, Fuzzy, Big, Tricksy
Babbage       Loving, Tiny, Energetic, Quiet
Lucy          Happy, Loving, Big
Larry         Happy, Fuzzy, Tiny, Tricksy
Abby          Loving, Big, Energetic
Luba          Happy, Loving, Tiny, Quiet, Tricksy
Mitzy         Fuzzy, Big, Energetic, Quiet

If a customer gives you a list of adjectives, you can then recommend to them every pet that has all of those adjectives. For example, given the adjectives “Happy” and “Tricksy,” you could recommend Ada, Larry, and Luba. Given the adjective “Fuzzy,” you could recommend Ada, Larry, and Mitzy. However, given the adjectives “Energetic,” “Quiet,” and “Fuzzy,” and “Tiny,” you could not recommend any pets at all.

Write a function:

Set<string> petsMatching(const Map<string, Set<string>>& adjectiveMap,
                         const Vector<string>& requirements);

that accepts as input a Map<string, Set<string>> associating each pet with the adjectives that best describe it, along with a customer's requested adjectives (represented by a Vector<string>), then returns a Set<string>holding all of the pets that match those adjectives.

There might not be any pets that match the requirements, in which case your function should return an empty set. You can assume that all adjectives have the same capitalization, so you don't need to worry about case-sensitivity. Also, if a client does not include any adjectives at all in their requirements, you should return a set containing all the pets, since it is true that each pet matches all zero requirements. Finally, your function should not modify any of the input parameters.

Here’s one possible solution:

/* Given a set of adjectives, does it meet all the requirements? */
bool hasAllRequirements(const Set<string>& adjectives,
                        const Vector<string>& requirements) {
    for (string req: requirements) {
        if (!adjectives.contains(req)) {
            return false;
        }
    }
    return true;
}

Set<string> petsMatching(const Map<string, Set<string>>& adjectiveMap,
                         const Vector<string>& requirements) {
    Set<string> result;
    for (string pet: adjectiveMap) {
        if (hasAllRequirements(adjectiveMap[pet], requirements)) {
            result += pet;
        }
    }
    return result;
}

Why we asked this question: This question was designed to get you playing around with different container and collection types (here, Vector, Map, and Set) and to work with nested containers. This problem is also significantly easier to solve if you write a helper function – the logic to handle everything when you don’t have a separate helper to check if all the requirements are met is tricky!


14) Shrinkable Words Revisited

Recall from lecture that a shrinkable word is a word that can be reduced down to a single letter by removing one letter at a time, at each step leaving a valid English word.

In lecture, we wrote a function isShrinkableWord that determined whether a given string was a shrinkable word. Initially, this function just returned true or false. This meant that if a word was indeed shrinkable, we would have to take on faith that the word could indeed be reduced down to a single letter.

To help explain why a word was shrinkable, our second version of the function additionally produced a Stack<string> showing the series of words one would go through while shrinking the word all the way down to a single letter. (In reality, our lecture example used a Vector<string> rather than a Stack<string>, but a Stack<string> is actually a better choice.) Let's call a sequence of this sort a shrinking sequence.

However, let's suppose that you're still skeptical that the Stack<string> produced by the function actually is a legal shrinking sequence for the word. To be more thoroughly convinced that a word is shrinkable, you decide to write a function that can check whether a given Stack<string> is indeed a shrinking sequence for a given word.

Write a function:

bool isShrinkingSequence(const string& word,    
                         const Lexicon& english,
                         Stack<string> path);   

that accepts as input a word, a Lexicon containing all words in English, and a Stack<string> containing an alleged shrinking sequence for that word, then returns whether the Stack<string> is indeed a legal shrinking sequence for that word. For example, given the word "pirate" and the stack

pirate
irate
rate
rat
at
a
_________

Your function would return true. However, given any of these stacks:

pirate
irate
rate
ate
te
a
_________
avast
vast
vat
at
a
_________
pirate
rate
at
_________
_________

Your function would return false (the first stack is not a shrinking sequence because "te" and "e" are not words; the second is a legal shrinking sequence, but not for the word "pirate"; the third is not a shrinking sequence because it skips words of length 1, 3, and 5; and the fourth is not a shrinking sequence because it contains no words at all).

You can assume that word and all the words in path consist solely of lower-case letters.

Feel free to tear out this page as a reference, and write your solution on the next page.

Here’s one possible solution, though we saw many others:

bool differsByOneLetter(const string& longer, const string& shorter) {
    for (int i = 0; i < longer.size(); i++) {
        if (longer.substr(0, i) + longer.substr(i + 1) == shorter) {
            return true;
        }
    }
    return false;
}

bool isShrinkingSequence(const string& word,
                         const Lexicon& words,
                         Stack<string> path) {
    /* Base Case 1: If the stack is empty, this can't be a shrinking sequence. */
    if (path.isEmpty()) {
        return false;
    }

    /* Base Case 2: If the word parameter isn't a word, the path can't possibly
     * be a shrinking path.
     */
    if (!words.contains(word)) {
        return false;
    }

    /* Get the top of the stack and confirm that it matches the word. */
    string top = path.pop();
    if (top != word) {
        return false;
    }

    /* If the word is just one character, the stack should be empty. */
    if (word.length() == 1) {
        return path.isEmpty();
    }

    /* If the stack is empty, then this can't be a shrinking path because we
     * would have returned in the previous step.
     */
    if (path.isEmpty()) {
        return false;
    }

    /* Confirm that the top word differs from the current word by exactly one
     * letter.  If not, this isn't a shrinking sequence.
     */
    if (!differsByOneLetter(word, path.peek()) {
        return false;
    }

    /* This is a shrinking sequence iff the remainder of the stack is a shrinking
     * sequence for the top of the stack.
     */
    return isShrinkingSequence(path.peek(), words, path);
}

Why we asked this question: This question was designed to get you engaging with a different set of containers (here, Stack and Lexicon) and to give you a chance to show off your problem-solving skills. There are a number of little details to check for, and we figured this would be a good way to let you demonstrate what you’ve learned about problem-solving in C++.

Common mistakes: This problem is a lot trickier than it initially appears and there are many different cases to check for. Common mistakes included not checking that the stack wasn't empty before peeking or popping it; forgetting to check that the words in the stack were English words; counting the empty string, which isn't a word, as a shrinkable word; incorrectly checking whether adjacent words in the stack differed by exactly one letter; or forgetting to check that the stack ends on a string of length one. A subtle but common error in this problem was writing a loop like this one:

    for (int i = 0; i < path.size(); i++) { // <-- Careful! Doesn’t work!
        string top = path.pop();
        /* … */
    }

This code appears to iterate over the contents of the stack, but has a slight error in it. Specifically, note that on each iteration, the value of i increases and the value of path.size() decreases, so the total number of iterations is roughly half what it should be.


15) Ddeeddoouubblliinngg

(This excellent problem by Eric Roberts.)

In the early part of the 20th century, there was considerable interest in both England and the United States in simplifying the rules used for spelling English words, which has always been a difficult proposition. One suggestion advanced as part of this movement was the removal of all doubled letters from words. If this were done, no one would have to remember that the name of the Stanford student union is spelled “Tresidder,” even though the incorrect spelling “Tressider” occurs at least as often. If doubled letters were banned, everyone could agree on “Tresider.”

Write a recursive function

string removeDoubledLetters(const string& str);

that takes a string as its argument and returns a new string with any consecutive substring consisting of repeated copies of the same letter replaced by a single copy letter of that letter. For example, if you call

removeDoubledLetters("tresidder")

your function should return the string "tresider". Similarly, if you call

removeDoubledLetters("bookkeeper")

your function should return "bokeper". And because your function compresses strings of multiple letters into a single copy, calling

removeDoubledLetters("zzz")

should return "z".

In writing your solution, you should keep the following points in mind:

  • Your function should not try to consider the case of the letters. For example, calling the function on the name "Lloyd" should return the argument unchanged because 'L' and 'l' are different letters.
  • Your function must be purely recursive and may not make use of any iterative constructs such as for or while.

There are many ways to solve this problem, but the one we think is the most elegant is shown here.

string removeDoubledLetters(const string& str) {
    /* Base case: Any string with one or fewer letters has no duplicates. */
    if (str.length() <= 1) {
        return str;
    }
    /* Recursive case: Either the first letter is doubled or it isn't. If it's
     * doubled, then the string we want is formed by dropping the first letter
     * and then deduplicating the rest of the string.
     */
    else if (str[0] == str[1]) {
        return removeDoubledLetters(str.substr(1));
    }
    /* If that letter isn't doubled, include it in the result and deduplicate the
     * remaining characters.
     */
    else {
        return str[0] + removeDoubledLetters(str.substr(1));
    }
}

Why we asked this question: This question is designed to get you thinking about the insights and problem-solving strategies that come up when thinking recursively. This isn’t a branching recursion problem, and there’s no need to pass down extra state through the chain. Really, it’s about seeing how the larger version of the problem reduces down to smaller versions.


16) A Question of Balance

Your job in this problem is to write a function

Set<string> balancedStringsOfLength(int n);

that accepts as input a nonnegative number n, then returns a Set<string> containing all strings of exactly n pairs of balanced parentheses.

As examples, here is the one string of one pair of balanced parentheses:

()

Here are the two strings of two pairs of balanced parentheses:

(())

()()

Here are all five strings of three pairs of balanced parentheses:

((()))

(()())

(())()

()(())

()()()

And here are the fourteen strings of four pairs of balanced parentheses:

(((())))

((()()))

((())())

(()(()))

(()()())

((()))()

(()())()

(())(())

(())()()

()((()))

()(()())

()(())()

()()(())

()()()()

As a hint, you might find the following observation useful: any string of n > 0 pairs of balanced parentheses can be split apart as follows:

( some-string-of-balanced-parentheses ) another-string-of-balanced-parentheses

You might find it useful to try splitting apart some of the above strings this way to see if you understand why this works.

Write your solution on the next page, and feel free to tear out this page as a reference.

The solution shown here follows the hint from the problem statement. The insight needed here is that if you have a string of n balanced parentheses, then either

  • n is zero, in which case your string is empty, or
  • n is nonzero. Look at the first open parenthesis and find its matching close parenthesis. What’s between them must be a pair of balanced parenthesis, and let’s suppose there are k pairs in that inner part. That means there are n – k – 1 pairs of parentheses that appear after the close parenthesis we identified.

With that in mind, we just need to assemble all the strings formed this way. Here’s one way to do this:

Set<string> balancedStringsOfLength(int n) {
    /* Base Case: There is exactly one balanced string with zero parentheses. */
    if (n == 0) {
        return { "" };
    }
    /* Recursive Case: Some number of parentheses go on the inside of the first
     * pair, and some go afterward. Try all combinations and see what we get.
     */
    else {
        Set<string> result;

        for (int k = 0; k < n; k++) {
            for (string inner: balancedStringsOfLength(k)) {
                for (string outer: balancedStringsOfLength(n - k - 1)) {
                    result += "(" + inner + ")" + outer;
                }
            }
        }
        return result;
    }
}

This initial version works, but it’s fairly slow because it recomputes balancedStringsOfLength multiple redundant times. A better approach is to compute each of the recursive calls exactly once and then aggregate things together at the end. Here’s an alternative version of the recursive step:

    else {
        /* Produce all balanced strings of shorter lengths. */
        Vector<Set<string>> balanced;
        for (int i = 0; i < n; i++) {
            balanced += balancedStringsOfLength(i);
        }

        /* Glue things together. */
        Set<string> result;
        for (int k = 0; k < n; k++) {
            for (string inner: balanced[k]) {
                for (string outer: balanced[n - k - 1]) {
                    result += "(" + inner + ")" + outer;
                }
            }
        }
        return result;
    }

Why we asked this question: This question was designed to get you playing around with a novel recursive insight and how that translates into code. The key insight you need here involves looking at strings of balanced parentheses as having a particular pattern, and from there the question is how to take that insight and use it to write a recursive solution.

The strategy used here is a bit different than what we’ve done so far. Rather than passing down a single solution through the recursion chain, we have our function instead return up to us all solutions to the simpler problem, then combine them together in different ways to assemble all of the overall solutions.