Section 2. ADTs and Big-O


Section materials curated by Trip Master, drawing upon materials from previous quarters.

This week’s section exercises explore ADT and Big-O problems. By the end of this week, you'll be well-versed in all kinds of ADTs, and you'll be able to think critically about program runtime. Runtime? More like fun-times, am I right?

Remember that every week we will also be releasing a Qt Creator project containing starter code and testing infrastructure for that week's section problems. When a problem name is followed by the name of a .cpp file, that means you can practice writing the code for that problem in the named file of the Qt Creator project. Here is the zip of the section starter code:

📦 Starter code

1) References Available Upon Request

Topic: Reference parameters, range-based for loops

Reference parameters are an important part of C++ programming, but can take some getting used to if you’re not familiar with them. Trace through the following code. What does it print?

void printVector(Vector<int>& values) {
    for (int elem: values) {
        cout << elem << " ";
    }
    cout << endl;
}

void maui(Vector<int> values) {
    for (int i = 0; i < values.size(); i++) {
        values[i] = 1258 * values[i] * (values[2] - values[0]);
    }
}

void moana(Vector<int>& values) {
    for (int elem: values) {
        elem *= 137;
    }
}

void heihei(Vector<int>& values) {
    for (int& elem: values) {
        elem++;
    }
}

Vector<int> teFiti(Vector<int>& values) {
    Vector<int> result;
    for (int elem: values) {
        result += (elem * 137);
    }
    return result;
}

int main() {
    Vector<int> values = { 1, 3, 7 };
    maui(values);
    printVector(values);
    moana(values);
    printVector(values);
    heihei(values);
    printVector(values);
    teFiti(values);
    printVector(values);
    return 0;
}

Here’s the output from the program:

1 3 7
1 3 7
2 4 8
2 4 8

Here’s a breakdown of where this comes from:

  • The maui function takes its argument by value, so it’s making changes to a copy of the original vector, not the vector itself. That means that the values are unchanged back in main.
  • The moana function uses a range-based for loop to access the elements of the vector. This makes a copy of each element of the vector, so the changes made in the loop only change the temporary copy and not the elements of the vector. That makes that the values are unchanged back in main.
  • heihei, on the other hand, uses int& as its type for the range-based for loop, so in a sense it’s really iterating over the elements of the underlying vector. Therefore, its changes stick.
  • The teFiti function creates and returns a new vector with a bunch of updated values, but the return value isn’t captured back in main.

2) SumNumbers (sum.cpp)

Topics: Vectors, strings, file reading

The function sumNumbers reads a text file and sums the numbers found within the text. Here are some library functions that will be useful for this task:

In particular you will be asked to write the following function

int sumNumbers(string filename)

When given the following file, named numbers.txt, as input, your function should return 42.

42 is the Answer to the Ultimate Question of Life, the Universe, and Everything
This is a negative number: -9
Welcome to CS106B!
I want to own 9 cats.
bool isNumber(string s)
{
    // strip negative sign off negative numbers
    if (s.length() > 0 && s[0] == '-') {
        s = s.substr(1);
    }
    for (char ch : s) {
        if (!isdigit(ch)) return false;
    }
    return s.length() > 0;
}

int sumNumbers(string filepath)
{
    ifstream in;
    Vector<string> lines;
    int sum = 0;

    if (!openFile(in, filepath)) {
        return 0;
    }

    readEntireFile(in, lines);
    for (string line : lines) {
        Vector<string> tokens = stringSplit(line, " ");
        for (string t : tokens) {
            if (isNumber(t)) {
                sum += stringToInteger(t);
            }
        }
    }
    return sum;
}

3) Debugging Deduplicating (deduplicate.cpp)

Topics: Vector, strings, debugging

Consider the following incorrect C++ function, which accepts as input a Vector<string> and tries to modify it by removing adjacent duplicate elements:

⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️

void deduplicate(Vector<string> vec) {
    for (int i = 0; i < vec.size(); i++) {
        if (vec[i] == vec[i + 1]) { 
            vec.remove(i);
        }
    }
}

⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️⚠️

The intent behind this function is that we could do something like this:

Vector<string> hiddenFigures = {
    "Katherine Johnson",
    "Katherine Johnson",
    "Katherine Johnson",
    "Mary Jackson",
    "Dorothy Vaughan",
    "Dorothy Vaughan"
};

deduplicate(hiddenFigures);
// hiddenFigures = ["Katherine Johnson", "Mary Jackson", "Dorothy Vaughan”]

The problem is that the above implementation of deduplicate does not work correctly. In particular, it contains three bugs. First, find these bugs by writing test cases that pinpoint potentially erroneous situations in which the provided code might fail, then explain what the problems are, and finally fix those errors in code.

There are three errors here:

  1. Calling .remove() on the Vector while iterating over it doesn’t work particularly nicely. Specifically, if you remove the element at index i and then increment i in the for loop, you’ll skip over the element that shifted into the position you were previously in.
  2. There’s an off-by-one error here: when i = vec.size() - 1, the indexing vec[i + 1] reads off the end of the Vector.
  3. The Vector is passed in by value, not by reference, so none of the changes made to it will persist to the caller.

Here’s a corrected version of the code:

void deduplicate(Vector<string>& vec) {
    for (int i = 0; i < vec.size() - 1; ) {
        if (vec[i] == vec[i + 1]) {
            vec.remove(i);
        } else {
            i++;
        }
    }
}

[Credit to Andrew Tierno for the alternate solution] Alternatively, you can also re-write the function to use a loop that traverses the vector from right-to-left, which is a common pattern when working with deleting items from linear collections. A solution that does so could look like this:

void deduplicate(Vector<string>& vec) {
    for (int i = vec.size() - 1; i > 0; i--) {
        if (vec[i] == vec[i - 1]) {
            vec.remove(i);
        }
    }
}

4) Mirror (mirror.cpp)

Topic: Grids

Write a function mirror that accepts a reference to a Grid of integers as a parameter and flips the grid along its diagonal. You may assume the grid is square; in other words, that it has the same number of rows as columns. For example, the grid below that comes first would be altered to give it the new grid state shown afterwards:

Original state: 
{ { 6, 1, 9, 4},                
  {-2, 5, 8, 12},                  
  {14, 39, -6, 18},             
  {21, 55, 73, -3} }               

Mirrored state: 
 { {6, -2, 14, 21},
   {1, 5, 39, 55},
   {9, 8, -6, 73},
   {4, 12, 18, -3} }

Bonus: How would you solve this problem if the grid were not square?

// solution
void mirror(Grid<int>& grid) {
    for (int r = 0;r < grid.numRows(); r++) {
        // start at r+1 rather than 0 to avoid double-swapping 
        for (int c = r + 1; c < grid.numCols(); c++) { 
            int temp = grid[r][c]; 
            grid[r][c] = grid[c][r]; 
            grid[c][r] = temp;
        } 
    }
}
// bonus 
void mirror(Grid<int>& grid) {
    Grid<int> result(grid.numCols(), grid.numRows());
    for (int r = 0; r < grid.numRows(); r++) {
        for (int c = 0; c < grid.numCols(); c++) {
            result[c][r] = grid[r][c];
        }
    }
    grid = result;
}

5) Check Balance (balance.cpp)

Topic: Stacks

Write a function named checkBalance that accepts a string of source code and uses a Stack to check whether the braces/parentheses are balanced. Every ( or { must be closed by a } or ) in the opposite order. Return the index at which an imbalance occurs, or -1 if the string is balanced. If any ( or { are never closed, return the string's length.

Here are some example calls:

//   index    0123456789012345678901234567
checkBalance("if (a(4) > 9) { foo(a(2)); }") 
// returns -1 (balanced)

//   index    01234567890123456789012345678901
checkBalance("for (i=0;i<a;(3};i++) { foo{); )")
// returns 15 because } is out of order

//   index    0123456789012345678901234
checkBalance("while (true) foo(); }{ ()")
// returns 20 because } doesn't match any {

//   index    01234567
checkBalance("if (x) {")
// returns 8 because { is never closed
int checkBalance(string code) {
    Stack<char> parens;
    for (int i = 0; i < (int) code.length(); i++) {
        char c = code[i];
        if (c == '(' || c == '{') {
        parens.push(c);
        } else if (c == ')' || c == '}') {
            if (parens.isEmpty()) {
                return i;
            }
            char top = parens.pop();
            if ((top == '(' && c != ')') || (top == '{' && c != '}')) {
                return i;
            }
        }
    }

    if (parens.isEmpty()) {
        return -1; // balanced
    } else {
        return code.length();
    }
}

6) Collection Mystery

Topics: Stacks and Queues

void collectionMystery(Stack<int>& s) 
{ 
    Queue<int> q;
    Stack<int> s2;

    while (!s.isEmpty()) {
       if (s.peek() % 2 == 0) {
            q.enqueue(s.pop()); 
        } else {
            s2.push(s.pop());
        }
    }
    while (!q.isEmpty()) {
        s.push(q.dequeue()); 
    }
    while(!s2.isEmpty()) { 
        s.push(s2.pop());
    }
    cout<< s << endl;
}

Write the output produced by the above function when passed each of the following stacks. Note that stacks and queues are written in ​front to back order, with the oldest element on the left side of the queue/stack.

Stacks:

{1, 2, 3, 4, 5, 6}                ________________________________________
{42, 3, 12, 15, 9, 71, 88}        ________________________________________
{65, 30, 10, 20, 45, 55, 6, 1}    ________________________________________

{6, 4, 2, 1, 3, 5}
{88, 12, 42, 3, 15, 9, 7}
{6, 20, 10, 30, 65, 45, 55, 1}

7) Friend List (friendlist.cpp)

Topic: Maps

Write a function named friendList that takes in a file name, reads friend relationships from the file, and writes them to a Map. friendList should return the populated Map. Friendships are bi-directional, so if Abby is friends with Barney, Barney is friends with Abby. The file contains one friend relationship per line, with names separated by a single space. You do not have to worry about malformed entries.

If an input file named buddies.txt looked like this:

Barney Abby
Abby Clyde

Then the call of friendList("buddies.txt") should return a resulting map that looks like this:

{"Abby":{"Barney", "Clyde"}, "Barney":{"Abby"}, "Clyde":{"Abby"}}

Here is the function prototype you should implement:

Map<string, Vector<string> > friendList(String filename)

Map<string, Vector<string> > friendList(string filename) {
    ifstream in;
    Vector<string> lines;

    if (openFile(in, filepath)) {
        readEntireFile(in, lines);
    }

    Map<string, Vector<string> > friends;
    for (string line: lines) {
        Vector<string> people = stringSplit(line, " ");
        string s1 = people[0];
        string s2 = people[1];
        friends[s1] += s2;
        friends[s2] += s1;
    }
    return friends;
}

8) Twice (twice.cpp)

Topic: Sets

Write a function named twice that takes a vector of integers and returns a set containing all the numbers in the vector that appear exactly twice.

Example: passing {1, 3, 1, 4, 3, 7, -2, 0, 7, -2, -2, 1} returns {3, 7}.

Bonus: do the same thing, but you are not allowed to declare any kind of data structure other than sets.

// solution
Set<int> twice(Vector<int>& v) {
    Map<int, int> counts;
    for (int i : v) {
        counts[i]++;
    }
    Set<int> twice;
    for (int i : counts) {
        if (counts[i] == 2) {
            twice += i;
        }
    }
    return twice;
}

// bonus
Set<int> twice(Vector<int>& v) {
    Set<int> once;
    Set<int> twice;
    Set<int> more;
    for (int i : v) {
        if (once.contains(i)) {
            once.remove(i);
            twice.add(i);
        } else if (twice.contains(i)) {
            twice.remove(i);
            more.add(i);
        } else if (!more.contains(i)) {
            once.add(i);
        }
    }
    return twice;
}

9) Reversing a Map (reverse.cpp)

Topic: Nested data structures

Write a function

Map<int, Set<string>> reverseMap(Map<string, int>& map)

that, given a Map<string, int> that associates string values with integers, produces a Map<int, Set<string>> that’s essentially the reverse mapping, associating each integer value with the set of strings that map to it. (This is an old job interview question from 2010.)

Here’s one possible implementation.

Map<int, Set<string>> reverseMap(Map<string, int>& map) {
    Map<int, Set<string>> result;
    for (string oldKey : map) {
        if (!result.containsKey(map[oldKey])) {
            result[map[oldKey]] = {};
        }
        result[map[oldKey]].add(oldKey);
    }
    return result;
}

10) Oh No, Big-O, Too Slow

Topics: Big-O, code analysis

Give a tight bound of the nearest runtime complexity class for each of the following code fragments in Big-Oh notation, in terms of the variable N.

Code Snippet A

int sum = 0;
for (int i = 1; i <= N + 2; i++) {
    sum++;
}
for (int j = 1; j <= N * 2; j++) {
    sum++;
}
cout << sum << endl;

Code Snippet B

int sum = 0;
for (int i = 1; i <= N - 5; i++) {
    for (int j = 1; j <= N - 5; j += 2) {
        sum++;
    }
}
cout << sum << endl;

Code Snippet C

int sum = 0;
for (int i = 0; i < 1000000; i++) {
    for (int j = 1; j <= i; j++) {
        sum += N;
    }
    for (int j = 1; j <= i; j++) {
        sum += N;
    }
    for (int j = 1; j <= i; j++) {
        sum += N;
    }
}
cout << sum << endl;
Code Snippet A has a runtime complexity of O(N).
Code Snippet B has a runtime complexity of O(N^2).
Code Snippet C has a runtime complexity of O(1).

11) More Big O

Below are five functions. Determine the big-O runtime of each of those pieces of code, in terms of the variable n.

void function1(int n) {
    for (int i = 0; i < n; i++) {
        cout << '*' << endl;
    }
}

void function2(int n) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            cout << '*' << endl;
        }
    }
}

void function3(int n) {
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            cout << '*' << endl;
        }
    }
}

void function4(int n) {
    for (int i = 1; i <= n; i *= 2) {
        cout << '*' << endl;
    }
}

Finally, what is the big-O runtime of this function in terms of n, the number of elements in v?

int squigglebah(Vector<int>& v) {
    int result = 0;
    for (int i = 0; i < v.size(); i++) {
        Vector<int> values = v.subList(0, i);
        for (int j = 0; j < values.size(); j++) {
            result += values[j];
        }
    }
    return result;
}
  1. The runtime of this code is O(n): We print out a single star, which takes time O(1), a total of n times.
  2. The runtime of this code is O(n^2). The inner loop does O(n) work, and it runs O(n) times for a net total of O(n^2) work.
  3. This one also does O(n^2) work. To see this, note that the first iteration of the inner loop runs for n 1 iterations, the next for n 2 iterations, then n 3 iterations, etc. Adding all this work up across all iterations gives (n 1) + (n 2) + + 3 + 2 + 1 + 0 = O(n^2).
  4. This one runs in time O(log n). To see why this is, note that after k iterations of the inner loop, the value of i is equal to 2^k. The loop stops running when 2^k exceeds n. If we set 2^k = n, we see that the loop must stop running after k = log n steps.

    Another intuition for this one: the value of i doubles on each iteration, and you can only double O(log n) times before you overtake the value n.

For the final function, Let’s follow the useful maxim of "when in doubt, work inside out!"" The innermost for loop (the one counting with j) does work proportional to the size of the values list, and the values list has size equal to i on each iteration. Therefore, we can simplify this code down to something that looks like this:

int squigglebah(Vector<int>& v) {
    int result = 0;
    for (int i = 0; i < v.size(); i++) {
        Vector<int> values = v.subList(0, i);
        do O(i) work;
    }
    return result;
}

Now, how much work does it take to create the values vector? We’re copying a total of i elements from v, and so the work done will be proportional to i. That gives us this:

int squigglebah(Vector<int>& v) {
    int result = 0;
    for (int i = 0; i < v.size(); i++) {
        do O(i) work;
        do O(i) work;
    }
    return result;
}

Remember that doing O(i) work twice takes time O(i), since big-O ignores constant factors. We’re now left with this:

int squigglebah(Vector<int>& v) {
    int result = 0;
    for (int i = 0; i < v.size(); i++) {
        do O(i) work;
    }
    return result;
}

This is the same pattern as function2 in the previous problem, and it works out to O(n^2) total time.