Practice Diagnostic

This practice diagnostic has been specially developed for this quarter to give students a good model of the scope of the diagnostic and the content which will be covered. The real diagnostic will be very similar in structure (four problems, one about interpreting provided code and three about writing code in the areas of C++ Fundamentals, ADTs, and Recursion) but will obviously vary in specific content that is covered.

We have provided an approximate point breakdown for each problem with the exam totalling 45 points in all. We expect the exam to take roughly an hour to an hour and a half to complete, and you can use this point breakdown to help guide your time management on the diagnostic. The extra credit bonus problem at the end will be worth up to an additional 5 points, and the time to complete the bonus problem has not been factored into the estimated completion time.

The Bluebook file containing these questions can be downloaded using this link. If this link takes you to a page full of jibberish but does not download the file, just right-click somewhere on the page and select the "Save As…" option to save the contents of the page as a file on your computer.

Here is the note on exam logistics that would normally comprise the first page of a physical exam.

Intro Announcement

You are welcome to consult any non-human sources you would like when completing the diagnostic. You are not allowed to communicate with any other humans during the exam except the course staff. This means, for example, that it is perfectly permissible to visit the course website, to open Qt Creator and run code, and to Google search for anything you'd like. However, you are not allowed to communicate with other humans in any way during the exam, nor are you permitted to post the questions anywhere where other humans could see them or offer advice or suggestions on them (for example, by posting on a public Q&A website or emailing questions to other people). We will run your diagnostic answers through similarity detection software to ensure you have not received impermissible help on the assessment.

Unless otherwise indicated as part of the instructions for a specific problem, comments will not be required on the exam. Uncommented code that gets the job done will be sufficient for full credit on the problem. On the other hand, comments may help you get partial credit if they help us determine what you were trying to do. You do not need to worry about efficiency unless a problem states otherwise. You don’t need to prototype helper functions you write, and you don’t need to explicitly #include libraries you want to use.

This exam is “self-contained” in the sense that if you’re expected to reference code from the lectures, textbook, assignments, or section handouts, we’ll explicitly mention what that code is and provide sufficient context for you to use it.

Problem 1: Interpreting Code (10 points)

Part A: Analyzing Code Output and Runtime (7 points)

This question is designed to let you demonstrate what you’ve learned about algorithmic analysis and data structure design. Below are four functions. We picked one of those functions and ran it on many different values of n. We captured the output of that function on each of those inputs, along with the runtime. That information is printed in the table below.

Function 1

int function1(int n) { 
    Stack<int> values; 
    for (int i = 0; i < n; i++) { 
        values.push(i); 
    }
    int result; 
    while (!values.isEmpty()) { 
        result = values.pop(); 
    } 
    return result; 
} 

Function 2

int function2(int n) {
    Queue<int> values;
    for (int i = 0; i < n; i++) { 
        values.enqueue(i); 
    } 
    int result; 
    while (!values.isEmpty()) { 
        result = values.dequeue(); 
    } 
    return result; 
} 

Function 3

int function3(int n) { 
    Set<int> values; 
    for (int i = 0; i < n; i++) { 
        values.add(i); 
    } 
    int result;
    for (int value: values) { 
        result = value; 
    }
    return result; 
} 

Function 4

int function4(int n) { 
    Vector<int> values; 
    for (int i = 0; i < n; i++) { 
        values.add(i); 
    }
    int result; 
    while (!values.isEmpty()) { 
        result = values[0]; 
        values.remove(0); 
    }
    return result; 
} 

`n`	Time	Return Value
100,000	0.137s	99999
200,000	0.274s	199999
300,000	0.511s	299999
400,000	0.549s	399999
500,000	0.786s	499999
600,000	0.923s	599999
700,000	0.960s	699999
800,000	1.198s	799999
900,000	1.335s	899999
1,000,000	1.472s	999999

For each of these pieces of code, tell us its big-O runtime as a function of n. No justification is required. (4 points)
For each of these pieces of code, tell us whether that function could have given rise to the return values reported in the rightmost column of the table below. No justification is required. (1 point)
Which piece of code did we run? How do you know? Justify your answer in at most fifty words. (2 points)

Solution

The runtimes are as follows:
- function1 runs in time O(n).
- function2 runs in time O(n).
- function3 runs in time O(n log n).
- function4 runs in time O(n^2).

Some explanations: In function1, we do n pushes followed by n pops. The cost of each stack operation is O(1), so this means we’re doing 2n operations at an effective cost of O(1) each, for a net total of O(n). The same is true about queue operations: each one takes time O(1), which is why function2 takes time O(n) as well.

For function3, inserting an element into a set takes time O(log n). This means that the cost of inserting n elements is O(n log n). The cost of iterating over the set is also O(n) so the net runtime is O(n log n).

For function4, adding n elements to the end of a vector takes time O(n). However, removing from the front of a vector with n elements also takes time O(n), since we have to shift all the other elements back one position. This means that the overall runtime is O(n^2).

The answers: function1 cannot produce the given output. function2 will always produce this output. function3 will always produce this output. function4 will always produce this output.

Some explanations: In function1, since elements are stored in a stack, the last element popped is the first element pushed, which would always be zero. Therefore, we’d expect to see a column of zeros in the table, which doesn’t match what’s actually there.

In function2, the last element removed from the queue is the last element added to the queue, which, here, would be n – 1, matching the output.

In function3, the Set type stores its elements in sorted order. Iterating over the set, therefore, will visit the elements in ascending order, so the last element iterated over by the loop would be n – 1, matching the output.

Finally, in function4, we remove elements from the vector in the reverse order in which they’re added, matching the queue’s ordering and making the last element visited exactly n – 1.

First, notice that the runtime appears to be O(n); doubling the size of the inputs roughly doubles the runtime. That leaves function1 and function2 as choices, and function1 has the wrong return value. Therefore, we must have run function2.

Part B: Tracing a Recursive Function (3 points, 1 per part)

For each of the calls to the following recursive function below, indicate what output is produced:

void recursionMystery2(int n) { 
    if (n <= 1) { 
        cout << ": ";
    } else {
        cout << (n % 2) << " "; 
        recursionMystery2(n / 2); 
        cout << n << " "; 
    } 
}

Call	Output
a) `recursionMystery2(8);`
b) `recursionMystery2(25);`
c) `recursionMystery2(46);`

Solution

Call	Output
a) `recursionMystery2(8);`	`0 0 0 : 2 4 8`
b) `recursionMystery2(25);`	`1 0 0 1 : 3 6 12 25`
c) `recursionMystery2(46);`	`0 1 1 1 0 : 2 5 11 23 46`

Problem 2: C++ Fundamentals (10 points)

Write a function named printBox that accepts two parameters: a Vector<string> holding the lines of a file, and an int for a width. Your function should go through the lines and display them to the console with a 'box' border of ‘#’ characters around them on all four sides. The second parameter to your function indicates the total width of the box including its border. You must also convert each line to "title case" by capitalizing the first letter of the line and lowercasing all subsequent letters. For example, suppose the file poem.txt contains the following text:

Your skin like dawn
Mine like musk
 
One paints the beginning
of a certain end.
 
The other, the end of a
sure beginning.

-Maya Angelou

Notice that the file might contain blank lines, which will be represented by the empty string. The passed in Vector of lines would look like this:

{“Your skin like dawn”, “Mine like musk”, “”, “One paints the beginning”, “of a certain end”, “”, “The other, the end of a”, “sure beginning.”, “”, “-Maya Angelou”}

The following would be the outputs from two calls to your function:

printBox(lines, 26)

########################## 
#Your skin like dawn     # 
#Mine like musk          #
#                        # 
#One paints the beginning#
#Of a certain end.       #
#                        # 
#The other, the end of a #
#Sure beginning.         #
#                        # 
#-maya angelou           #
##########################

printBox(lines, 35)

################################### 
#Your skin like dawn              # 
#Mine like musk                   #
#                                 # 
#One paints the beginning         #
#Of a certain end.                #
#                                 # 
#The other, the end of a          #
#Sure beginning.                  #
#                                 # 
#-maya angelou                    #
###################################

You may assume that the width value passed is non-negative and that no line in the file is too wide to fit into a box of that width.

Fill in the function below:

void printBox(Vector<string> lines, int width) {
     /* FILL ME IN */
}

Solution

void printPoundLine(int width) {
    for (int i = 0; i < width; i++) {
        cout << "#";
    }
    cout << endl;
}

void printBox(Vector<string> lines, int width) {
    printPoundLine(width);
    for (string line: lines) {
        string output = "#"; 
        if (line != ""){
            output = output + toUpperCase(line.substr(0, 1)) + toLowerCase(line.substr(1));
        }
        while (output.length() < width - 1) {
            output += " ";
        }
        cout << output << "#" << endl;
    }
    printPoundLine(width);
}

Problem 3: ADTs (15 points)

Google Translate is powered, in large part, by a technique called word2vec that builds an understanding of a language by looking at the context in which each word occurs.

Imagine you have a word like “well.” There are certain words that might reasonably appear immediately before the word “well” in a sentence, like “feeling,” “going,” “reads,” etc., and some words that that are highly unlikely to appear before “well,” like “cake,” “circumspection,” and “election.” The idea behind word2vec is to find connections between words by looking for pairs of words that have similar sets of words preceding them. Those words likely have some kind of connection between them, and the rest of the logic in word2vec works by trying to discover what those connections are.

Your task is to write a function

Map<string, Set<string>> predecessorMap(Vector<string>& input);

that takes as input a Vector of strings representing the lines of a file, then returns a Map<string, Set<string>> that associates each word in the file with all the words that appeared directly before that word. For example, given a Vector containing the lines of JFK’s famous quote:

“Ask not what your country can do for you;
ask what you can do for your country,”

that looks like this

{"“Ask not what your country can do for you;", "ask what you can do for your country,”"}

your function should return a Map with these key/value pairs:

"not" : { "ask" } 
"what" : { "not", "ask" } 
"your" : { "what", "for" } 
"country" : { "your" } 
"can" : { "country", "you" } 
"do" : { "can" } 
"for" : { "do" } 
"you" : { "for", "what" }
"ask" : { "you" }

Notice that although the word “ask” appears twice in the quote, the first time it appears it’s the first word in the file and so nothing precedes it. The second time it appears, it’s preceded by a word from a different line, but before that is the word “you,” which is what ultimately appears in the output map.

Some notes on this problem:

You can assume you have access to a function

Vector<string> tokenize(string line);

that takes in a line, accurately tokenizes it into cleaned words (stripped of all punctuation, but with the original casing retained), and returns all of the words in the line as a Vector of strings.

If a word in a line contains no letters, it gets stripped to the empty string by the aformentioned tokenize function. Your map-building process should completely ignore these tokens.
Your code should be case-insensitive, so it should return the same result regardless of the capitalization of the words in the file. The capitalization of the keys in the map is completely up to you.
It is not guaranteed that the file has any words in it, and there’s no upper bound on how big the file can be.

Fill in the function below:

Vector<string> tokenize(string line); // Provided to you

Map<string, Set<string>> predecessorMap(Vector<string>& input) {
    /* FILL ME IN */
}

Solution

Map<string, Set<string>> predecessorMap(Vector<string>& input) {
    Map<string, Set<string>> result;
    string lastWord;  // Most-recently-read string, initially empty as a sentinel.
    
    /* Loop over all lines */
    for (string line: input) {
        Vector<string> tokens = tokenize(line);
        for (string token: tokens) {
            if (token != "") {
                token = toLowerCase(token);
                if (lastWord != "") {
                    result[token].add(lastWord);
                }
                lastWord = token;
            }
        }
    }
    return result;
}

Problem 4: Recursion (10 points)

Note: This problem assumes no prior knowledge of Karel the Robot.

Many of you may have encountered Karel the Robot before, but for those of you that haven't Karel is a robot that lives in a world composed of horizontal streets and vertical avenues laid out in a regular rectangular grid that looks like this:

karel in its world

As pictured, Karel is sitting on the intersection of 2nd Street and 3rd Avenue and wants to get back to the origin at 1st Street and 1st Avenue taking the fewest number of steps. There are three routes tied for the shortest possible path, and these are marked in the diagram below (red, green, and blue):

multiple paths Karel can take home

Write the recursive function

int countPaths(int street, int avenue)

that returns the number of paths Karel could take back to the origin from the specified starting position, subject to the condition that Karel always wants to take one of the shortest possible routes and can therefore only move west or south (i.e., left or down in the diagram) in each step.

Solution

int countPaths(int street, int avenue)
{
    // base case: if at western/southern wall, only one path to home from there
    if (street == 1 || avenue == 1) {
        return 1; 
    }
    // recursive case: move either south or west
    return countPaths(street - 1, avenue) + countPaths(street, avenue - 1);
}

Problem 5: Recursive Backtracking (Bonus, Extra Credit, 5 points)

Write a recursive function named divide that uses backtracking to divide the letters from word w into two words s1 and s2. Each letter from w must be used exactly once either in s1 or s2, and the relative order of the letters must be preserved, meaning that s1 and s2 must both be subsequences of w.

For example, the word friendly can be divided into find + rely. This solution is valid because s1 and s2 together contain all of the letters from w, the letters in s1 and s2 appear in the same relative order as they did in w, and both s1 and s2 are valid words in the English dictionary.

For some words, no valid division is possible. For example, consider dividing the word stream. It would not be valid to divide into star + me (relative order of letters not preserved), nor stem + ram (reuses 'm'), nor stem + a ('r' not used), nor seam + tr (not in the dictionary).

You will be passed two parameters: a reference to a Lexicon of dictionary words, and the string w to divide. Use the Lexicon passed to ensure that the divided words are both found in the dictionary. If the function finds a valid division, it prints the two words and searches no further. If no valid division is found, the function prints nothing.

The following table lists some example calls to your function and possible console output, given a Lexicon called dict. Rows showing blank output indicate calls that were unable to find any successful division of the given word.

Call	Output
`divide(dict, "friendly")`	`find + rely`
`divide(dict, "standing")`	`sting + and`
`divide(dict, "midterm")`	`mid + term`
`divide(dict, "recuperate")`	`cure + repeat`
`divide(dict, "stream")`
`divide(dict, "atomic")`
`divide(dict, "czar")`
`divide(dict, "xyz")`

If there are multiple valid divisions for word w, you may print any of them. Regardless of which division you print, your code must stop after finding a single solution; do not find/print all possible solutions.

Efficiency: While your code is not required to have any particular Big-Oh, you may lose points if your code is extremely inefficient or repeatedly re-explores the same path multiple times. In particular, your recursive exploration should not continue exploring a path where the prefix of the word you are examining cannot possibly make a real English word. Use the Lexicon to help you prune such unwanted searches. Pass all data structures by reference to avoid making copies.

Helpers: As with any coding problem on this exam, you may write helper functions as needed to help solve the problem.

Constraints: For full credit, obey the following restrictions. A solution that disobeys them can get partial credit.

Do not declare any global variables.
Your overall algorithm must be recursive and must use backtracking techniques to generate the results.

Solution

void divide(Lexicon& lex, string input) {
    divideHelper(lex, input, "", "");
}

bool divideHelper(Lexicon& lex, string input, string left, string right) {
    if (input.empty()) {
        if (lex.contains(left) && lex.contains(right)) {
            cout << left << " + " << right << endl;
            return true;
        } else {
            return false;
        }
    } else if (!lex.containsPrefix(left) || !lex.containsPrefix(right)) {
        return false;
    } else {
        char ch = input[0];
        string rest = input.substr(1);
        if (divideHelper(lex, rest, left + ch, right)
                || divideHelper(lex, rest, left, right + ch)) {
            return true;
        }
    }
    return false;
}