Here is the note on logistics that would normally comprise the first page of the actual exam.
Intro Announcement
You may consult inanimate resources during the diagnostic but are prohibited from communicating with any other humans except the course staff. This means, for example, that it is permissible to look in the textbook, visit the course website, to open Qt Creator and run code, and review your own code. However, you are not allowed to discuss with others through any channel, nor are you permitted to post the questions anywhere where other humans could see them or offer advice or suggestions (for example, by posting on a public Q&A website or emailing questions to other people). Note: all diagnostic submissions will be examined for similarity using the same tool we apply to assignments.
Unless otherwise indicated as part of the instructions for a specific problem, comments will not be required. Uncommented code that gets the job done will be sufficient for full credit. On the other hand, comments may help you get partial credit if they help us determine what you were trying to do. You do not need to worry about efficiency unless a problem states otherwise. You donât need to prototype helper functions you write, and you donât need to explicitly #include libraries you want to use.
This exam is "self-contained" in the sense that if youâre expected to reference code from the lectures, textbook, assignments, or section handouts, weâll explicitly mention what that code is and provide sufficient context for you to use it.
Problem 1: Interpreting Code
Part A: Analyzing Code Output and Runtime
This question is designed to let you demonstrate what youâve learned about algorithmic analysis and data structure design. Below are four functions. We picked one of those functions and ran it on many different values of n. We captured the output of that function on each of those inputs, along with the runtime. That information is printed in the table below.
Function 1
int function1(int n) {
Stack<int> values;
for (int i = 0; i < n; i++) {
values.push(i);
}
int result;
while (!values.isEmpty()) {
result = values.pop();
}
return result;
}
Function 2
int function2(int n) {
Queue<int> values;
for (int i = 0; i < n; i++) {
values.enqueue(i);
}
int result;
while (!values.isEmpty()) {
result = values.dequeue();
}
return result;
}
Function 3
int function3(int n) {
Set<int> values;
for (int i = 0; i < n; i++) {
values.add(i);
}
int result;
for (int value: values) {
result = value;
}
return result;
}
Function 4
int function4(int n) {
Vector<int> values;
for (int i = 0; i < n; i++) {
values.add(i);
}
int result;
while (!values.isEmpty()) {
result = values[0];
values.remove(0);
}
return result;
}
n |
Time | Return Value |
|---|---|---|
| 100,000 | 0.137s | 99999 |
| 200,000 | 0.274s | 199999 |
| 300,000 | 0.511s | 299999 |
| 400,000 | 0.549s | 399999 |
| 500,000 | 0.786s | 499999 |
| 600,000 | 0.923s | 599999 |
| 700,000 | 0.960s | 699999 |
| 800,000 | 1.198s | 799999 |
| 900,000 | 1.335s | 899999 |
| 1,000,000 | 1.472s | 999999 |
- For each of these pieces of code, tell us its big-O runtime as a function of
n. No justification is required. - For each of these pieces of code, tell us whether that function could have given rise to the return values reported in the rightmost column of the table below. No justification is required.
- Which piece of code did we run? How do you know? Justify your answer in at most fifty words.
- The runtimes are as follows:
function1runs in timeO(n).function2runs in timeO(n).function3runs in timeO(n log n).function4runs in timeO(n^2).
Some explanations:
In function1, we do n pushes followed by n pops. The cost of each stack operation is O(1), so this means weâre doing 2n operations at an effective cost of O(1) each, for a net total of O(n). The same is true about queue operations: each one takes time O(1), which is why function2 takes time O(n) as well.
For function3, inserting an element into a set takes time O(log n). This means that the cost of inserting n elements is O(n log n). The cost of iterating over the set is also O(n) so the net runtime is O(n log n).
For function4, adding n elements to the end of a vector takes time O(n). However, removing from the front of a vector with n elements also takes time O(n), since we have to shift all the other elements back one position. This means that the overall runtime is O(n^2).
- The answers:
function1cannot produce the given output.function2will always produce this output.function3will always produce this output.function4will always produce this output.
Some explanations:
In function1, since elements are stored in a stack, the last element popped is the first element pushed, which would always be zero. Therefore, weâd expect to see a column of zeros in the table, which doesnât match whatâs actually there.
In function2, the last element removed from the queue is the last element added to the queue, which, here, would be n â 1, matching the output.
In function3, the Set type stores its elements in sorted order. Iterating over the set, therefore, will visit the elements in ascending order, so the last element iterated over by the loop would be n â 1, matching the output.
Finally, in function4, we remove elements from the vector in the reverse order in which theyâre added, matching the queueâs ordering and making the last element visited exactly n â 1.
- First, notice that the runtime appears to be O(n); doubling the size of the inputs roughly doubles the runtime. That leaves
function1andfunction2as choices, andfunction1has the wrong return value. Therefore, we must have runfunction2.
Part B: Tracing a Recursive Function
For each of the calls to the following recursive function below, indicate what output is produced:
void recursionMystery2(int n) {
if (n <= 1) {
cout << ": ";
} else {
cout << (n % 2) << " ";
recursionMystery2(n / 2);
cout << n << " ";
}
}
| Call | Output |
|---|---|
a) recursionMystery2(8); |
 |
b) recursionMystery2(25); |
 |
c) recursionMystery2(46); |
 |
| Call | Output |
|---|---|
a) recursionMystery2(8); |
0 0 0 : 2 4 8 |
b) recursionMystery2(25); |
1 0 0 1 : 3 6 12 25 |
c) recursionMystery2(46); |
0 1 1 1 0 : 2 5 11 23 46 |
Problem 2: C++ Fundamentals
Write a function named printBox that accepts two parameters: a Vector<string> holding the lines of a file, and an int for a width. Your function should go through the lines and display them to the console with a 'box' border of â#â characters around them on all four sides. The second parameter to your function indicates the total width of the box including its border. You must also convert each line to "title case" by capitalizing the first letter of the line and lowercasing all subsequent letters. For example, suppose the file poem.txt contains the following text:
Your skin like dawn
Mine like musk
One paints the beginning
of a certain end.
The other, the end of a
sure beginning.
-Maya Angelou
Notice that the file might contain blank lines, which will be represented by the empty string. The passed in Vector of lines would look like this:
{"Your skin like dawn", "Mine like musk", "", "One paints the beginning", "of a certain end", "", "The other, the end of a", "sure beginning.", "", "-Maya Angelou"}
The following would be the outputs from two calls to your function:
printBox(lines, 26)
##########################
#Your skin like dawn #
#Mine like musk #
# #
#One paints the beginning#
#Of a certain end. #
# #
#The other, the end of a #
#Sure beginning. #
# #
#-maya angelou #
##########################
printBox(lines, 35)
###################################
#Your skin like dawn #
#Mine like musk #
# #
#One paints the beginning #
#Of a certain end. #
# #
#The other, the end of a #
#Sure beginning. #
# #
#-maya angelou #
###################################
You may assume that the width value passed is non-negative and that no line in the file is too wide to fit into a box of that width.
Fill in the function below:
void printBox(Vector<string> lines, int width) {
/* FILL ME IN */
}
void printPoundLine(int width) {
for (int i = 0; i < width; i++) {
cout << "#";
}
cout << endl;
}
void printBox(Vector<string> lines, int width) {
printPoundLine(width);
for (string line: lines) {
string output = "#";
if (line != ""){
output = output + toUpperCase(line.substr(0, 1)) + toLowerCase(line.substr(1));
}
while (output.length() < width - 1) {
output += " ";
}
cout << output << "#" << endl;
}
printPoundLine(width);
}
Problem 3: ADTs
Google Translate is powered, in large part, by a technique called word2vec that builds an understanding of a language by looking at the context in which each word occurs.
Imagine you have a word like "well." There are certain words that might reasonably appear immediately before the word "well" in a sentence, like "feeling," "going," "reads," etc., and some words that that are highly unlikely to appear before "well," like "cake," "circumspection," and "election." The idea behind word2vec is to find connections between words by looking for pairs of words that have similar sets of words preceding them. Those words likely have some kind of connection between them, and the rest of the logic in word2vec works by trying to discover what those connections are.
Your task is to write a function
Map<string, Set<string>> predecessorMap(Vector<string>& input);
that takes as input a Vector of strings representing the lines of a file, then returns a Map<string, Set<string>> that associates each word in the file with all the words that appeared directly before that word. For example, given a Vector containing the lines of JFKâs famous quote:
"Ask not what your country can do for you;
ask what you can do for your country,"
that looks like this
{""Ask not what your country can do for you;", "ask what you can do for your country,""}
your function should return a Map with these key/value pairs:
"not" : { "ask" }
"what" : { "not", "ask" }
"your" : { "what", "for" }
"country" : { "your" }
"can" : { "country", "you" }
"do" : { "can" }
"for" : { "do" }
"you" : { "for", "what" }
"ask" : { "you" }
Notice that although the word "ask" appears twice in the quote, the first time it appears itâs the first word in the file and so nothing precedes it. The second time it appears, itâs preceded by a word from a different line, but before that is the word "you," which is what ultimately appears in the output map.
Some notes on this problem:
- You can assume you have access to a function
Vector<string> tokenize(string line);
that takes in a line, accurately tokenizes it into cleaned words (stripped of all punctuation, but with the original casing retained), and returns all of the words in the line as a Vector of strings.
- If a word in a line contains no letters, it gets stripped to the empty string by the aformentioned
tokenizefunction. Your map-building process should completely ignore these tokens. - Your code should be case-insensitive, so it should return the same result regardless of the capitalization of the words in the file. The capitalization of the keys in the map is completely up to you.
- It is not guaranteed that the file has any words in it, and thereâs no upper bound on how big the file can be.
Fill in the function below:
Vector<string> tokenize(string line); // Provided to you
Map<string, Set<string>> predecessorMap(Vector<string>& input) {
/* FILL ME IN */
}
Map<string, Set<string>> predecessorMap(Vector<string>& input) {
Map<string, Set<string>> result;
string lastWord; // Most-recently-read string, initially empty as a sentinel.
/* Loop over all lines */
for (string line: input) {
Vector<string> tokens = tokenize(line);
for (string token: tokens) {
if (token != "") {
token = toLowerCase(token);
if (lastWord != "") {
result[token].add(lastWord);
}
lastWord = token;
}
}
}
return result;
}
Problem 4: Recursion
Chapter exercise 7.7 from the textbook
Write a recursive function digitSum(n) that takes a nonnegative integer and
returns the sum of its digits. For example, calling digitSum(1729) should
return 1 + 7 + 2 + 9, which is 19.
The recursive implementation of digitSum depends on the fact that it is
very easy to break an integer down into two components using division by 10.
For example, given the integer 1729, you can divide it into two pieces as
follows:
1729
/ \
127 9
Each of the resulting integers is strictly smaller than the original and thus represents a simpler case.
int digitSum(int n)
{
if (n < 10) {
return n;
} else {
return digitSum(n / 10) + digitSum(n % 10);
}
}
Problem 5: Recursive Backtracking
Write a recursive function named divide that uses backtracking to divide the letters from word w into two words s1
and s2. Each letter from w must be used exactly once either in s1 or s2, and the relative order of the letters must be
preserved, meaning that s1 and s2 must both be subsequences of w.
For example, the word friendly can be divided into find + rely. This solution is valid because s1 and s2 together
contain all of the letters from w, the letters in s1 and s2 appear in the same relative order as they did in w, and both s1 and
s2 are valid words in the English dictionary.
For some words, no valid division is possible. For example, consider dividing the word stream. It would not be valid to
divide into star + me (relative order of letters not preserved), nor stem + ram (reuses 'm'), nor stem + a ('r' not
used), nor seam + tr (not in the dictionary).
You will be passed two parameters: a reference to a Lexicon of dictionary words, and the string w to divide. Use the
Lexicon passed to ensure that the divided words are both found in the dictionary. If the function finds a valid division, it
prints the two words and searches no further. If no valid division is found, the function prints nothing.
The following table lists some example calls to your function and possible console output, given a Lexicon called dict.
Rows showing blank output indicate calls that were unable to find any successful division of the given word.
| Call | Output |
|---|---|
divide(dict, "friendly") |
find + rely |
divide(dict, "standing") |
sting + and |
divide(dict, "midterm") |
mid + term |
divide(dict, "recuperate") |
cure + repeat |
divide(dict, "stream") |
 |
divide(dict, "atomic") |
 |
divide(dict, "czar") |
 |
divide(dict, "xyz") |
 |
If there are multiple valid divisions for word w, you may print any of them. Regardless of which division you print, your
code must stop after finding a single solution; do not find/print all possible solutions.
Efficiency: While your code is not required to have any particular Big-Oh, you may lose points if your code is extremely
inefficient or repeatedly re-explores the same path multiple times. In particular, your recursive exploration should not
continue exploring a path where the prefix of the word you are examining cannot possibly make a real English word. Use
the Lexicon to help you prune such unwanted searches. Pass all data structures by reference to avoid making copies.
Helpers: As with any coding problem on this exam, you may write helper functions as needed to help solve the problem.
Constraints: For full credit, obey the following restrictions. A solution that disobeys them can get partial credit.
- Do not declare any global variables.
- Your overall algorithm must be recursive and must use backtracking techniques to generate the results.
void divide(Lexicon& lex, string input) {
divideHelper(lex, input, "", "");
}
bool divideHelper(Lexicon& lex, string input, string left, string right) {
if (input.empty()) {
if (lex.contains(left) && lex.contains(right)) {
cout << left << " + " << right << endl;
return true;
} else {
return false;
}
} else if (!lex.containsPrefix(left) || !lex.containsPrefix(right)) {
return false;
} else {
char ch = input[0];
string rest = input.substr(1);
return (divideHelper(lex, rest, left + ch, right)
|| divideHelper(lex, rest, left, right + ch));
}
}