Practice Midterm

Although the actual midterm will be provided as a PDF and files will be submitted on Gradescope, the problems below otherwise are representative of the structure and length to what you can expect on the actual midterm. We recommend trying these in a timed setting and using the code .zip file below to simulate a similar environment as the actual exam before looking at the solutions.

DOWNLOAD TO START: Code zip file (similar to what you will be provided on the exam)

Instructions

Please note that these instructions do not apply to this practice exam, but we are replicating the instructions that you can expect to see on the actual exam.

Gradescope instructions

As soon as you clicked on the “Start” button for the diagnostic, Gradescope launched a timer that indicates the period within which you need to submit your answers. For each question on this diagnostic, you received a .cpp file in the downloaded starter code (linked on Gradescope along with this PDF). You should open the Diagnostic.pro file in Qt Creator, since you will be writing all of your code and short answers in Qt Creator. To allow you to focus on problem-solving rather than compiling or debugging your code, we have removed the components of the starter code that would make the programs runnable in Qt Creator. So please do not hit the run button in Qt Creator from within the provided starter code – if you do, you will encounter a lot of compiler errors.

Once you are finished, you’ll be able to upload each file that corresponds to the specific diagnostic question on Gradescope (don't close the tab that you initially opened). Please double-check that you saved your code and short answers and that you turned in the correct .cpp file for each question. You will be able to resubmit any files up until the timer expires.

Honor Code instructions

Make sure to read and understand the specifications on allowable help resources described below – we expect all students to abide by the Honor Code when taking the midterm.

This midterm is open-book, open-notes, and open-course website – this includes the Stanford Library documentation and the Ed discussion forum. You are not allowed to communicate with any other humans during the midterm except the course staff. This means, for example, that it is perfectly permissible to visit the course website and to use Qt Creator. However, you are not allowed to communicate with other humans in any way during the midterm, nor are you permitted to post the questions anywhere where other humans besides the course staff could see them or offer advice or suggestions on them (for example, by posting publicly on Ed or emailing questions to other people). We will run your midterm answers through similarity detection software to ensure you have not received impermissible help on the assessment.

Unless otherwise indicated as part of the instructions for a specific problem, comments will not be required on the midterm. Uncommented code that gets the job done will be sufficient for full credit on the problem. On the other hand, pseudocode and comments may help you get partial credit if they help us determine what you were trying to do. You do not need to worry about efficiency unless a problem states otherwise. You don’t need to prototype helper functions you write, and you don’t need to explicitly #include libraries you want to use.

This midterm is “self-contained” in the sense that if you’re expected to reference code from the lectures, textbook, assignments, or section handouts, we’ll explicitly mention what that code is and provide sufficient context for you to use it.

Problem 1: Interpreting Code

Please write your answers for this problem in 1-interpreting.cpp.

Part A: Analyzing Code Output and Runtime

This question is designed to let you demonstrate what you’ve learned about algorithmic analysis and data structure design. Below are four functions. We picked one of those functions and ran it on many different values of n. We captured the output of that function on each of those inputs, along with the runtime. That information is printed in the table below.

Function 1

int function1(int n) { 
    Stack<int> values; 
    for (int i = 0; i < n; i++) { 
        values.push(i); 
    }
    int result; 
    while (!values.isEmpty()) { 
        result = values.pop(); 
    } 
    return result; 
} 

Function 2

int function2(int n) {
    Queue<int> values;
    for (int i = 0; i < n; i++) { 
        values.enqueue(i); 
    } 
    int result; 
    while (!values.isEmpty()) { 
        result = values.dequeue(); 
    } 
    return result; 
} 

Function 3

int function3(int n) { 
    Set<int> values; 
    for (int i = 0; i < n; i++) { 
        values.add(i); 
    } 
    int result;
    for (int value: values) { 
        result = value; 
    }
    return result; 
} 

Function 4

int function4(int n) { 
    Vector<int> values; 
    for (int i = 0; i < n; i++) { 
        values.add(i); 
    }
    int result; 
    while (!values.isEmpty()) { 
        result = values[0]; 
        values.remove(0); 
    }
    return result; 
} 

`n`	Time	Return Value
100,000	0.137s	99999
200,000	0.274s	199999
300,000	0.511s	299999
400,000	0.549s	399999
500,000	0.786s	499999
600,000	0.923s	599999
700,000	0.960s	699999
800,000	1.198s	799999
900,000	1.335s	899999
1,000,000	1.472s	999999

For each of these pieces of code, tell us its big-O runtime as a function of n. No justification is required.
For each of these pieces of code, tell us whether that function could have given rise to the return values reported in the rightmost column of the table below. No justification is required.
Which piece of code did we run? How do you know? Justify your answer in at most fifty words.

Solution

The runtimes are as follows:
- function1 runs in time O(n).
- function2 runs in time O(n).
- function3 runs in time O(n log n).
- function4 runs in time O(n^2).

Some explanations: In function1, we do n pushes followed by n pops. The cost of each stack operation is O(1), so this means we’re doing 2n operations at an effective cost of O(1) each, for a net total of O(n). The same is true about queue operations: each one takes time O(1), which is why function2 takes time O(n) as well.

For function3, inserting an element into a set takes time O(log n). This means that the cost of inserting n elements is O(n log n). The cost of iterating over the set is also O(n) so the net runtime is O(n log n).

For function4, adding n elements to the end of a vector takes time O(n). However, removing from the front of a vector with n elements also takes time O(n), since we have to shift all the other elements back one position. This means that the overall runtime is O(n^2).

The answers: function1 cannot produce the given output. function2 will always produce this output. function3 will always produce this output. function4 will always produce this output.

Some explanations: In function1, since elements are stored in a stack, the last element popped is the first element pushed, which would always be zero. Therefore, we’d expect to see a column of zeros in the table, which doesn’t match what’s actually there.

In function2, the last element removed from the queue is the last element added to the queue, which, here, would be n – 1, matching the output.

In function3, the Set type stores its elements in sorted order. Iterating over the set, therefore, will visit the elements in ascending order, so the last element iterated over by the loop would be n – 1, matching the output.

Finally, in function4, we remove elements from the vector in the reverse order in which they’re added, matching the queue’s ordering and making the last element visited exactly n – 1.

First, notice that the runtime appears to be O(n); doubling the size of the inputs roughly doubles the runtime. That leaves function1 and function2 as choices, and function1 has the wrong return value. Therefore, we must have run function2.

Part B: Tracing a Recursive Function

For each of the calls to the following recursive function below, indicate what output is produced:

void recursionMystery2(int n) { 
    if (n <= 1) { 
        cout << ": ";
    } else {
        cout << (n % 2) << " "; 
        recursionMystery2(n / 2); 
        cout << n << " "; 
    } 
}

Call	Output
a) `recursionMystery2(8);`
b) `recursionMystery2(25);`
c) `recursionMystery2(46);`

Solution

Call	Output
a) `recursionMystery2(8);`	`0 0 0 : 2 4 8`
b) `recursionMystery2(25);`	`1 0 0 1 : 3 6 12 25`
c) `recursionMystery2(46);`	`0 1 1 1 0 : 2 5 11 23 46`

Problem 2: ADTs

Please write your answers for this problem in 2-adts.cpp.

Google Translate is powered, in large part, by a technique called word2vec that builds an understanding of a language by looking at the context in which each word occurs.

Imagine you have a word like "well." There are certain words that might reasonably appear immediately before the word "well" in a sentence, like "feeling," "going," "reads," etc., and some words that that are highly unlikely to appear before "well," like "cake," "circumspection," and "election." The idea behind word2vec is to find connections between words by looking for pairs of words that have similar sets of words preceding them. Those words likely have some kind of connection between them, and the rest of the logic in word2vec works by trying to discover what those connections are.

Your task is to write a function

Map<string, Set<string>> predecessorMap(Vector<string>& input);

that takes as input a Vector of strings representing the lines of a file, then returns a Map<string, Set<string>> that associates each word in the file with all the words that appeared directly before that word. For example, given a Vector containing the lines of JFK’s famous quote:

"Ask not what your country can do for you;
ask what you can do for your country,"

that looks like this

{""Ask not what your country can do for you;", "ask what you can do for your country,""}

your function should return a Map with these key/value pairs:

"not" : { "ask" } 
"what" : { "not", "ask" } 
"your" : { "what", "for" } 
"country" : { "your" } 
"can" : { "country", "you" } 
"do" : { "can" } 
"for" : { "do" } 
"you" : { "for", "what" }
"ask" : { "you" }

Notice that although the word "ask" appears twice in the quote, the first time it appears it’s the first word in the file and so nothing precedes it. The second time it appears, it’s preceded by a word from a different line, but before that is the word "you," which is what ultimately appears in the output map.

Some notes on this problem:

You can assume you have access to a function

Vector<string> tokenize(string line);

that takes in a line, accurately tokenizes it into cleaned words (stripped of all punctuation, but with the original casing retained), and returns all of the words in the line as a Vector of strings.

If a word in a line contains no letters, it gets stripped to the empty string by the aformentioned tokenize function. Your map-building process should completely ignore these tokens.
Your code should be case-insensitive, so it should return the same result regardless of the capitalization of the words in the file. The capitalization of the keys in the map is completely up to you.
It is not guaranteed that the file has any words in it, and there’s no upper bound on how big the file can be.

Fill in the function below:

Vector<string> tokenize(string line); // Provided to you

Map<string, Set<string>> predecessorMap(Vector<string>& input) {
    /* FILL ME IN */
}

Solution

Map<string, Set<string>> predecessorMap(Vector<string>& input) {
    Map<string, Set<string>> result;
    string lastWord;  // Most-recently-read string, initially empty as a sentinel.
    
    /* Loop over all lines */
    for (string line: input) {
        Vector<string> tokens = tokenize(line);
        for (string token: tokens) {
            if (token != "") {
                token = toLowerCase(token);
                if (lastWord != "") {
                    result[token].add(lastWord);
                }
                lastWord = token;
            }
        }
    }
    return result;
}

Problem 3: Recursion

Chapter exercise 7.7 from the textbook

Please write your answers for this problem in 3-recursion.cpp.

Write a recursive function digitSum(n) that takes a nonnegative integer and returns the sum of its digits. For example, calling digitSum(1729) should return 1 + 7 + 2 + 9, which is 19.

The recursive implementation of digitSum depends on the fact that it is very easy to break an integer down into two components using division by 10. For example, given the integer 1729, you can divide it into two pieces as follows:

                            1729
                             / \
                          127   9

Each of the resulting integers is strictly smaller than the original and thus represents a simpler case.

Solution

int digitSum(int n) {
    if (n < 10) {
        return n;
    } else {
        return digitSum(n / 10) + digitSum(n % 10);
    }
}

Problem 4: Backtracking Recursion (Optional Extra Credit)

Note: This problem is an example of a problem that might appear as an optional extra credit problem on the midterm. This problem requires the material on recursive backtracking covered in the Thursday, July 8 lecture.

Please write your answers for this problem in 4-backtracking.cpp.

Write a recursive function createDieSequence(int numRolls) that takes a nonnegative integer numRolls and prints out all of the possible results you can get when you roll a 6-sided die numRolls times.

For example, calling createDieSequence(1) would produce simply:

Calling createDieSequence(2) should produce the following console output:

The first line represents rolling two 1s in a row, the second line represents rolling a 1 and then a 2, and so on and so forth. Since we care about the order in which the numbers are rolled, note that we print out both 12 and 21.

With the help of your code, someone could more easily determine the probability of particular combinations of die rolls in games of chance!

Helpers: As with any coding problem on this diagnostic, you may write helper functions as needed to help solve the problem.

Constraints: For full credit, obey the following restrictions. A solution that disobeys them can get partial credit.

Do not declare any global variables.
Your overall algorithm must be recursive and must use backtracking techniques to generate the results.

Solution

void createDieSequenceHelper(int numRolls, string sequence) {
    if (numRolls == 0) {
        cout << sequence << endl;
    } else {
        for (char i = '1'; i < '7'; i++) {
            createDieSequenceHelper(numRolls - 1, sequence + i);
        }
    }
}

void createDieSequence(int numRolls) {
    createDieSequenceHelper(numRolls, "");
}