Assign3: Merge

Binary merge

The binary merge operation is a very common operation in computer science, where two individual sorted sequences are merged into one larger sorted sequence. This operation serves as the backbone of many different sorting algorithms (some of which we will explore later in the course), which themselves are used in a broad diversity of programs and applications. In that way, the binary merge is the hardworking worker bee at the core of many interesting algorithms and applications!

Your first task is to write the function

Queue<int> merge(Queue<int> one, Queue<int> two)

that performs a binary merge.

The elements of Queues one and two are expected to be in increasing order from front to back. The returned result is a Queue containing all elements from one and two in increasing order from front to back.
The two Queues are not required to be the same length. One could be be enormous; the other could be empty. Be sure your function handles all possibilities.
You must implement binary merge using iteration, not recursion.
- Although it is possible and even quite elegant to solve recursively, the cost of one stack frame per element being merged is much too high to bear and would cause the function to be unable to merge larger sequences. The limit on the maximum length sequence would depend on the callstack. Review the information you gathered in the Fundamental Recursion warmup and answer the following questions in short_answer.txt:
- Q9. Give a rough estimate of the maximum length sequence that could be successfully merged on your system assuming a recursive implementation of binary merge.
- Q10. What would be the observed behavior if attempting to merge a sequence larger than that maximum?
Although we expect the provided queues one and two to be in sorted order, we've learned in previous assignments that making assumptions about input into functions that we write can be very dangerous – remember the stack overflow that happened when passing in a negative number into the factorial function? For that reason, we are also going to require you to do some validation of the precondition assumption of this function, in which you will ensure that the provided queues are actually in sorted order. If at any point in processing the sequence you find an element which is out of order, you should raise an error with a descriptive message by calling the error function.
- There are two equally valid approaches to solving this problem, and you may choose to implement either one.
- The first approach is slightly more straightforward to think about, but a little less efficient. Before starting the meat of the operation, you can inspect the contents of each of the provided queues – if either one is out of order, raise an error. If you choose this approach, we would recommend writing a helper function to handle this inspection.
- The second approach is a little more tricky to think about, but does not require an extra pass over the queues, making it a little more efficient. In this approach, you can enforce the sorted order check while completing the merge. If you ever encounter an element that seems to be misplaced when building the merged queue, raise an error. If you choose this approach, you do not need a helper function, as the error checking will be built into the logic of the merge operation itself.
Be sure to thoroughly test merge before moving on. You will use this function a lot in the rest of this portion of the assignment, and you want to put in the time to test your code now so that you can use it later with confidence. In particular, consider writing tests that pass in queues of differing lengths, and queues that are not originally in sorted order, to make sure your error handling code works as expected.

Timing and Big O

Recall the TIME_OPERATION macro that we introduced in Assignment 1, which allows us to measure the amount of time it takes for a certain function call to execute. To refresh your memory, an example code snippet is shown below.

PROVIDED_TEST("Time operation vector sort")
{
    Vector<int> v = {3, 7, 2, 45, 2, 6, 3, 56, 12};
    TIME_OPERATION(v.size(), v.sort());
}

As before, the test results for this test will include the measured time of execution:

Time operation vector sort
    Line 174 Time v.sort() (size =       9) completed in        0 secs

In Assignment 1, you timed the same operation over a range of different input sizes and then qualitatiely reasoned about the growth of the amount of work that is done relative to the input size. Now, equipped with your knowledge of formal algorithmic analysis tools like Big-O, you will be able to use these time trials to identify and prove the Big O of the real algorithms that you just implemented.

First, predict what you expect to be the Big O of your merge function. Formally you should express your Big O in terms of n, where n represents the total number of elements in the final merged queue. Alternatively, you can think of n as representing the sum of the number of terms in the two provided queues (one and two).

Next, use the timing operation to measure the execution time over 4 or more different sizes. Note that for small inputs, the timing data can be very noisy and for large inputs, it may require a lot of patience to wait for the tests to complete running. Experiment to find sizes in the "sweet spot" – large enough to be fairly stable but small enough to complete under a minute.

In short_answer.txt:

Q11. Include the data from your execution timing and explain how is supports your Big O prediction for merge.

Multiway merge

A multiway merge or k-way merge is an extension of the binary merge operation, which receives k sorted sequences (queues) and merges the sequences into a single sorted sequence (queue).

We have provided in the starter code a version of multiMerge that is built on top of your binary merge. The function is also replicated below for convenience.

Queue<int> multiMerge(Vector<Queue<int>>& all)
{
    Queue<int> result;

    for (Queue<int> q : all) {
        result = merge(q, result);
    }
    return result;
}

Assuming that you have implemented the merge function from the previous step correctly, our provided code should work correctly. Run the provided tests, and confirm that the function works as expected.
The function may be asked to merge 0 queues (represented by an empty Vector) or many empty queues (represented by a Vector of empty queues). Read and trace through the provided implementation and predict how it would handle such input. Add at least 2-3 test cases to confirm that the function behaves as expected in these scenarios.
First, predict what you expect to be the Big O of the multiMerge function. Formally, you should express your Big O in terms of two quantities that can vary in size. The first is n, where n represents the total number of elements in the final merged queue (alternatively, you can think of this as the total number of elements across all provided sequences that you've been asked to merge). The second is k, where k represents the total number of distinct individual sequences that you are being asked to merge. Then, use the timing operation to measure the execution time over 5 or more different sizes of n (keeping k fixed) and over 5 or more different sizes of k (keeping n fixed). In short_answer.txt:
- Q12. Include the data from your execution timing and explain how it supports your Big O prediction for multiMerge.

Divide and conquer to the rescue

The above implementation of multiMerge really slows down for longer sequences, since the merge operation is repeatedly applied to longer and longer sequences.

Your goal now is to write the function

Queue<int> recMultiMerge(Vector<Queue<int>>& all)

that applies the power of recursive divide-and-conquer to achieve much improved performance for multiway merge.

The recursive component of the strategy for recMultiMerge has the following important steps:

Divide the vector of k sequences (queues) into two halves. The "left" half is the first k/2 sequences (queues) in the vector, and the "right" half is the rest of the sequences (queues).
- The Vector subList operation can be used to subdivide a Vector, which you may find helpful.
Make a recursive call to recMultiMerge on the "left" half of the sequences to generate one combined, sorted sequence. Then, do the same for the "right" half of the sequences, generating a second combined, sorted sequence.
Use your binary merge function to join the two combined sequences into the final result sequence, which is then returned.

In addition to this recursive strategy, we encourage you to think about what the base case(s) for this function will be.

In the diagram below, we drew the contents of each queue in a vertical yellow box. The initial collection has seven sequences. The dotted line shows the division into left and right halves of 3 and 4 sequences, respectively. Each half is recursively multiway merged. The combine step calls binary merge to produce the final result.

Example merge from bottom up

You can prove that the runtime for recMultiMerge is O(n log k) where n is the total number of elements over all sequences and k is the count of sorted sequences at the start of the multiway merge.

What, exactly, does O(n log k) mean? To start with, it might help to imagine that nis allowed to vary, while k stays constant. Thus, If you were to plot runtime vs. values of n (assuming k is kept constant), you’ll get a straight line with a slope that depends on k. For example, if you have a small value of k, then log k is also small, and your line would have a small slope.

Now, let's think about how this plot would change if we started varying values of k as well. If you were to increase the value of k, the slope of the line from our original plot would increase a little bit, but not by much because log k grows extremely slowly. In particular, if you look at the line for values of k that grow by a factor of 4 every time (say, k = 1, then 4, then 16, then 64, then 256), the slope of the lines would increase by some small fixed rate (greater than 1 but much smaller than 4). This is the classic benefit of log k behavior!

Given an input of n elements that is entirely unsorted, we could place each element in its own sequence and call recMultiMerge on that collection of n sequences. This would make k equal to n (the maximal possible value for k). The operation would run in time O(n log n). The classic mergesort algorithm uses an approach similar to this since it assumes the input data is completely unsorted (more on that coming up when we discuss sorting and Big O). In our recMultiMerge algorithm, the elements that are already arranged into sorted sequences just give us a jump start on the process. In other words, the smaller the value of k, the more sorted our input data already is – work has already been done for us, and the whole process can complete more quickly!

To finish off this task:

Write test cases that compare recMultiMerge to the non-recursive multiMerge. It should handle the same inputs in all the same ways, but finish the job a heck of a lot faster.
Use the timing operation to measure the execution time over 5 or more different sizes of n (keeping k fixed) and over 5 or more different sizes of k (keeping n fixed). In short_answer.txt:
- Q13. Include the data from your execution timing and explain how it supports the O(n log k) runtime prediction for recMultiMerge.
- Q14. Earlier you worked out how a recursive implementation of merge would limit the allowable input sequence length. Are the limits on the call stack also an issue for the recursive implementation of recMultiMerge? Explain why or why not.