Binary merge
The binary merge operation is a very common operation in computer science, where two individual sorted sequences are merged into one larger sorted sequence. Merging is the backbone of many different sorting tasks (some of which we will explore later in the course), making this operation useful in a broad array of applications.
Implementing merge
Your first task is to write the function
Queue<int> merge(Queue<int> one, Queue<int> two)
that performs an iterative binary merge.
- The elements of Queues
one
andtwo
are expected to be in increasing order from front to back. The returned result is a Queue containing the combined elements fromone
andtwo
in increasing order from front to back. - The Queues
one
andtwo
are passed by value, somerge
receives copies of the input queues and can freely modify them. There is no requirement on the ending state of those queue copies. - The two Queues are not required to be the same length. One could be be enormous; the other could be empty. Be sure your function handles all possibilities.
- There is no special-case handling of duplicates. Merging
{1, 2, 2}
with{1, 3}
results in{1, 1, 2, 2, 3}
. - A slapdash copy/paste approach to writing merge can result in messy and repetitive code. Instead, work to structure your code to unify the common parts rather than repeat yourself. There is a tidy and compact solution that is quite lovely and we know you can achieve it! As a hint: think about breaking the task down into two subtasks: choosing which element to handle next and then processing it.
- Important: you must implement binary merge using iteration, not recursion.
- Although it is possible and even quite elegant to solve recursively, the cost of one stack frame per element being merged is much too high to bear and would cause the function to be unable to merge larger sequences. The limit on the maximum length sequence would depend on the callstack. Review the information you gathered in the warmup and answer the following questions in
short_answer.txt
:
- Although it is possible and even quite elegant to solve recursively, the cost of one stack frame per element being merged is much too high to bear and would cause the function to be unable to merge larger sequences. The limit on the maximum length sequence would depend on the callstack. Review the information you gathered in the warmup and answer the following questions in
Q9. Give a rough estimate of the maximum length sequence that could be successfully merged on your system assuming a recursive implementation of binary merge.
Q10. What would be the observed behavior if attempting to recursively merge a sequence larger than that maximum?
Merge preconditions
The binary merge
function states that the two input queues are expected to be in sorted order. However we've seen that blithely assuming all inputs confirm to the precondition can lead to trouble (such as the warmup when assuming the factorial
of would be non-negative). Rather than make assumptions, a better practice is to verify the precondition and raise an error if violated.
You are to add validation of the precondition to your binary merge
function. Rather than assume the input queues are in sorted order, you should add code to check the order. If an out-of-order element is detected, call the error
function with a descriptive message to report the problem.
Below we propose two possible approaches to validating the precondition. You may implement either (or a variant of your own choosing):
- A straightforward approach is to confirm the order of the input queues in a separate pass before merging. Write a helper function that inspects the contents of a single queue and raises an error if an element is found out of order. Call the helper on each of the two input queues.
- This alternative is a little tricker, but does not require an extra pass, making it more efficient. In this approach, you confirm the sorted order while completing the merge. If you ever encounter an element that seems to be misplaced when building the merged queue, raise an error. If you choose this approach, you do not need a helper function, as the error checking will be built into the logic of the merge operation itself.
Be sure to thoroughly test merge
before moving on. You will use this function a lot in the rest of this portion of the assignment, and you want to put in the time to test your code now so that you can use it later with confidence. In particular, consider writing tests that pass in queues of differing lengths, and queues that are not originally in sorted order, to make sure your error handling code works as expected.
Analyzing binary merge
Recall the TIME_OPERATION
macro that we introduced in Assignment 1, which allows us to measure the amount of time it takes for a certain function call to execute. To refresh your memory, an example code snippet is shown below.
PROVIDED_TEST("Time operation vector sort")
{
Vector<int> v = {3, 7, 2, 45, 2, 6, 3, 56, 12};
TIME_OPERATION(v.size(), v.sort());
}
As before, the test results for this test will include the measured time of execution:
Time operation vector sort
Line 174 Time v.sort() (size = 9) completed in 0 secs
In Assignment 1, you timed the same operation over a range of different input sizes and then qualitatiely reasoned about the growth of the amount of work that is done relative to the input size. Now, equipped with your knowledge of formal algorithmic analysis tools like Big-O, you will be able to use these time trials to identify and prove the Big O of the real algorithms that you just implemented.
First, predict what you expect to be the Big O of your binary merge
function. Formally you should express your Big O in terms of n
, where n
represents the total number of elements in the final merged queue. Alternatively, you can think of n
as representing the sum of the number of terms in the two provided queues (one
and two
).
Next, use the timing operation to measure the execution time over 4 or more different sizes. Note that for small inputs, the timing data can be very noisy and for large inputs, it may require a lot of patience to wait for the tests to complete running. Experiment to find sizes in the "sweet spot" – large enough to be fairly stable but small enough to complete under a minute.
In short_answer.txt
:
Q11. Include the data from your execution timing and explain how is supports your Big O prediction for binary merge
.
Multiway merge
Binary merge always receives exactly two sorted sequences to merge. A multiway merge or k-way merge is an extension of binary merge. Instead of two input sequences, a multiway merge receives k
sorted sequences to merge into one sorted output.
We have provided in the starter code a version of multiMerge
that is built on top of your binary merge
. The function is also replicated below for convenience. It iterates over all of the input queues, repeatedly calling binary merge
to do a pairwise merge of each queue into the result.
Queue<int> multiMerge(Vector<Queue<int>>& all)
{
Queue<int> result;
for (Queue<int> q : all) {
result = merge(q, result);
}
return result;
}
Testing and analyzing multiMerge
- Assuming that you have correctly implemented binary
merge
, our providedmultiMerge
should work correctly. Run the provided tests, and confirm that the function works as expected. - The function may be asked to merge 0 queues (represented by an empty
Vector
) or many empty queues (represented by aVector
of empty queues). Trace through the provided implementation and predict how it would handle such input. Add at least 2-3 test cases to confirm that the function behaves as expected in these scenarios. - Now, predict what you expect to be the Big O of the
multiMerge
function. Formally, you should express your Big O in terms of two quantities that can vary in size. The first isn
, wheren
represents the total number of elements in the final merged queue (alternatively, you can think of this as the total number of elements across all provided sequences that you've been asked to merge). The second isk
, wherek
represents the total number of distinct individual sequences that you are being asked to merge. Then, use the timing operation to measure the execution time over 5 or more different sizes ofn
(keepingk
fixed) and over 5 or more different sizes ofk
(keepingn
fixed). Inshort_answer.txt
:
Q12. Include the data from your execution timing and explain how it supports your Big O prediction for multiMerge
.
Divide and conquer to the rescue
The above implementation of multiMerge
really slows down for longer sequences, since the merge operation is repeatedly applied to longer and longer sequences.
Implementing recMultiMerge
Your final task is to write the function
Queue<int> recMultiMerge(Vector<Queue<int>>& all)
that applies the power of divide-and-conquer to implement a much more efficient variant of multiway merge that operates recursively.
The recursive strategy for recMultiMerge
follows these steps:
- Divide the vector of
k
sequences (queues) into two halves. The "left" half is the firstk/2
sequences (queues) in the vector, and the "right" half is the rest of the sequences (queues).- The Vector class has a helpful
subList
operation to subdivide a Vector.
- The Vector class has a helpful
- Make a recursive call to
recMultiMerge
on the "left" half of the sequences to generate one combined, sorted sequence. Then, do the same for the "right" half of the sequences to generate a second combined, sorted sequence. - Use your binary
merge
function to join the two combined sequences into the final result sequence, which is then returned.
In addition to this recursive strategy, we encourage you to think about what the base case(s) for this function will be.
In the diagram below, we drew the contents of each queue in a vertical yellow box. The initial collection has seven sequences. The dotted line shows the division into left and right halves of 3 and 4 sequences, respectively. Each half is recursively multiway merged. The combine step calls binary merge to produce the final result.
Analyzing recMultiMerge
You can prove that the runtime for recMultiMerge
is O(n log k)
where n
is the total number of elements over all sequences and k
is the count of sorted sequences at the start of the multiway merge.
What, exactly, does O(n log k)
mean? To start with, it might help to imagine that n
is allowed to vary, while k
stays constant. Thus, If you were to plot runtime vs. values of n
(assuming k
is kept constant), you’ll get a straight line with a slope that depends on k
. For example, if you have a small value of k
, then log k
is also small, and your line would have a small slope.
Now, let's think about how this plot would change if we started varying values of k
as well. If you were to increase the value of k
, the slope of the line from our original plot would increase a little bit, but not by much because log k
grows extremely slowly. In particular, if you look at the line for values of
k
that grow by a factor of 4 every time (say, k
= 1, then 4, then 16, then 64, then 256), the slope of the lines would increase by some small fixed rate (greater than 1 but much smaller than 4). This is the classic benefit of log k
behavior!
Given an input of n
elements that is entirely unsorted, we could first place each element in its own sequence (a singleton sequence is trivially sorted). We then call recMultiMerge
on that collection of n
sequences. This would make k
equal to n
(the maximal possible value for k
) and the entire operation would run in time O(n log n)
. The classic mergesort algorithm uses an approach similar to this since it assumes the input data is completely unsorted (more on that coming up when we discuss sorting and Big O). In our recMultiMerge
algorithm, the elements that are already arranged into sorted sequences just give us a jump start on the process. In other words, the smaller the value of k
, the more sorted our input data already is – work has already been done for us, and the whole process can complete more quickly!
To finish off this task:
- Write test cases that compare
recMultiMerge
to the non-recursivemultiMerge
. It should handle the same inputs in all the same ways, but finish the job a heck of a lot faster. - Use the timing operation to measure the execution time over 5 or more different sizes of
n
(keepingk
fixed) and over 5 or more different sizes ofk
(keepingn
fixed). Inshort_answer.txt
:
Q13. Include the data from your execution timing and explain how it supports the O(n log k)
runtime prediction for recMultiMerge.
Q14. Earlier you worked out how a recursive implementation of merge
would limit the allowable input sequence length. Are the limits on the call stack also an issue for the recursive implementation of recMultiMerge
? Explain why or why not.
Notes
- Your binary
merge
function should operate iteratively (not recursion). YourrecMultiMerge
should operate recursively (not iteration). - Do not edit the provided code for
multiMerge
. ThemultiMerge
function is used solely for testing and timing as a comparison to yourrecMultiMerge
. YourrecMultiMerge
should not callmultiMerge
; yourrecMultiMerge
should call your binarymerge
function. (That's a lot of merges! Be sure you know the role for each.) - The assignment requires you to write your own implementation of
merge
andrecMultiMerge
. Your code should not call the Vectorsort
method nor any other sort or merge from a library (whether C++ standard, Stanford, or otherwise).