Due Tuesday, May 27 at 1:00 pm Pacific
Solutions are available!
This assignment may be completed individually or with a partner.
This problem set is all about amortized efficiency and how to design powerful data structures that fit into that paradigm. You'll get to play around with the data structures we saw in lecture, plus a few others from earlier. By the time you've finished this problem set, you'll have an excellent handle on how amortization works and how to think about problem-solving in a new way.
$\LaTeX$ Template
Assignments are required to be typed. We've provided a $\LaTeX$ template for this assignment, though you're not required to use it.
Problem One: Stacking the Deque
This problem involves writing C++ code. You can find the starter files at /usr/class/cs166/assignments/a5/. As usual, use cp -r to copy the files and /usr/class/cs166/bin/submit to submit them. Your code needs to compile without warnings (-Wall -Werror) and run cleanly under valgrind. And, of course, your code should be well-commented and so beautiful to behold that it brings tears to your eyes. Check the README for details about how to build and run everything.
A deque (double-ended queue, pronounced "deck") is a data structure that acts as a hybrid between a stack and a queue. It represents a sequence of elements and supports the following six operations:
- deque.push_front(x), which adds $x$ to the front of the sequence.
- deque.push_back(x), which adds $x$ to the back of the sequence.
- deque.front(), which returns (but does not remove) the front element of the sequence.
- deque.back(), which returns (but does not remove) the last element of the sequence.
- deque.pop_front(), which removes (but does not return) the front element of the sequence.
- deque.pop_back(), which removes (but does not return) the last element of the sequence.
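If it's helpful, here is a minimal sketch of the behavior these six operations should exhibit. It uses the standard std::deque purely as a behavioral reference; your implementation, of course, must be built from three stacks rather than from std::deque.

```cpp
#include <deque>
#include <iostream>

int main() {
    // std::deque is used here only to illustrate the expected behavior of
    // the six operations; it is not part of the intended solution.
    std::deque<int> dq;

    dq.push_back(2);    // sequence: 2
    dq.push_front(1);   // sequence: 1 2
    dq.push_back(3);    // sequence: 1 2 3

    std::cout << dq.front() << '\n';  // prints 1
    std::cout << dq.back()  << '\n';  // prints 3

    dq.pop_front();     // sequence: 2 3
    dq.pop_back();      // sequence: 2

    std::cout << dq.front() << '\n';  // prints 2
}
```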
Your goal in this problem is to design a deque using three stacks such that each operation runs in amortized time $O(1)\text,$ under the assumption that each operation on a stack takes time $O(1)\text.$ You may not use any auxiliary or helper data structures other than these stacks. You'll do this in two steps. First, you'll design a data structure meeting these time bounds and code up your implementation. Next, you'll write up an analysis of the three-stack deque to explain why each operation takes amortized time $O(1)\text.$
- Implement a three-stack deque in the files ThreeStackDeque.h/.cpp. Your implementations of front, back, pop_front, and pop_back should throw exceptions of type std::out_of_range if the deque is empty.

  In C++, stacks are represented by the type std::stack<T>, which is defined in the <stack> header. The operations on stacks that you'll need to use are

  - stack.size(), which returns the size of the stack;
  - stack.empty(), which returns whether the stack is empty;
  - stack.push(x), which pushes $x$ onto the stack;
  - stack.top(), which returns the top element of the stack; and
  - stack.pop(), which pops the stack (but does not return the top element).

  (A short usage sketch of these operations appears after this problem's two parts.)

  Note that running a program under valgrind markedly slows it down, so don't worry if you're failing the time tests when valgrind is engaged.
- Give a brief description of your data structure in plain English, the way you'd write up a solution to a non-coding question. Then, define a potential function $\Phi$ and use the potential method to argue that the amortized cost of each of the six operations on your three-stack deque is $O(1)\text.$ You do not need to argue correctness. We're expecting your amortized analysis to be written up in a manner similar to the formal analyses we did in lecture of the two-stack queue, dynamic array, and B-tree construction algorithm.
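Here is the promised sketch of the std::stack operations from the first part. Nothing in it is specific to the intended solution; it just shows how the five operations fit together.

```cpp
#include <iostream>
#include <stack>

int main() {
    std::stack<int> s;

    s.push(1);                         // stack: 1
    s.push(2);                         // stack: 1 2   (2 is on top)

    std::cout << s.size()  << '\n';    // prints 2
    std::cout << s.top()   << '\n';    // prints 2
    s.pop();                           // removes 2; returns nothing
    std::cout << s.empty() << '\n';    // prints 0 (false); 1 remains
}
```

As a reminder of the setup from lecture, the potential method charges each operation its actual cost plus the change in potential, so that $\text{amortized cost} = \text{actual cost} + \Phi_{\text{after}} - \Phi_{\text{before}}\text.$ Summing over a sequence of operations telescopes the potential terms, so as long as $\Phi$ never drops below its initial value, the total actual cost is bounded by the total amortized cost.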
Problem Two: Very Dynamic Prefix Parity
Earlier this quarter, you designed a data structure for the dynamic prefix parity problem. This data structure supports the following operations:
- initialize(n), which creates a new data structure representing an array of $n$ bits, all initially zero. This takes time $O(n)\text.$
- ds.flip(i), which flips the $i\text{th}$ bit of the bitstring. This takes time $O\left( \frac{\log n}{\log \log n} \right)\text.$
- ds.prefix-parity(i), which returns the parity of the subarray consisting of the first $i$ bits of the array. (The parity is 0 if there are an even number of 1 bits and 1 if there are an odd number of 1 bits.) This has the same runtime as the flip operation, $O\left( \frac{\log n}{\log \log n} \right)\text.$
Now, consider the very dynamic prefix parity problem (or VDPP). In this problem, you don't begin with a fixed array of bits, but instead start with an empty sequence. You then need to support these operations:
- initialize(), which constructs a new, empty VDPP structure.
- ds.append(b), which appends bit $b$ to the bitstring.
- ds.flip(i), which flips the $i\text{th}$ bit of the bitstring.
- ds.prefix-parity(i), which returns the parity of the subarray consisting of the first $i$ bits of the array.
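To pin down the intended semantics (and only the semantics), here is a deliberately naive reference sketch in which prefix-parity takes time linear in $i\text.$ Your structure must produce the same answers while meeting the amortized bound below. The class name, the zero-indexing of flip, and the exception choice are our own illustrative assumptions.

```cpp
#include <stdexcept>
#include <vector>

// Naive reference for the VDPP interface: correct behavior, wrong runtime.
// Intended only to clarify what each operation should return.
class NaiveVDPP {
public:
    NaiveVDPP() = default;                     // initialize()

    void append(bool b) {                      // append(b)
        bits.push_back(b);
    }

    void flip(std::size_t i) {                 // flip(i); zero-indexed here
        bits.at(i) = !bits.at(i);
    }

    bool prefixParity(std::size_t i) const {   // parity of the first i bits
        if (i > bits.size()) throw std::out_of_range("prefix too long");
        bool parity = false;
        for (std::size_t j = 0; j < i; j++) parity ^= bits[j];
        return parity;
    }

private:
    std::vector<bool> bits;
};
```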
Design a data structure for VDPP such that all three operations run in amortized time $O\left( \frac{\log n}{\log \log n} \right)\text,$ where $n$ is the number of bits in the sequence at the time the operation is performed. Briefly argue correctness, and prove that you meet the required amortized time bounds by defining a potential function $\Phi$ and working out the amortized costs of each operation. We're expecting your amortized analysis to be written up in a manner similar to the formal analyses we did in lecture of the two-stack queue, dynamic array, and B-tree construction algorithm.
Problem Three: Palos Altos
The order of a tree in a Fibonacci heap establishes a lower bound on the number of nodes in that tree (a tree of order $k$ must have at least $F_{k+2}$ nodes in it). Surprisingly, the order of a tree in a Fibonacci heap does not provide an upper bound on how many nodes are in that tree.
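For concreteness, with the usual convention $F_1 = F_2 = 1\text,$ the lower bound works out to

$$F_2 = 1\text,\quad F_3 = 2\text,\quad F_4 = 3\text,\quad F_5 = 5\text,$$

so a tree of order 0 has at least one node, a tree of order 1 has at least two, a tree of order 2 has at least three, and a tree of order 3 has at least five.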
Prove that for any positive natural numbers $k$ and $m\text,$ there's a sequence of operations that can be performed, starting on an empty Fibonacci heap, so that the resulting Fibonacci heap's collection of trees includes a tree of order exactly equal to $k$ that has at least $m$ nodes. (This shows that there is no upper bound on the number of nodes in a tree of a given order; for any number $m\text,$ you can force the tree to have at least $m$ nodes.)
Illustrate the sequence of operations you get with $k = 2$ and $m = 8$ and show the resulting tree. You don't need to draw out each step, but should provide enough detail to convince us (and yourself!) that your sequence indeed produces the correct tree.
Problem Four: Meldable Heaps with Addition
Meldable priority queues support the following operations:
- new-pq(), which constructs a new, empty priority queue;
- pq.enqueue(v, k), which inserts element $v$ with key $k\text;$
- pq.find-min(), which returns an element with the least key;
- pq.extract-min(), which removes and returns an element with the least key; and
- meld(pq₁, pq₂), which destructively modifies priority queues $pq_1$ and $pq_2$ and produces a single priority queue containing all the elements and keys from $pq_1$ and $pq_2\text.$
Some graph algorithms also require the following operation:
- pq.add-to-all(Δk), which adds $\Delta k$ to the keys of each element in the priority queue.
Although a meldable priority queue has $n$ nodes in it, it's possible to implement add-to-all without touching all of these nodes.
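If it helps to have a concrete target, here is one possible (hypothetical) C++ rendering of the interface above. The names and signatures are our own, purely for concreteness, and nothing about the intended implementation is implied.

```cpp
// A hypothetical interface for a meldable priority queue with add-to-all.
// Names and types here are illustrative only, not a required API.
template <typename Value, typename Key>
class MeldablePQ {
public:
    virtual ~MeldablePQ() = default;

    virtual void enqueue(const Value& v, const Key& k) = 0;  // pq.enqueue(v, k)
    virtual const Value& findMin() const = 0;                // pq.find-min()
    virtual Value extractMin() = 0;                          // pq.extract-min()
    virtual void addToAll(const Key& deltaK) = 0;            // pq.add-to-all(Δk)

    // meld(pq1, pq2): destructively combines this queue with other, leaving
    // all elements and keys from both in the resulting queue.
    virtual void meld(MeldablePQ& other) = 0;
};
```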
- Show how to modify an eager (non-lazy) binomial heap so that new-pq, enqueue, find-min, extract-min, and add-to-all each run in worst-case $O(\log n)$ time and the meld operation runs in time $O(\log m + \log n)\text,$ where $m$ and $n$ are the sizes of the respective heaps. Briefly argue correctness.

  A note on this problem: if it weren't for meld, you could support add-to-all in time $O(1)\text.$ (Do you see why?) We recommend that, from the beginning, you think about how you'd meld together two different heaps that have had different $\Delta k\text{'s}$ added to them.
- Show how to modify a lazy binomial heap so that new-pq, enqueue, find-min, meld, and add-to-all all run in amortized time $O(1)$ and extract-min runs in amortized time $O(\log n)\text.$ Briefly argue correctness, then do an amortized analysis with the potential method to prove runtime bounds. Some hints:

  - Try to make all operations have worst-case runtime $O(1)$ except for extract-min. Your implementation of extract-min will probably do a lot of work, but if you've set it up correctly the amortized cost will only be $O(\log n)\text.$ This means, in particular, that you will only propagate the $\Delta k\text{'s}$ through the data structure in extract-min.
  - If you only propagate $\Delta k\text{'s}$ during an extract-min as we suggest, you'll run into some challenges trying to meld two lazy binomial heaps with different $\Delta k\text{'s.}$ To address this, we recommend that you change how meld is done to be even lazier than the lazy approach we discussed in class. You might find it useful to construct a separate data structure tracking the melds that have been done and then only actually combining the heaps together during an extract-min.
  - Depending on how you set things up, to get the proper amortized time bound for extract-min, you may need to define your potential function both in terms of the structure of the lazy binomial heaps and in terms of the auxiliary data structure hinted at by the previous point.
In your writeup, don't just describe the final data structure all at once. Instead, walk us through the design. Explain why each piece is there, why it's needed, and how the whole structure comes together. Briefly argue correctness, and prove that you meet the required amortized time bounds by defining a potential function $\Phi$ and working out the amortized costs of each operation.

We're expecting your amortized analysis to be written up in a manner similar to the formal analyses we did in lecture of the two-stack queue, dynamic array, and B-tree construction algorithm.
- Once you finish this last problem, reflect on just how far you've come over the course of the quarter. Did you imagine you'd be able to create something like this?