\documentclass[12pt]{exam}

\usepackage[utf8]{inputenc}  % For UTF8 source encoding.
\usepackage{amsmath}  % For displaying math equations.
\usepackage{amsfonts} % For mathematical fonts (like \mathbb{E}!).
\usepackage{upgreek}  % For upright Greek letters, such as \upvarphi.
\usepackage{wasysym}  % For additional glyphs (like \smiley!).
% For document margins.
\usepackage[left=.8in, right=.8in, top=1in, bottom=1in]{geometry}
\usepackage{lastpage} % For a reference to the number of pages.

% TODO: Enter your name here :)
\newcommand*{\authorname}{[Your name goes here]}

\newcommand*{\psetnumber}{4}
\newcommand*{\psetdescription}{Amortized Efficiency}
\newcommand*{\duedate}{Thursday, May 17th}
\newcommand*{\duetime}{2:30 pm}

% Fancy headers and footers
\headrule
\firstpageheader{CS166\\Spring 2018}{Problem Set \psetnumber\\\psetdescription}{Due: \duedate\\at \duetime}
\runningheader{CS166}{Problem Set \psetnumber}{\authorname}
\footer{}{\footnotesize{Page \thepage\ of \pageref{LastPage}}}{}

% Exam questions.
\newcommand{\Q}[1]{\question{\large{\textbf{#1}}}}
\qformat{}  % Remove formatting from exam questions.

% Useful macro commands.
\newcommand*{\ex}{\mathbb{E}}
\newcommand*{\bigtheta}[1]{\Theta\left( #1 \right)}
\newcommand*{\bigo}[1]{O \left( #1 \right)}
\newcommand*{\bigomega}[1]{\Omega \left( #1 \right)}
\newcommand*{\prob}[1]{\text{Pr} \left[ #1 \right]}
\newcommand*{\var}[1]{\text{Var} \left[ #1 \right]}

% Custom formatting for problem parts.
\renewcommand{\thepartno}{\roman{partno}}
\renewcommand{\partlabel}{\thepartno.}

% Framed answers.
\newcommand{\answerbox}[1]{
\begin{framed}
\hspace{\fill}
\vspace{#1}
\end{framed}}

\printanswers

\setlength\answerlinelength{2in} \setlength\answerskip{0.3in}

\begin{document}
\title{CS166 Problem Set \psetnumber: \psetdescription}
\author{\authorname}
\date{}
\maketitle
\thispagestyle{headandfoot}

\begin{questions}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Q{Problem One: Stacking the Deque}

A deque (double-ended queue, pronounced ``deck'') is a data structure that acts as a hybrid between a stack and a queue. It represents a sequence of elements and supports the following four operations:
\begin{itemize}
\item \texttt{deque.add-to-front(x)}, which adds \texttt{x} to the front of the sequence.
\item \texttt{deque.add-to-back(x)}, which adds \texttt{x} to the back of the sequence.
\item \texttt{deque.remove-front()}, which removes and returns the front element of the sequence.
\item \texttt{deque.remove-back()}, which removes and returns the last element of the sequence.
\end{itemize}
Typically, you would implement a deque as a doubly-linked list. In functional languages like Haskell,
Scheme, or ML, however, the pointer mutations that a doubly-linked list relies on are unavailable or
discouraged, so this implementation is not possible. In those languages, you would instead implement
a deque using three stacks.

Design a deque implemented on top of three stacks. Each operation should run in amortized time $O(1)$.

The lecture slides on amortized analysis contain an exploration of how to build a queue from two stacks
such that each operation runs in amortized $O(1)$. That might be a good starting point to build from when
working on this problem.
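For concreteness, the two-stack queue from lecture might be sketched as follows (a minimal Python sketch; the class and method names are our own, not part of the problem):

```python
class TwoStackQueue:
    """Queue built from two stacks; every operation runs in amortized O(1)."""

    def __init__(self):
        self._in = []    # receives newly enqueued elements
        self._out = []   # serves dequeues in FIFO order

    def enqueue(self, x):
        self._in.append(x)

    def dequeue(self):
        if not self._out:
            # Reverse the in-stack into the out-stack. Each element is
            # moved here at most once after its enqueue, so the cost
            # amortizes to O(1) per operation.
            while self._in:
                self._out.append(self._in.pop())
        return self._out.pop()
```

The charging argument from lecture applies directly: each element is pushed and popped at most twice over its lifetime, so any sequence of $n$ operations takes $O(n)$ time in total.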

\begin{solution}
Your solution goes here!
\end{solution}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newpage
\Q{Problem Two: Constructing Weight-Balanced Trees}

Problem Set Two introduced weight-balanced trees in the context of trie representations, but we never actually
talked about how to build them. Let's remedy that. \smiley{}

As a refresher, a \textbf{\emph{weight-balanced tree}} is a binary search tree built from a list of keys, each of which is
annotated with a positive real-valued weight. The key placed in the root node is chosen such that the difference
in weights between its left and right subtrees is as close to zero as possible.

We're going to assume that the input is provided to us as an array of $n$ key/weight pairs $(k_1, w_1), (k_2, w_2),\dots,(k_n, w_n)$ where the keys are in sorted order. The sorting requirement makes it easier to reason about
what happens if we pick a particular key to put in the root. Plus, as you'll see later on, if the keys aren't
sorted, the cost of sorting will exceed the cost of building the weight-balanced tree!

All of the algorithms we're going to explore in this problem will follow this general template:
\begin{itemize}
\item Using some search procedure, locate the key that has the optimum left/right split.
\item Place that key into the root of the tree, then recursively build weight-balanced trees for the keys
belonging to the left subtree and to the right subtree.
\end{itemize}

The question, then, is what the best search procedure is for identifying which key gets to be in the root.
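In outline, the template might be sketched as follows (a Python sketch; \texttt{naive\_split} is a placeholder for whichever search procedure we end up choosing, and here is just a simple $O(n)$ scan):

```python
class Node:
    def __init__(self, key, weight):
        self.key, self.weight = key, weight
        self.left = self.right = None

def naive_split(pairs, lo, hi):
    # Placeholder search procedure: a simple O(n) scan that tries every
    # split point and keeps the most balanced one.
    best, best_diff = lo, float("inf")
    left = 0
    right = sum(w for _, w in pairs[lo:hi + 1])
    for k in range(lo, hi + 1):
        right -= pairs[k][1]
        if abs(right - left) < best_diff:
            best, best_diff = k, abs(right - left)
        left += pairs[k][1]
    return best

def build(pairs, lo=0, hi=None):
    """Build a weight-balanced tree from sorted (key, weight) pairs."""
    if hi is None:
        hi = len(pairs) - 1
    if lo > hi:
        return None
    k = naive_split(pairs, lo, hi)        # 1. locate the optimal root key
    node = Node(*pairs[k])                # 2. place it in the root...
    node.left = build(pairs, lo, k - 1)   # ...then recursively build the
    node.right = build(pairs, k + 1, hi)  #    left and right subtrees
    return node
```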

Let's begin with a useful observation. Imagine that you scan the keys from left to right, considering each
one as a candidate for the root. As you move from the left to the right, the weight in the left subtree will
monotonically increase and the weight in the right subtree will monotonically decrease. This means that the
difference between the right subtree's weight and the left subtree's weight monotonically decreases across a
left-to-right scan. As a result, the absolute value of this difference---which is the balance factor between the
two subtrees---consists of a monotone decreasing sequence (as the difference gets closer and closer to zero
from the positive direction) followed by a monotone increasing sequence (as the difference gets further
from zero in the negative direction).

We can exploit this property to find the optimum splitting point by scanning the array from left to
right and looking for the spot where the absolute value of the difference in weights increases. The key right
before the increase then belongs in the root. However, this takes time $\Theta(n^2)$ in the worst case. (Do
you see why?) We'll instead opt for another approach: simultaneously scan inward from the left \emph{and}
the right sides of the array, looking for the crossover point, and stopping as soon as we find it.
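Concretely, the two-sided scan might look like the following Python sketch. It assumes each call knows the total weight of its subarray (computed once, in $O(n)$, at the top level and passed down through the recursion); all names are illustrative:

```python
def find_root(w, total=None):
    """Return the index k minimizing |right weight - left weight| when
    w[k] is placed at the root. The left-to-right and right-to-left
    scans run in lockstep, so the search stops after
    O(1 + min(k, n - k)) steps."""
    n = len(w)
    if total is None:
        total = sum(w)  # top level only; recursive calls pass totals down

    def bal(i, left_sum):
        # (weight right of i) - (weight left of i) if w[i] were the root
        return total - w[i] - 2 * left_sum

    i, L = 0, 0      # left scan: L = sum(w[:i])
    j, R = n - 1, 0  # right scan: R = sum(w[j+1:])
    while True:
        # Left-to-right step: stop at the first index where moving
        # right would worsen the balance (|bal| is unimodal).
        if i == n - 1 or abs(bal(i + 1, L + w[i])) >= abs(bal(i, L)):
            return i
        L += w[i]
        i += 1
        # Symmetric right-to-left step; weight left of j is total - R - w[j].
        if j == 0 or abs(bal(j - 1, total - R - w[j] - w[j - 1])) >= \
                abs(bal(j, total - R - w[j])):
            return j
        R += w[j]
        j -= 1
```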

%%%%%%%%%%%%%%%%%%
\begin{parts}
\part Argue that running this search procedure on an array of length $n$ takes time $\Theta(1 + \min\{k, n - k\})$,
where $k$ is the index of the optimal key.

\begin{solution}
Your solution goes here!
\end{solution}

%%%%%%%%%%%%%%%%%%
Since this is a recursive algorithm, you might initially expect that determining its worst-case runtime
would require solving a recurrence relation of some sort. We could do that, but the recurrence
is really messy because we have no idea where the optimum element is going to be:
\[
T(n) = \max_{1 \leq k \leq n} \Big\{T(k - 1) + T(n - k) + \Theta(1 + \min\{k, n - k\})\Big\}
\]

That's not a fun recurrence by any stretch of the imagination. However, using techniques from amortized
analysis, we can solve it cleanly and beautifully in a different way.

\part Prove that this algorithm's runtime is $O(n \log n)$. To do so, place credits on the keys in the array,
each of which is redeemable for $O(1)$ units of work, then show that you can pay for the entire algorithm
by spending those credits. Don't try solving the above recurrence; it's a losing game. \smiley{}

\begin{solution}
Your solution goes here!
\end{solution}

%%%%%%%%%%%%%%%%%%

Let's now see if we can speed this algorithm up a bit.

The above algorithm takes advantage of the fact that the weight disparity monotonically decreases and
then monotonically increases as you scan inward from either side. As a result, we don't need to do a linear
scan to find the optimal key---we can use a binary search instead! We'll probe a key, then look at it and
the key after it. If the split disparities are still decreasing at that point, we know the optimal key must be to
the right. Otherwise, it must be at or somewhere to the left of the probe. We can use this strategy to winnow
the range down to a single element, which must then be the optimal key.

For this binary search to work efficiently, we'll need a quick way of determining what the weights in a
given node's left and right subtree would be. Fortunately, it's possible to do $O(n)$ preprocessing before our
recursive algorithm so that we can query for the weight difference in time $O(1)$. (Do you see how?)
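One standard scheme---offered as a sketch, not as the only answer to the teaser above---is to precompute prefix sums, after which any subrange weight is a difference of two table entries:

```python
def make_weight_query(w):
    """O(n) preprocessing so that, for any subarray w[lo..hi] and any
    candidate root index i within it, the (right minus left) subtree
    weight difference can be computed in O(1)."""
    prefix = [0]              # prefix[i] = w[0] + ... + w[i-1]
    for x in w:
        prefix.append(prefix[-1] + x)

    def split_balance(i, lo=0, hi=None):
        if hi is None:
            hi = len(w) - 1
        left = prefix[i] - prefix[lo]           # weight of w[lo..i-1]
        right = prefix[hi + 1] - prefix[i + 1]  # weight of w[i+1..hi]
        return right - left

    return split_balance
```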

So now let's imagine that we follow the algorithm template and use binary search. Although it might seem
like this would improve over our linear scan, this algorithm turns out not to be any more efficient.

\part Prove that this algorithm runs in time $O(n \log n)$. Then show that there are arbitrarily large inputs
for which this algorithm runs in time $\Omega(n \log n)$.

\begin{solution}
Your solution goes here!
\end{solution}

%%%%%%%%%%%%%%%%%%

It's surprising that this algorithm isn't faster than before, given that we're using binary search versus linear
scans. The algorithm from part (ii) ends up being fast because the work done on each scan depends not on
the size of the overall array, but rather on the position of the optimal key within that array. The naive binary
search described above doesn't have this property; it might end up searching over the entire array regardless
of where the optimal key is. This raises a question: is there a way to modify binary search so that
the runtime depends not on the total size of the array, but just on the distance from the optimal key to the
closer of its two sides? The answer, amazingly, is yes.

Consider the following variation on binary search. Start from both edges of the array and work inward,
probing the keys $2^0 - 1, 2^1 - 1, 2^2 - 1, 2^3 - 1, \dots$ steps away from the sides of the array, until one of
those searches overshoots the optimal key. (You can tell that you've overshot by looking at the probe location
and the position one step past that location, moving away from the edge, and seeing if the split costs
are starting to increase.) Then, do a regular binary search of just the first $2^t$ elements on the appropriate
side of the array to find the optimal key, where $2^t$ is the probe size that overshot the optimum.
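In code, one side of this doubling search might look like the sketch below, where \texttt{past} stands for an assumed $O(1)$ predicate (built, say, from the prefix-sum preprocessing) reporting whether a position lies at or beyond the crossover:

```python
def exponential_search(past, n):
    """Find the smallest index i in [0, n) with past(i) == True,
    assuming past is monotone (False ... False True ... True) and
    past(n - 1) is True. Probes positions 2^0-1, 2^1-1, 2^2-1, ...
    from the left edge, then binary searches the bracketed range,
    so it makes O(1 + log i) calls to past."""
    if past(0):
        return 0
    t = 1
    while 2 ** t - 1 < n - 1 and not past(2 ** t - 1):
        t += 1
    lo = 2 ** (t - 1)            # past(2^(t-1) - 1) is known to be False
    hi = min(2 ** t - 1, n - 1)  # past(hi) is known to be True
    while lo < hi:               # standard binary search for first True
        mid = (lo + hi) // 2
        if past(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

Running this from both ends of the array at once, and stopping whichever side finishes first, gives the claimed bound.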

\part Argue that this search procedure takes time $\Theta(1 + \min\{\log k, \log(n - k)\})$, where $k$ is the index of
the optimal key.

\begin{solution}
Your solution goes here!
\end{solution}

%%%%%%%%%%%%%%%%%%
Our final algorithm is to use the recursive framework with this modified binary search instead of our
prior linear search. To see how fast this is, we could, as before, ``just'' solve the following recurrence:
\[
T(n) = \max_{1 \le k \le n}\Big\{ T(k-1) + T(n-k) + \Theta(1 + \min\{\log k, \log(n-k)\}) \Big\}
\]
Again, solving this directly would be messy, and not fun at all. But just like in part (ii) of this problem, we
don't actually need to solve this recurrence! We can use techniques from amortization instead.

\part Prove that this algorithm runs in time $O(n)$. There are several ways you could do this. We recommend
adapting your approach from part (ii) and changing where you place and spend credits. You
may find it easiest to envision spending fractional credits at various points. Alternatively, you
could try to solve the above recurrence directly. (We recommend the former approach!)

\begin{solution}
Your solution goes here!
\end{solution}

So there you have it---by running two parallel modified binary searches, we can build weight-balanced
trees in time $O(n)$!

\end{parts}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newpage
\Q{Problem Three: Meldable Heaps with Addition}
Meldable priority queues support the following operations:
\begin{itemize}
\item \texttt{new-pq()}, which constructs a new, empty priority queue;
\item $pq$.\texttt{insert(v, k)}, which inserts element $v$ with key $k$;
\item $pq$.\texttt{find-min()}, which returns an element with the least key;
\item $pq$.\texttt{extract-min()}, which removes and returns an element with the least key; and
\item \texttt{meld}($pq_1, pq_2)$, which destructively modifies priority queues $pq_1$ and $pq_2$ and produces a single priority
queue containing all the elements and keys from $pq_1$ and $pq_2$.
\end{itemize}

Some graph algorithms, such as the Chu-Liu-Edmonds algorithm for finding minimum spanning trees in
directed graphs (often called ``optimum branchings''), also require the following operation:
\begin{itemize}
\item $pq$.\texttt{add-to-all}($\Delta k$), which adds $\Delta k$ to the keys of each element in the priority queue.
\end{itemize}
Using lazy binomial heaps as a starting point, design a data structure that supports \texttt{new-pq}, \texttt{insert},
\texttt{find-min}, \texttt{meld}, and \texttt{add-to-all} in amortized time $O(1)$ and \texttt{extract-min} in amortized time $O(\log n)$.
Some hints:
\begin{enumerate}
\item You may find it useful, as a warmup, to get all these operations to run in time $O(\log n)$ by starting
with an \emph{eager} binomial heap and making appropriate modifications. You may end up using some
of the techniques you develop in your overall structure.
\item Try to make all operations have worst-case runtime $O(1)$ except for \texttt{extract-min}. Your implementation
of \texttt{extract-min} will probably do a lot of work, but if you've set it up correctly the
amortized cost will only be $O(\log n)$. This means, in particular, that you will only propagate the
$\Delta k$'s through the data structure in \texttt{extract-min}.
\item If you only propagate $\Delta k$'s during an \texttt{extract-min} as we suggest, you'll run into some challenges
trying to meld two lazy binomial heaps with different $\Delta k$'s. To address this, we recommend that
you change how \texttt{meld} is done to be even lazier than the lazy approach we discussed in class. You
might find it useful to construct a separate data structure tracking the melds that have been done
and then only actually combining together the heaps during an \texttt{extract-min}.
\item Depending on how you've set things up, to get the proper amortized time bound for \texttt{extract-min},
you may need to define a potential function both in terms of the structure of the lazy binomial
heaps and in terms of the auxiliary data structure hinted at by the previous point.
\end{enumerate}

As a courtesy to your TAs, please start off your writeup by giving a high-level overview of how your data
structure works before diving into the details.

\begin{solution}
Your solution goes here!
\end{solution}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\Q{Problem Four: Static Optimality in Practice}

This one is all coding! Make sure to submit your code on myth.

\end{questions}
\end{document}
