Problem Set 2


Due Thursday, April 23 at 1:00 pm Pacific


This assignment may be completed individually or with a partner.

This problem set is all about how to represent data structures using as few bits as possible. By the time you're done, you'll have a much better feel for how these data structures work and how to apply the techniques from lecture to novel problems.

$\LaTeX$ Template

Assignments are required to be typed. We've provided a $\LaTeX$ template for this assignment, though you're not required to use it.

🖋 PS2 $\LaTeX$ Template

Problem One: Can We Go Further?

Our succinct rank structure ended up using $n + o(n)$ bits of memory: $n$ bits for the original bitvector, which we need to keep around to do lookups for the Four Russians step, plus $o(n)$ auxiliary bits for additional tables. This is within a factor of $1 + o(1)$ of optimal, and in this problem you'll see why.

  1. Let $\mathbb{B} = \set{0, 1}\text.$ We denote by $\mathbb{B}^{\le b}$ the set of all sequences of at most $b$ bits. Now suppose you have a surjective function $f : \mathbb{B}^{\le b} \to S$ for some finite set $S\text.$ Prove that $b \ge \lg \left( \abs{S} + 1 \right) - 1\text.$

  2. Using your result from part (i), prove that any data structure that supports rank queries on arbitrary $n$-element bitvectors must use at least $n$ bits of space in the worst case. When counting up bits used by the data structure, we'll include both the bits from the data structure itself, plus any bits of the original bitvector that the data structure still needs access to. (Hint: Come up with a function $f$ that takes as input a rank structure for a bitvector $B$ and then outputs $B$ itself.)
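As a concrete instance of the definition in part (i) — and note that $\mathbb{B}^{\le b}$ includes the empty sequence $\varepsilon$ — consider $b = 2\text:$

```latex
\mathbb{B}^{\le 2} = \{\varepsilon,\; 0,\; 1,\; 00,\; 01,\; 10,\; 11\},
\qquad \abs{\mathbb{B}^{\le 2}} = 7 = 2^{3} - 1.
```

As a sanity check on the bound: if $\abs{S} = 7\text,$ the claimed inequality gives $b \ge \lg 8 - 1 = 2\text,$ which matches.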

Problem Two: Read-Only Arrays

You have a read-only array $A$ of length $n\text.$ Each entry $A[i]$ is a nonnegative integer. Let $m = \sum_{i = 0}^{n-1}{A[i]}\text.$ That is, $m$ is the sum of all the entries of $A\text.$ (For example, if $A = [2, 0, 3, 1]\text,$ then $n = 4$ and $m = 6\text.$)

Design a data structure that represents $A$ using only $m + n + o(m + n)$ total bits while giving constant-time read access to arbitrary elements of the array. Remember that the array is read-only, so you do not need to support updates to the array.

Problem Three: Implementing Succinct Rank

This problem involves writing C++ code. You can find the starter files at /usr/class/cs166/assignments/a2/. As usual, use cp -r to copy the files and /usr/class/cs166/bin/submit to submit them. Your code needs to compile without warnings (-Wall -Werror) and run cleanly under valgrind. And, of course, your code should be well-commented and so beautiful to behold that it brings tears to your eyes. Check the README for details about how to build and run everything.


Your task in this problem is to implement the succinct rank data structure we described in lecture. As a reminder, that data structure works as follows:

  • Split the input into blocks of size $\Theta(\log^2 n)\text,$ and further subdivide those blocks into miniblocks of size $\frac{1}{2} \log_2 n\text.$
  • Create a top-level array populated with the prefix sums at the start of each block.
  • Create a second-level array populated with the relative prefix sums of each miniblock within their enclosing blocks.
  • Create a Four Russians table populated with all possible answers to rank queries within all possible miniblocks.
  • Answer rank queries by summing together the appropriate entries from the top-level, second-level, and Four Russians tables.
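The steps above can be sketched in miniature. In the toy version below, the block and miniblock sizes are fixed small constants for readability — in the real structure they must be $\Theta(\log^2 n)$ and $\frac{1}{2} \log_2 n$ — and it uses plain `std::vector`s rather than `IntArray`. Every name in it (`ToyRank`, `blockPrefix_`, and so on) is hypothetical and not part of the starter code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy sketch of the three-table succinct rank structure. Block and miniblock
// sizes are small fixed constants here purely for readability.
class ToyRank {
public:
    explicit ToyRank(const std::vector<int>& bits) : bits_(bits) {
        const std::size_t n = bits.size();
        const std::size_t numBlocks = (n + kBlock - 1) / kBlock;
        const std::size_t numMinis  = (n + kMini  - 1) / kMini;

        blockPrefix_.assign(numBlocks, 0);
        miniPrefix_.assign(numMinis, 0);

        std::size_t total = 0;
        for (std::size_t b = 0; b < numBlocks; ++b) {
            blockPrefix_[b] = total;          // ones before block b
            std::size_t inBlock = 0;
            for (std::size_t m = b * (kBlock / kMini);
                 m < numMinis && m * kMini < (b + 1) * kBlock; ++m) {
                miniPrefix_[m] = inBlock;     // ones in this block before miniblock m
                for (std::size_t i = m * kMini; i < std::min(n, (m + 1) * kMini); ++i)
                    inBlock += bits[i];
            }
            total += inBlock;
        }

        // Four Russians table: for every possible kMini-bit pattern and every
        // offset, the number of 1s among the first `offset` bits of the pattern.
        fourRussians_.assign(std::size_t(1) << kMini,
                             std::vector<int>(kMini + 1, 0));
        for (std::size_t p = 0; p < (std::size_t(1) << kMini); ++p)
            for (std::size_t off = 1; off <= kMini; ++off)
                fourRussians_[p][off] =
                    fourRussians_[p][off - 1] + ((p >> (kMini - off)) & 1);
    }

    // rank(i) = number of 1 bits among positions [0, i); valid for i < n here.
    // (Handling a rank query over the entire array is one of the edge cases
    // the assignment asks you to think about.)
    std::size_t rank(std::size_t i) const {
        const std::size_t b = i / kBlock, m = i / kMini, off = i % kMini;
        std::size_t pattern = 0;              // the bits of miniblock m, MSB first
        for (std::size_t j = 0; j < kMini; ++j) {
            const std::size_t pos = m * kMini + j;
            pattern = (pattern << 1) | (pos < bits_.size() ? bits_[pos] : 0);
        }
        return blockPrefix_[b] + miniPrefix_[m] + fourRussians_[pattern][off];
    }

private:
    static constexpr std::size_t kBlock = 6, kMini = 3;   // toy sizes only
    std::vector<int> bits_;
    std::vector<std::size_t> blockPrefix_, miniPrefix_;
    std::vector<std::vector<int>> fourRussians_;
};
```

Note how the query adds exactly one entry from each of the three tables — that is what makes it $O(1)$ once the tables exist.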

This assignment will involve working with unusually-sized integers. To assist you with this, we've provided you with the following helper functions and types:

  • bitAt, integerAt, and writeIntegerAt, the functions you coded up in Problem Set 0. (Foreshadowing!) You'll almost certainly want to use bitAt and integerAt, but (for reasons detailed below) you shouldn't need to use writeIntegerAt.

  • IntArray is a type representing an array of integers of a fixed bit size (e.g. 3-bit integers, 18-bit integers, etc.). You can construct an IntArray using the syntax

    IntArray arr(BitCount(bitsPerInteger), numIntegers);
    

    and you can read/write individual integers using the array selection operator:

    arr[137] = value;
    uint64_t value = arr[166];
    

    Through the magic of C++, we've set up the syntax arr[index] = value to extract the correct number of bits from value to store in the array item at the indicated slot. Behind the scenes, all reads and writes to an IntArray use the integerAt and writeIntegerAt functions you coded up in Problem Set 0. This means that, strictly speaking, you don't need to use IntArray, but having coded this up both with and without IntArray, we highly recommend that you lean heavily on IntArray!

  • divideAndRoundUp(a, b), which computes $\ceil{\frac{a}{b}}\text.$ You'd be surprised how often you'll want to use this!

  • std::log2(n), though admittedly this is defined in <cmath> rather than something we wrote for you.
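For instance, divideAndRoundUp is exactly what you want when an input doesn't divide evenly into blocks. The stand-in definition below simply mirrors what the helper is documented to compute — the real one is already provided in the starter files:

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the provided divideAndRoundUp helper: the integer
// ceiling of a / b, computed without floating point.
std::size_t divideAndRoundUp(std::size_t a, std::size_t b) {
    return (a + b - 1) / b;
}
```

For example, a 100-bit input split into 3-bit miniblocks has divideAndRoundUp(100, 3) == 34 miniblocks, not 33 — the final, partial miniblock still needs its own table entry.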

There are a number of important implementation details you will need to address in the course of coding up your solution. Here's a quick sampler:

  • What happens if the number of bits in the input array isn't a clean multiple of $\frac{1}{2} \log_2 n\text?$
  • What happens if the number of bits in a block isn't a clean multiple of the number of bits in a miniblock?
  • What happens if the last block doesn't have the full number of miniblocks?
  • What happens on really small inputs?
  • What are the specific indices into each of the tables you need to look up?
  • What happens if you do a rank query over the entire bit array?
  • Exactly how many bits go into each number?

We will leave it up to you to decide how to answer these questions. You can approach this however you'd like as long as (1) your general strategy follows what's outlined above, (2) the overall structure uses $o\left(n\right)$ bits of auxiliary space beyond the input bit array, (3) your preprocessing time is $O\left(n \right)\text,$ and (4) your query time is $O(1)\text.$

While in principle you could code this up all in one go, we recommend breaking the task down into smaller, easily-tested milestones. Here is our recommendation, which you are not required to follow:

  1. As a warmup, implement the $\langle O(n \log n), O(1) \rangle$ precompute-all solution for binary ranking: create an IntArray holding the answers to all possible rank queries and return the precomputed answers when the structure is queried.

  2. Now, update your solution so that it uses the $\left \langle O \left( \frac{n \log n}{b} \right), O(b) \right \rangle$ solution for binary ranking with $b = \log^2 n\text:$ split the input into blocks of $\log^2 n$ bits each, precompute the prefix sums at the start of each block, and answer queries through a combination of table lookups and linear scans.

  3. Now, use a modified version of the two-layer structure from lecture. Split each block into miniblocks of size $\frac{1}{2} \log_2 n\text,$ precompute answers at the start of each miniblock (relative to the block they're contained in), and answer queries via a lookup in the top-level table, a lookup in your newly-created table, and a final linear scan.

  4. Finally, remove the remaining linear scan by constructing a Four Russians table for the miniblocks. You've just completed your implementation!
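To make milestone 1 concrete, here is a minimal sketch of the precompute-all warmup, using a plain std::vector of machine words as a stand-in for IntArray (in your actual submission you'd store the answers in an IntArray with an appropriate number of bits per entry; the class and member names here are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Milestone 1 sketch: precompute the answer to every possible rank query.
// Uses O(n log n) bits when the answers are stored in an IntArray.
class PrecomputedRank {
public:
    explicit PrecomputedRank(const std::vector<int>& bits)
        : prefix_(bits.size() + 1, 0) {
        // prefix_[i] = number of 1 bits among positions [0, i)
        for (std::size_t i = 0; i < bits.size(); ++i)
            prefix_[i + 1] = prefix_[i] + bits[i];
    }

    std::size_t rank(std::size_t i) const { return prefix_[i]; }

private:
    std::vector<std::size_t> prefix_;
};
```

Getting this version passing your tests first gives you a trusted reference implementation to compare the later milestones against.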

This problem will involve a decent amount of testing and debugging. Make judicious use of gdb and valgrind as you're developing your solution, along with appropriate debug printouts. Comment your code extensively to document your design decisions, to facilitate debugging, and to confirm you understand all the details. Decompose larger functions into smaller ones to reduce the surface area for mistakes. Add targeted test cases to validate that everything looks right. And have a sheet of paper handy to compare the actual values against the expected ones.

Your comments should also make your solution easy to follow: when we're reading it, we should be able to understand what you're doing and how you're doing it. Remember that you are writing first and foremost for humans. We're expecting your final submission to be lucid and easy to interpret, and a TA sitting down to grade your work should have no trouble getting up to speed with your code.

Problem Four: Compacting Fischer-Heun RMQ

The Fischer-Heun RMQ structure we covered in the first week of the quarter solved RMQ with $O(1)$ query time and $O(n)$ preprocessing time. As you saw last week when coding it up, it uses $O(n)$ words of memory. For very large arrays, this may not be feasible. In this problem, you'll see how to substantially reduce that space usage.

  1. The Fischer-Heun data structure makes use of a sparse table RMQ structure for its summary. Briefly explain why a sparse table constructed over an $n$-element array requires $O(n \log^2 n)$ bits of memory. (Remember that we're assuming RMQ structures output the index of the smallest element of each range, not the element itself.)

Before we look at the number of bits used by the full Fischer-Heun structure, let's make a couple of simplifications to the data structure to eliminate some inefficiencies. Specifically:

  • In lecture, we formed our summary array by writing down the value of the minimum element in each block. To reduce the number of bits needed, we'll instead have our summary array consist of the index of the minimum value in each block. It's still possible to construct the sparse table given this summary array, albeit with an extra level of indirection.

  • In lecture, we had each block in the Fischer-Heun structure store a pointer to its associated RMQ structure. To avoid using pointers, we'll instead do the following. We'll form an array of $4^b$ block-level RMQ structures, where $b = \frac{1}{4} \log_2 n$ is the block size, indexed by Cartesian tree number. We'll then form an array storing the Cartesian tree numbers of each block. To look up the block-level RMQ structure associated with a block, we read its Cartesian tree number, then use that as the index into the array of block-level RMQ structures.
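It may help to note how many block-level RMQ structures this array actually contains. With $b = \frac{1}{4} \log_2 n\text,$

```latex
4^{b} = 4^{\frac{1}{4} \log_2 n} = 2^{\frac{1}{2} \log_2 n} = \sqrt{n},
```

so the array of block-level RMQ structures has only $\sqrt{n}$ entries, even though there are $\Theta\left(\frac{n}{b}\right)$ blocks.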

This means that the cleaned-up Fischer-Heun structure will consist of

  • a summary array $S$ consisting of the indices of the minimum element of each block;
  • a sparse table $T$ constructed on top of the array $S\text;$
  • an array $C$ of the Cartesian tree numbers of each block; and
  • an array $B$ of all possible block-level RMQ structures, keyed by Cartesian tree number.

These changes do not change the asymptotic query or preprocessing time for Fischer-Heun.

  2. Given the above details of the Fischer-Heun representation, what is the space complexity, in bits, of a Fischer-Heun structure? Give your answer using big-O notation. You can ignore the space for the original array of elements itself. Address each of the four above pieces in your answer.

  3. Using the techniques we covered in lecture, show how to improve this space usage to $2n + o(n)$ bits (plus space for the original array of elements) while preserving the $O(n)$ preprocessing time and $O(1)$ query time of the original Fischer-Heun structure. As a hint, look at how succinct rank and succinct select each managed to get their space usage down to $n + o(n)$ while using the Method of Four Russians. You'll need to do a little bit of surgery on the Fischer-Heun structure to accomplish this, but it should not require a radical rethink of how to solve RMQ.

    To make it easier to present your resulting structure, please do the following:

    • Give an inventory of the various arrays, tables, etc. in your final structure along the lines of the bulleted list above of the four components of the "vanilla" Fischer-Heun structure. You may find it helpful to also draw a schematic representation of the data structure.

    • Along the lines of what you did in part (ii) of this problem, work out the number of bits required by each part of the data structure you outlined.

    • List all the steps required during preprocessing, and account for the time required to complete each step.

    • Briefly describe all the steps required to answer a query, and account for the time required to complete each step.

Surprisingly, this is not the end of the story for RMQ. It's possible to reduce the space usage of the Fischer-Heun structure to $\frac{2n}{t} + o(n)$ bits with $O(t)$ query time for any $t = O(1)\text,$ assuming you have access to the original array of values. Even more surprisingly, it's possible to reduce the space usage to $2n + o(n)$ without needing to store the original array. Check out the paper "Improved Range Minimum Queries" by Ferrada and Navarro for details. Moreover, this is the best possible space complexity assuming queries run in time $O(1)\text;$ check out "Lower Bound for Succinct Range Minimum Query" by Liu and Yu for details.