Problem Set 2

Due Thursday, April 24 at 1:00 pm Pacific

Solutions are available!

🔑 Solutions

This assignment may be completed individually or with a partner.

This problem set is all about how to represent data structures using as few bits as possible. By the time you're done, you'll have a much better feel for how these data structures work and how to apply the techniques from lecture to novel problems.

$\LaTeX$ Template

Assignments are required to be typed. We've provided a $\LaTeX$ template for this assignment, though you're not required to use it.

🖋 PS2 $\LaTeX$ Template

Problem One: Why Ignore the Bitvector?

Our succinct rank structure ended up using $O \left(\frac{n \log \log n}{\log n}\right)$ bits of memory on top of the original $n$ bits needed to store the $n$-bit input array. You might be wondering why we didn't count the $n$ bits of the input as part of the total, since the succinct rank structure needs access to it to compute rank queries. This problem explores why.

Suppose you want to design a data structure that stores an arbitrary $n$-bit array. Prove that your data structure must use at least $n$ bits of memory in the worst case, regardless of how clever you are with your implementation. (Hint: Use the pigeonhole principle.)

Using your result from part (i), prove that any data structure that supports rank queries on arbitrary $n$-element bitvectors must use $n$ bits of space in the worst case. When counting up bits used by the data structure, we'll include both the bits from the data structure itself, plus any bits of the original bitvector that the data structure still needs access to. (Hint: Show how to reconstruct an $n$-element bitvector using a rank query structure over that bitvector.)

Because of your result from part (ii), any data structure for rank will always have to use $n$ bits of storage in the worst case. We excluded the $n$ bits of the input from our space calculations by counting those as the $n$ bits from the lower bound, focusing instead on the additional space needed beyond that.

Problem Two: On Powers of Logarithms

In our lecture on succinct select, we devised a hybrid solution to succinct select that used space $O(\frac{n \log \log n}{\log n})$ and answered queries in time $O(\log \log n)\text.$ This data structure involved two parameters: our chunk size $c$ and our threshold size $t\text.$ We ultimately chose these to be $c = \log^2 n$ and $t = \log^4 n\text.$

Suppose we use the exact same data structure, except that we choose $c = \log^5 n$ and $t = \log^{13} n\text.$ What will happen to the space usage and query time of the data structure?

And now, a purely optional problem you don't need to submit for credit: the final version of the succinct select data structure we came up with ended up using space $O \left( \frac{n \log \log n}{\sqrt{\log n}} \right)\text.$ Is it possible to improve this space usage by choosing $c\text,$ $t\text,$ $c'\text,$ and $t'$ more intelligently? If so, let us know how!

Problem Three: Implementing Succinct Rank

This problem involves writing C++ code. You can find the starter files at /usr/class/cs166/assignments/a2/. As usual, use cp -r to copy the files and /usr/class/cs166/bin/submit to submit them. Your code needs to compile without warnings (-Wall -Werror) and run cleanly under valgrind. And, of course, your code should be well-commented and so beautiful to behold that it brings tears to your eyes. Check the README for details about how to build and run everything.

Your task in this problem is to implement the succinct rank data structure we described in lecture. As a reminder, that data structure works as follows:

Split the input into blocks of size $\Theta(\log^2 n)\text,$ and further subdivide those blocks into miniblocks of size $\frac{1}{2} \log_2 n\text.$
Create a top-level array populated with the prefix sums at the start of each block.
Create a second-level array populated with the relative prefix sums of each miniblock within their enclosing blocks.
Create a Four Russians table populated with all possible answers to rank queries within all possible miniblocks.
Answer rank queries by summing together the appropriate entries from the top-level, second-level, and Four Russians tables.

This assignment will involve working with unusually-sized integers. To assist you with this, we've provided you with the following helper functions and types:

bitAt, integerAt, and writeIntegerAt, the functions you coded up in Assignment 0. (Foreshadowing!) You'll almost certainly want to use bitAt and integerAt, but (for reasons detailed below) you shouldn't need to use writeIntegerAt.
IntArray is a type representing an array of integers of a fixed bit size (e.g. 3-bit integers, 18-bit integers, etc.). You can construct an IntArray using the syntax
```
IntArray arr(BitCount(bitsPerInteger), numIntegers);
```
and you can read/write individual integers using the array selection operator:
```
arr[137] = value;
uint64_t value = arr[166];
```
Through the magic of C++, we've set up the syntax arr[index] = value to extract the correct number of bits from value to store in the array item at the indicated slot. Behind the scenes, all reads and writes to an IntArray use the integerAt and writeIntegerAt functions you coded up in Problem Set 0. This means that, strictly speaking, you don't need to use IntArray, but having coding this up both and without IntArray we highly recommend that you lean heavily on IntArray!
divideAndRoundUp(a, b), which computes $\ceil{\frac{a}{b}}\text.$ You'd be surprised how often you'll want to use this!
std::log2(n), though admittedly this is defined in <cmath> rather than something we wrote for you.

There are a number of important implementation details you will need to address in the course of coding up your solution. Here's a quick sampler:

What happens if the number of bits in the input array isn't a clean multiple of $\frac{1}{2} \log_2 n\text?$
What happens if the number of bits in a block isn't a clean multiple of the number of bits in a miniblock?
What happens if the last block doesn't have the full number of miniblocks?
What happens on really small inputs?
What are the specific indices into each of the tables you need to look up?
What happens if you do a rank query over the entire bit array?
Exactly how many bits go into each number?

We will leave it up to you to decide how to answer these questions. You can approach this however you'd like as long as (1) your general strategy follows what's outlined above, (2) the overall structure uses $O\left(\frac{n \log \log n}{\log n} \right)$ bits of auxiliary space beyond the input bit array, (3) your preprocessing time is $O\left(n \right)\text,$ and (4) your query time is $O(1)\text.$

While in principle you could code this up all in one go, we recommend breaking the task down into smaller, easily-tested milestones. Here is our recommendation, which you are not required to follow:

As a warmup, implement the $\langle O(n \log n), O(1) \rangle$ precompute-all solution for binary ranking: create an IntArray holding the answers to all possible rank queries and return the precomputed answers when the structure is queried.
Now, update your solution so that it uses the $\left \langle O \left( \frac{n \log n}{b} \right), O(b) \right \rangle$ solution for binary ranking with $b = \log^2 n\text:$ split the input into blocks of $\log^2 n$ bits each, precompute the prefix sums at the start of each block, and answer queries through a combination of table lookups and linear scans.
Now, use a modified version of the two-layer structure from lecture. Split each block into miniblocks of size $\frac{1}{2} \log_2 n\text,$ precompute answers at the start of each miniblock (relative to the block they're contained in), and answer queries via a lookup in the top-level table, a lookup in your newly-created table, and a final linear scan.
Finally, remove the remaining linear scan by constructing a Four Russians table for the miniblocks. You've just completed your implementation!

This problem will involve a decent amount of testing and debugging. Make judicious use of gdb and valgrind as you're developing your solution, along with appropriate debug printouts. Comment your code extensively to document your design decisions, to facilitate debugging, and to confirm you understand all the details. Decompose larger functions into smaller ones to reduce the surface area for mistakes. Add targeted test cases to validate that everything looks right. And have a sheet of paper handy to compare the actual values against the expected ones.

Problem Four: Compacting Fischer-Heun RMQ

The Fischer-Heun RMQ structure we covered in the first week of the quarter solved RMQ with $O(1)$ query time and $O(n)$ preprocessing time. As you saw last week when coding it up, it uses $O(n)$ words of memory. For very large arrays, this may not be feasible. In this problem, you'll see how to substantially reduce that space usage.

The Fischer-Heun data structure makes use of a sparse table RMQ structure for its summary. Briefly explain why a sparse table constructed over an $n$-element array requires $O(n \log^2 n)$ bits of memory. (Remember that we're assuming RMQ structures output the index of the smallest element of each range, not the element itself.)

Before we look at the number of bits used by the full Fischer-Heun structure, let's make a couple of simplifications to the data structure to eliminate some inefficiencies. Specifically:

In lecture, we formed our summary array by writing down the value of the minimum element in each block. To reduce the number of bits needed, we'll instead have our summary array consist of the index of the minimum value in each block. It's still possible to construct the sparse table given this summary array, albeit with an extra level of indirection.
In lecture, we had each block in the Fischer-Heun structure store a pointer to its associated RMQ structure. To avoid using pointers, we'll instead do the following. We'll form an array of $4^b$ block-level RMQ structures, where $b = \frac{1}{4} \log_2 n$ is the block size, indexed by Cartesian tree number. We'll then form an array storing the Cartesian tree numbers of each block. To look up the block-level RMQ structure associated with a block, we read its Cartesian tree number, then use that as the index into the array of block-level RMQ structures.

This means that the cleaned-up Fischer-Heun structure will consist of

a summary array $S$ consisting of the indices of the minimum element of each block;
a sparse table $T$ constructed on top of the array $S\text;$
an array $C$ of the Cartesian tree numbers of each block; and
an array $B$ of all possible block-level RMQ structures, keyed by Cartesian tree number.

These changes do not change the asymptotic query or preprocessing time for Fischer-Heun.

Given the above details of the Fischer-Heun representation, what is the space complexity, in bits, of a Fischer-Heun structure? Give your answer using big-O notation. Address each of the four above pieces in your answer.

Using the techniques we covered in lecture, show how to improve this space usage to $2n + o(n)$ bits while preserving the $O(n)$ preprocessing time and $O(1)$ query time of the original Fischer-Heun structure. As a hint, look at how succinct rank and succinct select each managed to get their space usage down below $o(n)$ while using the Method of Four Russians. You'll need to do a little bit of surgery on the Fischer-Heun structure to accomplish this, but it should not require a radical rethink of how to solve RMQ.

To make it easier to present your following structure, please do the following:
- Give an inventory of the various arrays, tables, etc. in your final structure along the lines of the bulleted list above of the four components of the "vanilla" Fischer-Heun structure. You may find it helpful to also draw a schematic representation of the data structure.
- Along the lines of what you did in part (ii) of this problem, work out the number of bits required by each part of the data structure you outlined.
- List all the steps required during preprocessing, and account for the time required to complete each step.
- Briefly describe all the steps required to answer a query, and account for the time required to complete each step.

Surprisingly, this is not the end of the story for RMQ. It's possible to reduce the space usage of the Fischer-Heun structure to $\frac{2n}{t} + o(n)$ bits with $O(t)$ query time for any $t = O(1)\text,$ assuming you have access to the original array of values. Even more surprisingly, it's possible to reduce the space usage to $2n + o(n)$ without needing to store the original array. Check out the paper "Improved Range Minimum Queries" by Ferrada and Navarro for details. Moreover, this is the best possible space complexity assuming queries run in time $O(1)\text;$ check out "Lower Bound for Succinct Range Minimum Query" by Liu and Yu for details.