Problem Set 0


Due Tuesday, April 8 at 1:00 pm Pacific


Solutions are available!

🔑 Solutions


This assignment must be completed individually; you must not work with a partner.

This first assignment is designed as a refresher of the concepts that you’ve seen in the prerequisite courses leading into CS166 (namely, CS103, CS107, CS109, and CS161). We hope that you find these problems interesting in their own right. Once you’ve finished, you should be in tip-top coding and theory shape and ready to take on the new concepts from this quarter.

We are happy to help out if you need a nudge or refresher – feel free to post on EdStem.

$\LaTeX$ Template

Assignments are required to be typed. We've provided a $\LaTeX$ template for this assignment, though you're not required to use it.

🖋 PS0 $\LaTeX$ Template

Problem One: Course Logistics Check-In

Before starting this assignment, please take a few minutes to fill out the following Google Form, which goes over course logistics and is designed to help things run as smoothly as possible this quarter:

CS166 Course Logistics Form

Once you've submitted the form, you'll be given a completion code. Include that completion code as your answer to Problem One when submitting your written assignment to receive credit for this problem. Thanks!

Problem Two: Bootstrapping a Recurrence

In CS161, you learned that the mergesort algorithm has a runtime that satisfies the following recurrence relation:

\[T(0) = 0 \qquad T(n) = 2T\left(\left\lfloor\frac{n}{2}\right\rfloor\right) + n\]

It's a classic exercise to prove that this recurrence solves to $O(n \log n)\text,$ and we assume you're familiar with one way of proving this. But in this problem, we'd like you to prove this result using a nontraditional argument called recurrence bootstrapping that we'll make use of a few times this quarter.

  1. Prove by induction that for all natural numbers $n \ge 0$ and $k \ge 0$ we have $T(n) \le \frac{n^2}{2^k} + nk\text.$ Feel free to assume that $T(n) \le n^2$ for all $n \in \mathbb{N}\text.$

  2. Using your result from part (i), and without using the Master Theorem, prove that $T(n) = O(n \log n)\text.$

Problem Three: Probability and Concentration Inequalities

The analysis of randomized data structures sometimes involves working with sums of random variables. Our goal will often be to get a tight bound on those sums, usually to show that some runtime is likely to be low or that some estimate is likely to be good. If we only know two pieces of information about those random variables – what their expected values are and that they’re nonnegative – we can get some information about how their sums behave.

  1. Let $X_1, X_2, \ldots, X_n$ be a collection of $n$ nonnegative random variables such that $\E[X_i] = 1$ for each variable $X_i\text.$ (Note that these random variables might not be independent of one another.) Prove that $\Pr[\sum_{i=1}^n{X_i} \ge 2n] \le \frac{1}{2}\text.$ You may want to use Markov’s inequality.
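As a reminder, Markov's inequality says that if $X$ is a nonnegative random variable and $a \gt 0\text,$ then

\[\Pr[X \ge a] \le \frac{\E[X]}{a}\text.\]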

Sometimes you’ll find that the sort of bound you get from an analysis like part (i) isn’t strong enough to prove what you need to prove. In those cases, you might want to start looking more at the spread of each individual random variable. If, for example, you know the variances of those variables are small, you might be able to get a tighter bound.

  2. Let $X_1, X_2, \ldots, X_n$ be a collection of $n$ nonnegative random variables. As in part (i), you know that $\E[X_i] = 1$ for each variable $X_i\text.$ But now suppose you know two other facts. First, you know that each variable has unit variance (the fancy way of saying $\Var[X_i] = 1$ for each variable $X_i\text{).}$ Second, while you don’t know for certain whether these variables are independent of one another, you know that they’re pairwise uncorrelated. That is, you know that $X_i$ and $X_j$ are uncorrelated random variables for any $i \ne j\text.$ Under these assumptions, prove that $\Pr[\sum_{i=1}^n{X_i} \ge 2n] \le \frac{1}{n}\text.$ You may want to use Chebyshev’s inequality.
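As a reminder, Chebyshev's inequality says that if $X$ is a random variable with finite variance and $a \gt 0\text,$ then

\[\Pr[\,|X - \E[X]| \ge a\,] \le \frac{\Var[X]}{a^2}\text.\]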

The analysis in part (ii) only works if the variables are pairwise uncorrelated.

  3. Give us a choice of a natural number $n \gt 0$ and $n$ random variables $X_1, X_2, \ldots, X_n$ such that

    • each $X_i$ is nonnegative,
    • $\E[X_i] = 1$ for each variable $X_i\text,$
    • $\Var[X_i] = 1$ for each variable $X_i\text,$ but
    • $\Pr[\sum_{i=1}^n{X_i} \ge 2n] \gt \frac{1}{n}\text.$

    Then, prove that your random variables have these properties. To clarify, your construction doesn’t need to work for all choices of $n\text;$ you should pick whichever $n \gt 0$ is easiest to work with.

As you saw in this problem, learning more about the distribution of random variables makes it easier to provide tighter bounds on their sums, and correlations across those variables make it harder. This is a good intuition to have for later in the quarter, when we’ll discuss how different assumptions on hash functions lead to different analyses of data structures.

Problem Four: Evaluating NAND Trees

A portion of this problem involves writing C++ code. Download the starter files on myth from

/usr/class/cs166/assignments/a0

and implement the functions in the NANDTree.cpp source file. (That folder just houses the raw files; there’s no git repository there, so use cp -r rather than git clone to copy the files.) Test your implementation extensively. You may want to use our provided test harness as a starting point. Feel free to add your own tests on top of ours.

Use make to build the files. Run the tests by running ./run-tests.

To receive credit on the coding component, your code should compile with no warnings (e.g. compiled with -Wall -Werror) and should run cleanly under valgrind (that is, you should have no memory errors or memory leaks). We’ll run your code on the myth machines, so we recommend you test there before submitting.


A NAND tree is a complete binary tree (each node has either no children or two children, and each layer of the tree is completely filled in) with the following properties:

  • Each leaf node is labeled either 0 or 1.
  • All internal nodes are NAND gates. A NAND gate is a logic gate that takes in two inputs and evaluates to 0 if both its inputs are 1 and to 1 if either input is 0.

We can evaluate a NAND tree by computing the value of the top-level NAND gate in the tree, which will evaluate either to 0 or to 1. (If the tree is a single leaf, the tree evaluates to the value of that leaf.) For example, the left and right trees below evaluate to 1; the middle tree evaluates to 0:

[Figure: three example NAND trees; the left and right trees evaluate to 1, the middle tree to 0.]

Here is a simple recursive algorithm for evaluating a NAND tree:

  • If the tree is a single leaf node, return the value of that node.
  • Otherwise, recursively evaluate the left and right subtrees, then apply the NAND operator to both of those values.

This algorithm takes $\Theta(n)$ time to evaluate a NAND tree with $n$ leaf nodes (do you see why?).
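Rendered as code, the naive algorithm might look like the sketch below. The `Node` struct here is hypothetical; the actual `NANDTree` type is defined in the starter code, so treat this as an illustration rather than the assignment's API.

```cpp
// Hypothetical node layout for illustration; the real NANDTree type
// lives in the starter code.
struct Node {
    int value;    // 0 or 1; meaningful only at leaves
    Node* left;   // nullptr at a leaf
    Node* right;  // nullptr at a leaf
};

// Naive evaluation: visits every node, so Theta(n) for n leaves.
int evaluateNaive(const Node* tree) {
    if (tree->left == nullptr) return tree->value;  // leaf
    int lhs = evaluateNaive(tree->left);
    int rhs = evaluateNaive(tree->right);
    return (lhs == 1 && rhs == 1) ? 0 : 1;          // NAND
}
```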

We can improve this algorithm using short-circuiting. The NAND operator has the following nice property: $0 \text{ NAND } x = 1$ for any choice of $x\text.$ Specifically, $0 \text{ NAND } 0 = 1\text,$ and $0 \text{ NAND } 1 = 1\text.$ This means that, as we’re evaluating subtrees, if we find that a node’s left subtree evaluates to 0, we don’t need to evaluate the right subtree. We can just return that the whole tree evaluates to 1. This gives the following algorithm, which we'll call the left-first algorithm:

  • If the tree is a single leaf node, return the value of that node.
  • Otherwise:
    • Recursively evaluate the left subtree.
    • If it evaluates to 0, return 1.
    • Otherwise, recursively evaluate the right subtree.
    • If it evaluates to 0, return 1; otherwise return 0.

In many cases, the left-first algorithm runs faster than the $\Theta(n)$-time naive algorithm. However, it is possible to construct NAND trees for which the left-first algorithm runs in time $\Theta(n)\text,$ no better than the naive approach.
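Rendered as code, the left-first algorithm might look like the sketch below (again, the `Node` struct is a hypothetical stand-in for the starter code's `NANDTree` type):

```cpp
// Hypothetical node layout for illustration; the real NANDTree type
// lives in the starter code.
struct Node {
    int value;    // 0 or 1; meaningful only at leaves
    Node* left;   // nullptr at a leaf
    Node* right;  // nullptr at a leaf
};

// Left-first evaluation with short-circuiting: if the left subtree
// evaluates to 0, the right subtree is never visited.
int evaluateLeftFirst(const Node* tree) {
    if (tree->left == nullptr) return tree->value;      // leaf
    if (evaluateLeftFirst(tree->left) == 0) return 1;   // 0 NAND x = 1
    return (evaluateLeftFirst(tree->right) == 0) ? 1 : 0;
}
```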

  1. Implement a function

    NANDTree* badLeftFirstTreeFor(size_t numLeaves);
    

    that takes in a number of leaves, then returns a NAND tree $T$ with that many leaves such that the left-first algorithm never short-circuits when evaluating that tree.

    Some notes on this problem:

    • Your algorithm should run in time $O(n)\text,$ where $n$ is the number of leaves in the tree.
    • You can assume that numLeaves is a perfect power of two and do not need to handle the case where this isn’t true.
    • Since we’re in C++ Land, please use new to allocate nodes rather than malloc.
    • We are expecting your solution to have beautiful coding style. For example, we are expecting you to comment your code, to choose descriptive variable names, to break large functions apart into simpler, smaller helpers, etc.
    • We’ve provided some test cases you can use to check your work, but these test cases are not exhaustive. Feel free to add your own tests in as well.

Since the left-first algorithm never short-circuits on inputs produced by your algorithm, the left-first algorithm has a worst-case runtime of $\Theta(n)\text.$ More generally, any deterministic algorithm for evaluating a NAND tree that works by starting at the top of the tree and recursively evaluating the left and right subtrees will have at least one input that causes it to run in $\Theta(n)$ time, but you don't need to prove this.

Despite the $\Theta(n)$ worst-case for deterministic evaluation algorithms, there is a simple randomized algorithm for evaluating NAND trees that, in expectation, does $o(n)$ work (note that we’re using little-o notation here). The idea is simple: use the same algorithm as above, but rather than always starting with the left subtree, instead choose which subtree to evaluate first uniformly at random. We'll call this the random-first algorithm. More concretely:

  • If the tree is a single leaf node, return the value of that node.
  • Otherwise:
    • Choose one of the subtrees of the root uniformly at random and evaluate it.
    • If the value is 0, return 1.
    • Otherwise, recursively evaluate the other subtree.
    • If the value is 0, return 1; otherwise return 0.
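The random-first algorithm might be sketched as follows; the `Node` struct is a hypothetical stand-in for the starter code's `NANDTree` type, and the choice of `std::mt19937` as the randomness source is just one reasonable option.

```cpp
#include <random>
#include <utility>

// Hypothetical node layout for illustration; the real NANDTree type
// lives in the starter code.
struct Node {
    int value;    // 0 or 1; meaningful only at leaves
    Node* left;   // nullptr at a leaf
    Node* right;  // nullptr at a leaf
};

// Random-first evaluation: pick which subtree to try first uniformly
// at random, short-circuiting if it evaluates to 0.
int evaluateRandomFirst(const Node* tree, std::mt19937& rng) {
    if (tree->left == nullptr) return tree->value;       // leaf
    const Node* first  = tree->left;
    const Node* second = tree->right;
    if (std::bernoulli_distribution(0.5)(rng)) std::swap(first, second);
    if (evaluateRandomFirst(first, rng) == 0) return 1;  // 0 NAND x = 1
    return (evaluateRandomFirst(second, rng) == 0) ? 1 : 0;
}
```

Whichever subtree is tried first, the returned value is the same; only the number of nodes visited is random.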

To determine the runtime of the random-first algorithm, we will introduce two recurrence relations. Let $T_0(n)$ be the expected number of nodes visited by the random-first algorithm on a tree with $n$ leaf nodes, assuming the root evaluates to 0. Let $T_1(n)$ be the expected number of nodes visited by the random-first algorithm on a tree with $n$ leaf nodes, assuming the root evaluates to 1.

  2. Briefly explain why the following recurrences for $T_0(n)$ and $T_1(n)$ are correct:

    \[\begin{aligned} T_0(1) &= 1, &\qquad T_0(n) &\le 2T_1\left(\frac{n}{2}\right) + 1 \\ T_1(1) &= 1, &\qquad T_1(n) &\le T_0\left(\frac{n}{2}\right) + \frac{1}{2}T_1\left(\frac{n}{2}\right) + 1 \end{aligned}\]

    Your answer should be roughly 100 words or fewer.

  3. Using the Master Theorem, prove that $T_0(n) = O\left(n^\varepsilon\right)$ for some $\varepsilon \lt 1\text.$ As a hint, rewrite $T_0(n)$ in terms of itself. Feel free to assume that $n$ is a perfect power of four and that $T_1(n) \le T_0(n)$ for all $n \in \mathbb{N}\text.$

Your result from (iii) proves that the random-first algorithm has expected sublinear runtime on all inputs, since $O(n^\varepsilon) = o(n)\text.$ This is an example of a problem where the best randomized algorithm is more efficient, in expectation, than the best deterministic algorithm in the worst case. (At least, among algorithms that work by starting at the top of the tree and recursively walking down!)

Problem Five: Unusually-Sized Integers

Some of the data structures we’ll explore this quarter involve working with arrays of integers of atypical sizes (say, an array of 10-bit numbers, or an array of 19-bit numbers). From a Theoryland perspective, this is something where we can say “a well-dressed, intelligent, attractive programmer surely could put something like this together such that each array read or write takes time $O(1)\text.$” And since you yourself are a well-dressed, intelligent, attractive programmer, we’d like you to show us exactly how this is done.

When working with arrays of bits, assume that bit 0 of a byte is its least-significant bit (the $2^0$ bit), that bit $1$ of a byte is the second-least-significant-bit (the $2^1$ bit), etc. You should also assume numbers are laid out in little-endian order, which is how the myth machines store things. For full credit, your solution should compile cleanly without warnings (-Wall -Werror) and should run cleanly under valgrind.
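For example, under this convention bit index 13 of a byte sequence lands in byte $13 / 8 = 1$ at position $13 \bmod 8 = 5$ within that byte. The tiny helpers below just restate that arithmetic; their names are hypothetical and not part of the assignment's API.

```cpp
#include <cstdint>
#include <cstddef>

// Illustration of the bit-numbering convention only; these helper
// names are hypothetical and not part of the assignment's API.
std::size_t byteIndexOf(std::uint64_t bitIndex) {
    return bitIndex / 8;
}
unsigned bitWithinByte(std::uint64_t bitIndex) {
    return static_cast<unsigned>(bitIndex % 8);
}

// Bit k of a byte b is its 2^k place, so it can be read as (b >> k) & 1.
int bitOfByte(std::uint8_t b, unsigned k) {
    return (b >> k) & 1;
}
```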

  1. As a warm-up, implement a function

    int bitAt(const void* bitSequence, uint64_t bitIndex);
    

    that works as follows. This function takes as input a pointer to the start of a series of bits and an index (measured in bits) into that sequence, then returns the value of the bit at that index. Your function should throw an exception of type invalid_argument if bitSequence is a null pointer. (If you’re not familiar with C++ exceptions, you can write code to the effect of

      throw invalid_argument("Polite error message here");
    

    to immediately abort the current function and report an error.)

  2. Next, implement a function

    uint64_t integerAt(const void* bitSequence,
                       uint64_t bitIndex,      
                       uint8_t numBits);       
    

    that works as follows. You’re given a pointer somewhere in memory, and an index, measured in bits (not bytes), into that sequence. You’re also given a number of bits to extract. Your function should then read an integer of numBits bits starting at that index, and return the result as a uint64_t. You should throw an invalid_argument exception if bitSequence is null (there’s no sequence of bits to read from) or $\mathtt{numBits} \gt 64$ (you can’t fit that many bits into a uint64_t).

    Your implementation of integerAt should not attempt to build up the result number one bit or one byte at a time. If you were to approach the implementation this way, then if we were to port your code to a machine with a larger integer size (say, a hypothetical 2,048-bit machine), the runtime would not be $O(1)$ with respect to the machine word size. Instead, your implementation should work by reading and writing entire uint64_t’s from the underlying array of bits, then using bitwise operators to isolate the indicated number of bits from the sequence. (You can assume that the void* points to an allocated block of memory whose size is a multiple of 64 bits, so when reading 64 bits at a time you’ll never read off the end of the array.)

  3. Finally, implement a function

    void writeIntegerAt(void* bitSequence,    
                        uint64_t bitIndex,    
                        uint8_t numBits,      
                        uint64_t bitsToWrite);
    

    This function takes in a pointer to a sequence of bits, an index (measured in bits) within that bit sequence, and a number of bits to write. The function should then take the final parameter and write its lowest numBits bits to the specified position within the bit sequence. As with integerAt, your function should read and write entire uint64_t’s rather than working one bit or byte at a time; this ensures the runtime is $O(1)$ even if the word size is increased. And, as above, throw an invalid_argument exception if bitSequence is null or if numBits is bigger than 64.

Submission Instructions

You will submit the written and coding components of this assignment separately. Specifically:

  • Submit a PDF containing your typed solutions to the written component on Gradescope.
  • Submit the coding component on myth by running

      /usr/class/cs166/bin/submit
    

    and following the instructions given at the prompt.

And that’s it! You’re done!