Section 6. Dynamic Arrays and Hashing


Section materials curated by Neel Kishnani, drawing upon materials from previous quarters.

This week’s section exercises explore the ins and outs of content from weeks 6 and 7, focusing on class design and hash functions!

Each week, we will also be releasing a Qt Creator project containing starter code and testing infrastructure for that week's section problems. When a problem name is followed by the name of a .cpp file, that means you can practice writing the code for that problem in the named file of the Qt Creator project. Here is the zip of the section starter code:

📦 Starter code

Dynamic Arrays

Problem One: Revisiting Our Stack

In class, we implemented the OurStack class, which represents a stack of integers. The class definition is shown here:

class OurStack {
public:
    OurStack();
    ~OurStack();

    void push(int value);
    int pop();
    int peek() const;

    int size() const;
    bool isEmpty() const;

private:
    void grow();

    int* elems;
    int logicalSize;
    int allocatedSize;
};

Here’s the .cpp file:

#include "OurStack.h"
#include "error.h"
const int kInitialSize = 4;

OurStack::OurStack() {
    elems = new int[kInitialSize];
    logicalSize = 0;
    allocatedSize = kInitialSize;
}

OurStack::~OurStack() {
    delete[] elems;
}

void OurStack::push(int value) {
    if (logicalSize == allocatedSize) {
        grow();
    }
    elems[logicalSize] = value;
    logicalSize++;
}

int OurStack::peek() const {
    if (isEmpty()) {
        error("What is the sound of one hand clapping?");
    }
    return elems[logicalSize - 1];
}

int OurStack::pop() {
    int result = peek();
    logicalSize--;
    return result;
}

int OurStack::size() const {
    return logicalSize;
}

bool OurStack::isEmpty() const {
    return size() == 0;
}

void OurStack::grow() {
    int* newArr = new int[2 * allocatedSize];
    for (int i = 0; i < size(); i++) {
        newArr[i] = elems[i];
    }
    delete[] elems;
    elems = newArr;
    allocatedSize *= 2;
}

This problem consists of some questions about the OurStack type.

  1. What is the meaning of the term “logical size”? How does it compare to the term “allocated size”? What’s the relationship between the two?

    The logical size of the stack is the actual number of elements stored in the stack, whereas the allocated size is the length of the elems array. The logical size can be any number between 0 and allocatedSize, inclusive, depending on how many pushes and pops have been made; when the two are equal, the next push triggers a grow.


  2. What is the OurStack() function? Why is each line in that function necessary?

    This is the constructor. It’s responsible for setting all the variables in the OurStack class to a reasonable set of defaults. The lines here set up the actual memory where the elements will be stored, configure the initial logical length (0 elements stored), and set the initial allocated length (the default initial size given by the constant kInitialSize).


  3. What is the ~OurStack() function? Why is each line in that function necessary? Why isn’t there any code in there involving the logicalSize or allocatedSize data members?

    This is the destructor. Its job is to deallocate any extra memory or resources that were allocated by the OurStack class and not otherwise automatically reclaimed. Here, we deallocate the memory we allocated earlier by using delete[]. We don’t need to do anything to logicalSize or allocatedSize because there were no dynamic allocations performed to get space for them. Generally speaking, you only need to explicitly clean up resources that you explicitly allocated.


  4. In the OurStack::grow() function, one of the lines is delete[] elems;, and on the next line we immediately write elems = newArr;. Why is it safe to do this? Doesn’t deleting elems make it unusable?

    The statement delete[] elems; means “destroy the memory pointed at by elems” rather than “destroy the elems variable.” As a result, after the first line executes, elems is still a perfectly safe variable to reassign.


  5. The OurStack::pop() function doesn’t seem to have any error-checking code in it. What happens if you try to pop off an empty stack?

    Notice that OurStack::pop calls OurStack::peek, which in turn has its own error-checking. As a result, trying to pop off an empty stack will trigger the custom error message from the OurStack::peek member function.


  6. What is the significance of placing the OurStack::grow function in the private section of the class? What does that mean? Why didn’t we mark it public?

    This is a private member function: a member function that’s used as part of the implementation of the class rather than its interface. Clients of OurStack can’t access this function, which is a Good Thing because we don’t want randos coming along and increasing our stack’s capacity unnecessarily.


Problem Two: Shrinking the Stack

Right now, our stack doubles its allocated length whenever it runs out of space and needs to grow. The problem with this setup is that if we push a huge number of elements into our stack and then pop them all off, we’ll still have a ton of memory used because we never shrink the array when it’s mostly unused.

  1. Explain why it would not be a good idea to cut the array size in half whenever fewer than half the elements are in use.

    Imagine we have a stack with logical size n - 1 and allocated size n. If we perform two pushes, we’ll double our stack’s allocated size to 2n, which takes time O(n), and increase the logical size to n + 1. If we now do two pops, we’ll drop the logical size down to n - 1, which is less than half the allocated size. If we then shrink the allocated size by half, back down to n, we’ll have to do O(n) work transferring elements over, ending with a stack with logical size n - 1 and allocated size n. Overall, we’ve done two pushes and two pops, we’re right back where we started, and we’ve done O(n) work. That’s a lot of work for very little payoff!


  2. Update the code for OurStack so that whenever the logical length drops below one quarter of the allocated length, the array is reallocated at half its former length. As an edge case, ensure that the allocated length is always at least equal to the initial allocated capacity. This setup maintains the (amortized) O(1) cost of each insertion and deletion and ensures that the memory usage is always O(n).

    There are many ways we can do this. We’re going to do this by replacing the OurStack::grow function with a new OurStack::setCapacity function that lets us say what the new allocated size will be. Here’s the updated class definition:

    class OurStack {
    public:
        OurStack();
        ~OurStack();
    
        void push(int value);
        int pop();
        int peek() const;
    
        int size() const;
        bool isEmpty() const;
        
    private:
        void setCapacity(int capacity);
    
        int* elems;
        int logicalSize;
        int allocatedSize;
    };
    

    Here are the updated functions from the .cpp file:

    void OurStack::push(int value) {
        if (logicalSize == allocatedSize) {
            setCapacity(allocatedSize * 2);
        }
        elems[logicalSize] = value;
        logicalSize++;
    }
    
    int OurStack::pop() {
        int result = peek();
        logicalSize--;
        if (logicalSize < allocatedSize / 4) {
            setCapacity(allocatedSize / 2);
        }
        return result;
    }
    
    void OurStack::setCapacity(int capacity) {
        /* If we're asking to reduce our size below the initial starting size, 
         * instead leave the array at its current size.
         */
        if (capacity < kInitialSize) return;
        
        int* newArr = new int[capacity];
        for (int i = 0; i < size(); i++) {
            newArr[i] = elems[i];
        }
        delete[] elems;
        elems = newArr;
        allocatedSize = capacity;
    }
    

Problem Three: Growing a Ring Buffer Queue

Remember our good friend the RingBufferQueue from last section? Check out the problem definition from last week if you want a quick refresher. Last time we visited the RBQ it had fixed capacity – that is, it couldn't grow in size after it was initially created. How limiting! With our newfound pointer and dynamic allocation skills, we can remove this limitation on the RBQ and make it fully functional!

Add functionality to the RingBufferQueue from the previous section problem so that the queue resizes to an array twice as large when it runs out of space. In other words, if asked to enqueue when the queue is full, it will enlarge the array to give it enough capacity.

For example, say our queue can hold 5 elements and we enqueue the five values 10, 20, 30, 40, and 50. Our queue would look like this:

index    0    1    2    3    4
      +----+----+----+----+----+
value | 10 | 20 | 30 | 40 | 50 |
      +----+----+----+----+----+
         ^                   ^
         |                   |
       head                tail

If the client tries to enqueue a sixth element of 60, your improved queue class should grow to an array twice as large:

index    0    1    2    3    4    5    6    7    8    9
      +----+----+----+----+----+----+----+----+----+----+
value | 10 | 20 | 30 | 40 | 50 | 60 |    |    |    |    |
      +----+----+----+----+----+----+----+----+----+----+
         ^                        ^
         |                        |
       head                     tail

The preceding is the simpler case to handle. But what about if the queue has wrapped around via a series of enqueues and dequeues? For example, if the queue stores the following five elements:

index    0    1    2    3    4
      +----+----+----+----+----+
value | 40 | 50 | 10 | 20 | 30 |
      +----+----+----+----+----+
              ^    ^
              |    |
            tail head

If the client tries to add a sixth element of 60, you cannot simply copy the array exactly as it is shown: the head position and the wraparound arithmetic would no longer line up with the new, larger array. Instead, copy into the new array so that index 0 becomes the new head of the queue. The picture will then look the same as the previous one, with the value 60 at index 5.

Write up the implementation of the new and improved RingBufferQueue. It should have the same members as in the previous problem, but with the new resizing behavior added. You may add new member functions to your class, but you should make them private.

First, the header file.

#pragma once

class RingBufferQueue {
public:
    /* Creates a new, empty, RingBufferQueue. */
    RingBufferQueue();
    
    /* Cleans up all memory used by the RingBufferQueue. */
    ~RingBufferQueue();
    
    /* Enqueues an item. If there is no space left, the queue grows
     * to make room.
     */
    void enqueue(int value);
    
    /* Looks at, but does not remove, the first element of the queue.
     * If the queue is empty, calls error().
     */
    int peek() const;
    
    /* Removes and returns the first element of the queue. If the queue
     * is empty, calls error().
     */
    int dequeue();
    
    /* Returns how many elements are in the queue. */
    int size() const;
    
    /* Returns whether the queue is empty. */
    bool isEmpty() const;
    
private:
    int* elems;         // Pointer to array of elements
    int  allocatedSize; // Number of slots in that array
    int  logicalSize;   // How many are used
    int  head;          // Head position; see above diagrams
    
    void grow();        // Resize if we need more space  
};

Now, our .cpp file.

#include "RingBufferQueue.h"
#include "error.h"
using namespace std;

/* Initial capacity of a RingBufferQueue. */
const int kNumSlots = 5; // Matches diagram above

RingBufferQueue::RingBufferQueue() {
    /* Create our array. */
    elems = new int[kNumSlots];
    allocatedSize = kNumSlots;
    
    /* We're logically empty at this point; no items have been added. */
    logicalSize = 0;
    
    /* Head begins at the beginning - though it in principle could go
     * anywhere within the array.
     */
    head = 0;
}

RingBufferQueue::~RingBufferQueue() {
    delete[] elems;
}

/* Adds a value right after the last position. */
void RingBufferQueue::enqueue(int value) {
    /* If we're at full capacity, get more space. */
    if (size() == allocatedSize) {
        grow();
    }
    
    /* Determine where this element goes. If we didn't have any wraparound,
     * we would go at position head + size(). However, we have to account
     * for the fact that this might walk off the end of the array, in which
     * case we need to wrap around to the beginning. The modulus operator
     * is our friend here. By computing (head + size()) % allocatedSize,
     * one of two things happens:
     *
     *   1. If head + size() < allocatedSize, then the remainder when we
     *      do the division is head + size().
     *   2. If head + size() >= allocatedSize, then the remainder of the
     *      division will be what we get when we subtract out as many
     *      copies of allocatedSize as we can.
     *
     * More generally, modding by an array size essentially says "wrap me
     * around to the right position."
     */
    elems[(head + size()) % allocatedSize] = value;
    logicalSize++;
}

void RingBufferQueue::grow() {
    /* Get twice as much space as we used to have. */
    int* newElems = new int[allocatedSize * 2];
    
    /* Copy the elements over. Start at the position given by the head, and
     * use modular arithmetic to walk forward, wrapping around if necessary.
     */
    for (int i = 0; i < size(); i++) {
        newElems[i] = elems[(head + i) % allocatedSize];
    }
    
    /* Replace the old array. */
    delete[] elems;
    elems = newElems;
    
    /* Our copy operation pushed everything to the front of the ring buffer
     * array. Update the head to look there as well.
     */
    head = 0;
    
    /* Remember that we just increased our size. */
    allocatedSize *= 2;
}

int RingBufferQueue::peek() const {
    /* Make sure there's something here to return. */
    if (isEmpty()) {
        error("Nothing to see here folks, move along.");
    }
    
    /* Look at the position given by the head. */
    return elems[head];
}

int RingBufferQueue::dequeue() {
    /* This also does error-checking for us. Nifty! */
    int result = peek();
    
    /* The head needs to move forward a step, possibly wrapping around. Above,
     * we did this using the mod operator. Just to show another strategy, we'll
     * use if statements here.
     */
    head++;
    if (head == allocatedSize) head = 0;
    
    /* As usual, we don't actually get rid of the integer we're removing. We
     * just pretend it doesn't exist.
     */
    logicalSize--;
        
    return result;
}

int RingBufferQueue::size() const {
    return logicalSize; // Number of slots used
}

bool RingBufferQueue::isEmpty() const {
    return size() == 0;
}

Hashing

Problem Four: Choosing Good Hash Functions

In CS106B, we’ll use hash functions without spending too much time talking about their internal workings. (Building a good hash function is a challenging endeavor!) To give you a sense about why this is, we’d like you to investigate four different possible hash functions.

Each of the functions shown below is designed to take as input a string and then output a hash code in the range from 0 to 9, inclusive. For each proposed hash function, tell us whether it’s

  1. not even a valid hash function;
  2. a valid hash function, but not a very good one; or
  3. a very good hash function.

Of course, you should be sure to justify your answers.

int hashFunction1(const string& str) {
  return randomInteger(0, 9); // a value between 0 and 9, inclusive
}
int hashFunction2(const string& str) {
  return 0; // Who cares about str, anyway?
}
int hashFunction1(const string& str) {
  return randomInteger(0, 9); // a value between 0 and 9, inclusive
}

This is not a valid hash function. One of the main requirements for a hash function is that if you hash the same input multiple times, you always get the same output. However, this hash function doesn’t do this, since hashing any string multiple times might produce different answers.

At a more intuitive level - imagine you’re using a hash function to distribute items into drawers. The idea is that you’d compute the hash code for the item, then put the item in the given drawer. The advantage of doing this is that if you want to see whether you have that item, you just need to look in the drawer it hashes to. But if you get different hash codes back each time, you aren’t guaranteed that you’ll be able to find the item when you hash it, since you’ll be sent to a random drawer each time!

int hashFunction2(const string& str) {
  return 0; // Who cares about str, anyway?
}

This hash function passes the first test of a hash function - if you hash the same input many times, you will get the same output. However, it absolutely fails the second test - since everything gets mapped to slot zero, this hash function doesn’t distribute items nicely over the range, and in fact hashes of wildly different strings will always end up the same! Returning to the analogy above - this hash function corresponds to the strategy of “put everything in the same drawer.” That’ll work, but your drawer will get really full!


These next three functions use the fact that each char has an associated integer value in C++. For example, on most systems the character 'A' has value 65, the character '3' has value 51, the character '!' has value 33, and the character '~' has value 126. We can use these numeric values to design hash functions for strings.

int hashFunction3(const string& str) {
  return str[0] / 10;
}
int hashFunction4(const string& str) {
  return str[0] % 10;
}
int hashFunction5(const string& str) {
  int result = 0;
  for (char ch: str) {
    result += ch; // Again, uses the numeric value of ch
  }
  /* Numeric values for characters can be positive or negative. */
  if (result < 0) {
    result = 0;
  }
  if (result >= 10) {
    result = 9;
  }
  return result;
}
int hashFunction3(const string& str) {
  return str[0] / 10;
}

There are two problems with this hash function. First, if the input string is empty, it reads off the end of the string - oops! That’s no good. (It turns out that in C++, reading one step off the end of the string is technically a well-defined operation, but I’d wager that most C++ programmers don’t know this and at a bare minimum it’s a weird thing to do.)

But for now, let’s ignore that. This hash function does indeed compute hash codes consistently (if you hash the same string many times, you’ll always get the same result). However, it does not produce hash codes that are guaranteed to be in the range from 0 to 9, since hashing the string "~" would output 12. Therefore, this is not a legal hash function.

int hashFunction4(const string& str) {
  return str[0] % 10;
}

Ignoring the issue raised above about how str[0] might be an out-of-bounds read, this hash function is deterministic (hashing the same string many times always produces the same output), and for ordinary text the outputs are always in the range from 0 to 9, inclusive. Taking a nonnegative integer and modding it by the number n will always give you a value in the range from 0 to n - 1, inclusive. (One caveat: on platforms where char is signed, a character with a negative value would make str[0] % 10 negative, so strictly speaking this is only valid if we assume nonnegative character values.)

But is this a good hash function? I’d say no. Good hash functions should have the property that tiny changes to the inputs result in large changes to the output. Here, taking a string and appending a new character to it, or changing any character but the first, or deleting any character but the first will not change the hash code. You’ll get a huge number of collisions with this hash function.

int hashFunction5(const string& str) {
  int result = 0;
  for (char ch: str) {
    result += ch; // Again, uses the numeric value of ch
  }
  /* Numeric values for characters can be positive or negative. */
  if (result < 0) {
    result = 0;
  }
  if (result >= 10) {
    result = 9;
  }
  return result;
}

This hash function is deterministic (hashing the same string always produces the same result), and the way it’s structured guarantees that the range of values produced is always between 0 and 9, inclusive. However, there are two major problems with this hash function:

  1. The way that this hash function produces values in the range from 0 to 9 is biased toward producing the outputs 0 and 9. Specifically, any string made of purely positive-value characters (basically, most English text) will produce a result greater than 9, which clamps to 9. Similarly, any string made of purely negative-value characters (whether these exist depends on whether char is signed on your platform) will clamp to 0. So in that sense, the values aren’t nicely distributed across the whole range, and you’d expect to get lots of collisions at 0 and 9.
  2. The hash function doesn’t care about the order of the characters in the string. The strings "tarragon" and "arrogant" are anagrams of one another - they’re made of the same characters in a different order - and so they’ll have the same hash code. The same is true of “senator,” “treason,” and “atoners.” We’d like it to be the case that if we make small changes to a string (say, swapping the order of the letters), we get wildly different outputs, but that’s not the case here.

We hope this little foray into the design of hash functions gives you a sense as to why it’s hard to design good hash functions and why this is something we’re going to leave to the experts. 😃

Problem Five: Salting and Hashing

In lecture, we talked about how when storing passwords, it’s common to not store the passwords themselves, but rather the hash code of the password using some hash function. This ensures that if someone manages to steal the password database, they can’t immediately read off all the passwords stored in the database.

This approach - storing the hashes of the passwords rather than the passwords themselves - is much better than just storing the passwords themselves, but it’s not what’s done in practice because it doesn’t work well if the users have weak passwords.

Specifically, suppose that you have a list of the 100,000 most commonly used passwords, which isn’t too hard to find with a bit of Googling. You also have, probably nefariously, stolen the password file from a website, which is represented as a Map<string, int> mapping each user name to the hash of their password. You could then do the following to figure out the passwords of anyone using a weak password:

  1. Compute the hash codes for the 100,000 commonly-used passwords that you know of.
  2. Look at the Map<string, int> mapping website users to the hashes of their passwords. For each stored hash that matches one of the hashes you computed above, output that user’s password.

Write a function

Map<string, string> breakWeakPasswords(const Map<string, int>& stolenPasswordFile,
                                       const Set<string>& knownWeakPasswords, 
                                       HashFunction<string> hashFn);

that uses the above approach to figure out the passwords of each user who has a weak password, returning a Map associating each such user with their password. The final parameter, hashFn, represents the hash function used to compute password hashes. Then, answer the following question: if there are n users on the website and there are m weak passwords, what is the runtime of the code that you wrote, assuming hash codes can be computed in time O(1)?

In practice, it’s most common to use a strategy called salting and hashing, which works as follows, to prevent people from recovering passwords in the way you just did. We store for each user two pieces of information:

  • a randomly-chosen string called the salt, and
  • the hash code of the string salt + password, where password is the user’s password.

It’s important that the salt is chosen randomly for each user, rather than having a fixed salt used for all passwords.

First, explain why the introduction of a per-user salt means that you can’t break passwords using the code that you wrote above. Then, answer the following question: suppose you had the master password file for a website, including the salted, hashed passwords, and you wanted to find all users who have weak passwords. If there are n users and m weak passwords, how long would it take to find all users who have weak passwords? Represent your answer in big-O notation.

(A note on this problem: we’re not teaching this to you because we want you to do Nefarious Things like this. Rather, we wanted you to see what Roguish Scalawags could do to compromise a password system and to see why we take the protective steps that we take to stop them from doing so.)

Here’s one possible implementation of the function:

Map<string, string>
breakWeakPasswords(const Map<string, int>& stolenPasswordFile,
                   const Set<string>& knownWeakPasswords,
                   HashFunction<string> hashFn) {
    /* Build a map from hash codes to weak passwords. 
     * This takes time O(m log m) since the addition 
     * into the Map takes logarithmic time. Let's say 
     * m here is the number of knownWeakPasswords. */
    Map<int, string> weakHashes;
    for (string password: knownWeakPasswords) {
        weakHashes[hashFn(password)] = password;
    }
    /* A user has a bad password if the hash is known. This takes time 
     * O(n log m) since containsKey runs in logarithmic time. Recall 
     * that the size of weakHashes is m from above, and n here is the 
     * number of keys in stolenPasswordFile.
     */
    Map<string, string> result;
    for (string username: stolenPasswordFile) {
        int hashedPassword = stolenPasswordFile[username];
        if (weakHashes.containsKey(hashedPassword)) {
            result[username] = weakHashes[hashedPassword];
        }
    }
    return result;
}

This code takes time O(m log m + n log m), which means that it’s a very efficient way to find everyone with a weak password. Think about how this function scales and what has to happen for its output to double.

The above attack works because we can precompute the hashes of common passwords fairly quickly. If we add in a per-user salt, we can’t do this any more. For example, suppose someone has the very weak password password. If they have a random salt, it’s as though their password really was some long random string of characters followed by password, and the hash code of that string is likely to be completely different from the hash code for password itself. We therefore can’t precompute the hash codes of all the weak passwords and then see if anyone has those hash codes, because the hash codes stored per person will be quite different from the hash codes of the raw passwords.

We could still go one person at a time and try hashing all the weak passwords along with that person’s salt to see if any match the stored hash, but that’s going to be much slower than before. We now have to do O(m log m) work for each of the n users, for a total runtime of O(nm log m). If m is, say, 100,000 and n is, say, 100,000,000, it’s going to take us a long time to find everyone with weak passwords - long enough for the company to announce that there’s been a breach and urge everyone to change passwords.


Problem Six: Rehashing

In our first lecture on hashing, we described chained hashing as a way of implementing a set. As a reminder, with chained hashing, we distribute items to store across an array of buckets using a hash function, with each bucket holding all the items that hash to that position.

This is a very fast strategy for storing items, provided that the load factor, the ratio of the number of items to the number of buckets, is low. Once we've inserted too many values, we need to create a larger array of buckets and move items from the old buckets to the new.

Modify the OurSet type from lecture so that it rehashes whenever the load factor exceeds 2.

Here's the revised header file:

#pragma once

#include <string>
#include "vector.h"
#include "HashFunction.h"

class OurSet {
public:
    OurSet();

    bool contains(const std::string& value) const;
    void add(const std::string& value);

    int  size() const;
    bool isEmpty() const;

private:
    /* List of buckets. Each bucket is a list of the strings in
     * that bucket.
     */
    Vector<Vector<std::string>> buckets;

    /* Hash function to distribute items into buckets. */
    HashFunction<std::string> hashFn;

    /* Total number of elements, different from number of buckets. */
    int numElems;
    
    /* Rehashes the items into a bigger set of buckets. */
    void rehash();
};

Here's the revised implementation:

#include "OurSet.h"
using namespace std;

const int kInitialBucketCount = 6; // Could really be just about anything.
const double kMaxLoadFactor = 2;

OurSet::OurSet() {
    Vector<Vector<string>> ourBuckets(kInitialBucketCount);
    HashFunction<string> ourHashFn(kInitialBucketCount);

    buckets = ourBuckets;
    hashFn  = ourHashFn;
    numElems = 0;
}

int OurSet::size() const {
    return numElems;
}

bool OurSet::isEmpty() const {
    return size() == 0;
}

/* Jump to the right bucket, then see if that bucket has our
 * element.
 */
bool OurSet::contains(const string& str) const {
    int bucket = hashFn(str);
    for (string elem: buckets[bucket]) {
        if (elem == str) return true;
    }
    return false;
}

void OurSet::add(const string& str) {
    /* No duplicates. */
    if (contains(str)) return;

    /* Add this to the correct bucket. */
    int bucket = hashFn(str);
    buckets[bucket] += str;
    numElems++;

    /* If our load factor exceeds the maximum, rehash. */
    if (double(numElems) / buckets.size() > kMaxLoadFactor) {
        rehash();
    }
}

void OurSet::rehash() {
    /* We need new buckets and a new hash function, since our old hash
     * function is designed to only work on a smaller range.
     */
    Vector<Vector<string>> newBuckets(buckets.size() * 2);
    HashFunction<string> newHashFn(buckets.size() * 2);

    /* Redistribute all elements into the new buckets. */
    for (int i = 0; i < buckets.size(); i++) {
        for (string elem: buckets[i]) {
            int newBucket = newHashFn(elem);
            newBuckets[newBucket] += elem;
        }
    }

    /* Update master list of buckets and our hash function. */
    buckets = newBuckets;
    hashFn = newHashFn;
}