Section7: Trees


Section materials curated by our Head TA Nick Bowman, drawing upon materials from previous quarters.

This week’s section exercises are all about trees, particularly binary search trees and common tree idioms and algorithms. Trees are yet another way to organize the way that data is stored and they are perhaps one of the most powerful paradigms for data storage that we've encountered so far! The recursive structure of makes writing recursive functions very natural, so we will be writing a lot of recursive functions when working with trees. After you're done working through this section handout, you'll truly know what it means to party with trees!

Remember that every week we will also be releasing a Qt Creator project containing starter code and testing infrastructure for that week's section problems. When a problem name is followed by the name of a .cpp file, that means you can practice writing the code for that problem in the named file of the Qt Creator project. Here is the zip of the section starter code:

📦 Starter code

For all the problems in this handout, assume the following structures have been declared:

struct TreeNode {
    int data;
    TreeNode *left;
    TreeNode *right;

    // default constructor does not initialize
    TreeNode() {}   

    // 3-arg constructor sets fields from arguments
    TreeNode(int d, TreeNode* l, TreeNode* r) {  
        data = d;
        left = l;
        right = r;
    }
};

1) No, You're Out of Order!

Write the elements of each tree below in the order they would be seen by a pre-order, in-order, and post-order traversal. Which of these three trees (if any) are a binary search tree? Is there anything about the results of the traversals that helped you make this identification?

a) A tree with root node 3. Node 3 has left child 5 and right child 2. Node 5 has left child 1 and no right child. Node 1 has no children (leaf node). Node 2 has left child 4 and right child 6. Both nodes 4 and 6 have no children (leaf nodes)

b) A tree with root node 5. Node 5 has left child 2 and right child 8. Node 2 has left child 1 and right child 4. Node 1 has no children (leaf node). Node 4 has left child 3 and no right child. Node 3 has no children (leaf node). Node 8 has left child 7 and right child 9. Node 7 has left child 6 and no right child. Node 6 has no children (leaf node). Node 9 has no children (leaf node).

c) A tree with root node 2. Nide 2 has no left child and right child 1. Node 1 has left child 7 and right child 6. Node 7 has left child 4 and no right child. Node 4 has left child 3 and right child 5. Nodes 3 and 5 both have no children (leaf nodes). Node 6 has no left child and right child 9. Node 9 has left child 8 and no right child. Node 8 has no children (leaf node).

a.
pre-order: 3 5 1 2 4 6
in-order: 1 5 3 4 2 6
post-order: 1 5 4 6 2 3
b.
pre-order: 5 2 1 4 3 8 7 6 9
in-order: 1 2 3 4 5 6 7 8 9
post-order: 1 3 4 2 6 7 9 8 5
c.
pre-order: 2 1 7 4 3 5 6 9 8
in-order: 2 3 4 5 7 1 6 8 9
post-order: 3 5 4 7 8 9 6 1 2

In this case, only the second tree is a binary search tree, since it's nodes maintain the property that all nodes to the left of any given node are less than it and all nodes to the right of any given node are greater than it. We can tell from the in-order traversal that it is a binary search tree because the numbers are printed out in order, which is a cool property of BSTs!


2) Binary Tree Insertion

Draw the binary search tree that would result from inserting the following elements in the given order.

Here's the alphabet in case you need it! ABCDEFGHIJKLMNOPQRSTUVWXYZ

a. Jaques, Sunny, Klaus, Violet, Beatrice, Bertrand, Kit, Lemony
b. Leslie, Ron, Tom, Jerry, Larry, Garry, April, Andy
c. Aaron, Andrew, Chris, Colin, Jason, Leslie, Wesley


3) Binary Search Tree Warmup

Binary search trees have a ton of uses and fun properties. To get you warmed up with them, try working through the following problems.

First, draw three different binary search trees made from the numbers 1, 2, 3, 4, 5, 6, 7, 8, and 9. What are the heights of each of the trees you drew? What’s the tallest BST you can make from those numbers? How do you know it’s as tall as possible? What’s the shortest BST you can make from those numbers? How do you know it’s as short as possible?

Take one of your BSTs. Trace through the logic to insert the number 10 into that tree. Then insert 3.5.What do your trees look like?

This image depitcs a degenerate binary search tree, with height equal to the total number of nodes in the tree. One way to have a degenrate tree like this is to have root node 9, which has a single child 1, which has a single child 8, which has a single child 2, which has a single child 7, which has a single child 3, which has a single child 6, which has a single child 4, which has a single child 5, which is a leaf node.

There are several trees that are tied for the tallest possible binary search tree we can make from these numbers, one of which is shown to the right. It has height eight, since the height measures the number of links in the path from the root to the deepest leaf. We can see that this is the greatest height possible because there’s exactly one node at each level, and the height can only increase by adding in more levels. A fun math question to ponder over: how many different binary search trees made from these numbers have this height? And what’s the probability that if you choose a random order of the elements 1 through 9 to insert into a binary search tree that you come up with an ordering like this one?

Similarly, there are several trees tied for the shortest possible binary search tree we can make from these numbers, one of which is shown below. It has height three, which is the smallest possible height we can have. One way to see this is to notice that each layer in the tree is, in a sense, as full as it can possibly be; there’s no room to move any of the elements from the deeper layers of the tree any higher up:

This image depicts a tree with root node 5. Node 5 has left child 2 and right child 8. node 2 has left child 1 and right child 4. Node 1 is a leaf node. Node 4 has right child 3 (which is a leaf node) and no right child. Node 8 has left child 7 and right child 9 (which is a leaf node). Node 7 has a left child 6 (which is a leaf node) and no right child.

If we insert 10, we’d get the following:

This is a tree that is the same as described in the previous image, with the modification that 10 is now added as the right child of 9, which is no longe ra leaf node.


4) How Tall is That Tree? (height.cpp)

Write a function

int height(TreeNode *node)

that calculates the height of the provided tree. The height of a tree is defined to be the number of levels that a tree has, or phrased differently, the number of nodes along the longest path from the root to a leaf. For example, an empty tree has height 0. A tree of one node has height 1. A node with one or two leaves as children has height 2, etc.

int height(TreeNode *node) {
    if (node == nullptr) {
        return 0;
    } else {
        return 1 + max(height(node->left), height(node->right));
    }
}

5) Tree-Quality (equal.cpp)

Write a function

bool areEqual(TreeNode* one, TreeNode* two);

that take as input pointers to the roots of two binary trees (not necessarily binary search trees), then returns whether the two trees have the exact same shape and contents.

Let’s use the recursive definition of trees! The empty tree is only equal to the empty tree. A nonempty tree is only equal to another tree if that tree is nonempty, if the roots have the same values, and if the left and right subtrees of those roots are the same. That leads to this recursive algorithm:

bool areEqual(TreeNode* one, TreeNode* two) {
    /* Base Case: If either tree is empty, they had both better be empty. */
    if (one == nullptr || two == nullptr) {
        return one == two; // At least one is null
    }
 
    /* We now know both trees are nonempty. Confirm the root values match and
    * that the subtrees agree.
    */
    return one->data == two->data 
           && areEqual(one->left, two->left) 
           && areEqual(one->right, two->right);
}

6) Count Left Nodes (countleft.cpp)

Write a function

int countLeftNodes(TreeNode *node)

that takes in the root of a tree of integers and returns the number of left children in the tree. A left child is a node that appears as the root of a left-hand subtree of another node. For example, the tree in problem 1(a) has 3 left children (the nodes containing 5, 1, and 4).

int countLeftNodes(TreeNode *node) {
    if (node == nullptr) {
        return 0;
    } else if (node->left == nullptr) {
        return countLeftNodes(node->right);
    } else {
        return 1 + countLeftNodes(node->left) + countLeftNodes(node->right);
    }
}

7) Find Your True Balance (balanced.cpp)

Write a function

bool isBalanced(TreeNode *node)

that takes in the root of a tree of integers and returns whether or not the tree is balanced. A tree is balanced if its left and right subtrees are balanced trees whose heights differ by at most 1. The empty tree is defined to be balanced. You solution may call on other functions solved in this section handout. This image has 4 panels, showing 2 examples of balanced trees and 2 examples of trees that are not balanced. The first balanced tree has a root node that has a left subtree of height 1 (only 1 node) and then a right subtree of hieght 2 (complete binary tree of 3 nodes). The second balanced tree has a root node what has a left subtree of height 2 (2 nodes extendin in a straight line with no branching) and a right subtree of height 1 (just a single node). The first example of a tree that is not balanced has a left subtree of height 2 (3 nodes with aranged as complete binary subtree) and no right subtree (hieght of 0 since there are no nodes in it). The second example of a tree that is not balanced is one in which the root node has left and right subtrees with heights 2 and 3 respectively. However, the right subtree of the root is not itself balanced, as it has no left subtree but has a right subtree of height 2.

bool isBalanced(TreeNode *node) {
    if (node == nullptr) {
        return true;
    } else if (!isBalanced(node->left) || !isBalanced(node->right)) {
        return false;
    } else {
        int leftHeight = height(node->left); // from problem 4
        int rightHeight = height(node->right);
        return abs(leftHeight - rightHeight) <= 1;
    }
}

8) Give 'Em The Axe (prune.cpp)

Write a function

void removeLeaves(TreeNode*& node)

that accepts a reference to a pointer to a TreeNode and removes the leaf nodes from a tree. A leaf is a node that has empty left and right subtrees. If t is the tree on the left, removeLeaves(t) should remove the four leaves from the tree (the nodes with data 1, 4, 6, and 0). A second call would eliminate the two new leaves in the tree (the ones with data values 3 and 8). A third call would eliminate the one leaf with data value 9, and a fourth call would leave an empty tree because the previous tree was exactly one leaf node. If your function is called on an empty tree, it does not change the tree because there are no nodes of any kind (leaf or not). You must free the memory for any removed nodes. This image contains 5 panels, wach showing a tree at different stages of the leaf removal process. The first panel is titled "before call" and has the following tree contained: Root node 7, which has left child 3 and right child 9. Node 3 has left child 1 and right child 4, both of who are leaf nodes. Node 9 has left child 6 (which is a leaf node) and right child 8. node 8 has no left child and right child 0 (which is a leaf node). The second panel is titled "After 1st call" and contains a tree with the following description: Root node 7, which has left child 3 (which is now a leaf node) and right child 9. Node 9 has no left child and has right child 8 (which is a leaf node). The third panel is titles "After 2nd call" and contains a tree with the following description: root node 7 with no left child and rigth child 9, which is now a leaf node. The fourth panel is titled "after 3rd call" and a tree with the following description: Root node 7 with no children (only node in the tree). The fifth panel is titled "after 4th call" and solely has an empty tree represented by "nullptr".

void removeLeaves(TreeNode*& node) { 
    if (node != nullptr) { 
        if (node->left == nullptr && node->right == nullptr) { 
            delete node; 
            node = nullptr; // you can do this since node is passed by reference! 
        } else { 
            removeLeaves(node->left); 
            removeLeaves(node->right); 
        } 
    } 
}

9) The Ultimate and Penultimate Values (findmax.cpp)

Write a function

TreeNode* biggestNodeIn(TreeNode* root)

that takes as input a pointer to the root of a (nonempty) binary search tree, then returns a pointer to the node containing the largest value in the BST. What is the runtime of your function if the tree is balanced? If it’s imbalanced? Then, write a function

TreeNode* secondBiggestNodeIn(TreeNode* root)

that takes as input a pointer to the root of a BST containing at least two nodes, then returns a pointer to the node containing the second-largest value in the BST. Then answer the same runtime questions posed in the first part of this problem.

We could solve this problem by writing a function that searches over the entire BST looking for the biggest value, but we can do a lot better than this! It turns out that the biggest value in a BST is always the one that you get to by starting at the root and walking to the right until it’s impossible to go any further. Here’s a recursive solution that shows off why this works:

TreeNode* biggestNodeIn(TreeNode* root) {
    if (root == nullptr) error("Nothing to see here, folks.");
    /* Base case: If the root of the tree has no right child, then the root node
    * holds the largest value because everything else is smaller than it.
    */
    if (root->right == nullptr) return root;
    
    /* Otherwise, the largest value in the tree is bigger than the root, so it’s
    * in the right subtree.
    */
    return biggestNodeIn(root->right);
}

And, of course, we should do this iteratively as well, just for funzies:

TreeNode* biggestNodeIn(TreeNode* root) {
    if (root == nullptr) {
        error("Nothing to see here, folks.");
    }
    while (root->right != nullptr) {
        root = root->right;
    }
    return root;
}

Getting the second-largest node is a bit trickier simply because there’s more places it can be. The good news is that it’s definitely going to be near the rightmost node – we just need to figure out exactly where.

There are two cases here. First, imagine that the rightmost node does not have a left child. In that case, the second-smallest value must be that node’s parent. Why? Well, its parent has a smaller value, and there are no values between the node and its parent in the tree (do you see why?) That means that the parent holds the second-smallest value. The other option is that the rightmost node does have a left child. The largest value in that subtree is then the second-largest value in the tree, since that’s the largest value smaller than the max. We can use this to write a nice iterative function for this problem that works by walking down the right spine of the tree (that’s the fancy term for the nodes you get by starting at the root and just walking right), tracking the current node and its parent node. Once we get to the largest node, we either go into its left subtree and take the largest value, or we return the parent, whichever is appropriate.

TreeNode* secondBiggestNodeIn(TreeNode* root) {
    if (root == nullptr) {
        error("Nothing to see here, folks.");
    }

    TreeNode* prev = nullptr;
    TreeNode* curr = root;
    while (curr->right != nullptr) {
        prev = curr;
        curr = curr->right;
    }

    if (curr->left == nullptr){
        return prev;
    } else {
        return biggestNodeIn(curr->left);
    }
}

Notice that all three of these functions work by walking down the tree, doing a constant amount of work at each node. This means that the runtime is O(h), where h is the height of the tree. In a balanced tree that’s O(log n) work, and in an imbalanced tree that’s O(n) work in the worst-case.


10) Checking BST Validity (bst.cpp)

You are given a pointer to a TreeNode that is the root of some type of binary tree. However, you are not sure whether or not it is a binary search tree. That is, you might get a tree like the one shown to the right, which is a binary tree but not a binary search tree. Write a function

bool isBST(TreeNode* root)

that, given a pointer to the root of a tree, determines whether or not the tree is a legal binary search tree. You can assume that what you’re getting as input is actually a tree, so, for example, you won’t have a node that has multiple pointers into it, no node will point at itself, etc.

As a hint, think back to our recursive definition of what a binary search tree is. If you have a node in a binary tree, what properties must be true of its left and right subtrees for the overall tree to be a binary search tree? Consider writing a helper function that hands back all the relevant information you’ll need in order to answer this question.

There are a bunch of different ways that you could write this function. The one that we’ll use is based on recursive definition of a BST from lecture: a BST is either empty, or it’s a node x whose left subtree is a BST of values smaller than x and whose right subtree is a BST of values greater than x.

This solution works by walking down the tree, at each point keeping track of two pointers to nodes that delimit the range of values we need to stay within.

bool isBSTRec(TreeNode* root, TreeNode* lowerBound, TreeNode* upperBound) {
    /* Base case: The empty tree is always valid.*/
    if (root == nullptr) return true;
 
    /* Otherwise, make sure this value is in the proper range. */
    if (lowerBound != nullptr && root->data <= lowerBound->data) return false;
    if (upperBound != nullptr && root->data >= upperBound->data) return false;
 
    /* Okay! We're in range. So now we just need to confirm that the left and
    * right subtrees are good as well. Notice how the range changes based on the
    * introduction of this node.
    */
    return isBSTRec(root->left, lowerBound, root)
           && isBSTRec(root->right, root, upperBound);
}

bool isBST(TreeNode* root) {
    return isBSTRec(root, nullptr, nullptr);
}