Trees

CS 106B: Programming Abstractions

Spring 2022, Stanford University Computer Science Department

Lecturer: Chris Gregg, Head CA: Neel Kishnani

A real binary tree


Slide 2

Announcements

  • Assignment 6 has been released

Slide 3

Today's Topics

  • Refresh on what pointer diagrams represent
  • Quick comment on references to pointers
  • Introduction to the tree data structure
    • You've already seen lots of tree structures
    • Tree terminology
    • How do we build trees programatically?
    • Writing code to traverse a tree

Slide 4

Pointer Diagram Refresh

  • The following diagram was put together by a former CS106B student:
    • Pointers have values that are memory addresses.
    • All variables and objects have their own memory address where they are located
    • The next pointer is a memory address A diagram that shows that pointers' values are memory addresses, and that even the pointers have addresses themselves.

Slide 5

A note on references to pointers

  • In assignment 6, you have to write the following two functions:
    void runSort(ListNode*& front) {
        /* TODO: Implement this function. */
    }
      
    void quickSort(ListNode*& front) {
        /* TODO: Implement this function. */
    }
    
  • Notice that each function is passed in a reference to a pointer (and Question 7 on that part of the assignment asks about this). The reference is necessary. If the parameter was not a reference, then the pointer would be passed by value, and any changes to the pointer would not be reflected in the code that called the function. Instead, the function would have to return the pointer so it could be updated in the calling function. But, it is cleaner to call the functions so they actually change the front of the list to be the first sorted element.
  • In our example from the last lecture, we didn't need to worry about this, because all of our class functions had direct access to the front class variable. This isn't the case when we don't have a class (as in the assignment).
  • See the code below that demonstrates why you need a reference to a pointer:
    #include <iostream>
    #include "console.h"
      
    using namespace std;
      
    struct ListNode {
        string data;
        ListNode *next;
    };
      
    void insertAtFrontIncorrect(ListNode* front, string s) {
        ListNode *newNode = new ListNode;
        newNode->data = s;
        newNode->next = front;
        front = newNode;
    }
      
    void insertAtFrontCorrect(ListNode*& front, string s) {
        ListNode *newNode = new ListNode;
        newNode->data = s;
        newNode->next = front;
        front = newNode;
    }
      
    void printListIncorrect(ListNode*& front) {
        cout << "[";
        while (front != nullptr) {
            cout << front->data;
            if (front->next != nullptr) {
                cout << ", ";
            }
            front = front->next;
        }
        cout << "]" << endl;
    }
      
    void printListCorrect(ListNode* front) {
        cout << "[";
        while (front != nullptr) {
            cout << front->data;
            if (front->next != nullptr) {
                cout << ", ";
            }
            front = front->next;
        }
        cout << "]" << endl;
    }
      
    int main() {
        ListNode *front = nullptr;
        cout << "inserting cat correctly" << endl;
        insertAtFrontCorrect(front, "cat");
      
        cout << "inserting dog incorrectly" << endl;
        insertAtFrontIncorrect(front, "dog");
      
        cout << "inserting bat correctly" << endl;
        insertAtFrontCorrect(front, "bat");
      
        cout << "inserting rat correctly" << endl;
        insertAtFrontCorrect(front, "rat");
      
        cout << "printing correctly:" << endl;
        printListCorrect(front);
        cout << "printing incorrectly:" << endl;
        printListIncorrect(front);
        cout << "printing correctly:" << endl;
        printListCorrect(front);
        return 0;
    }
    

    Output:

    inserting cat correctly
    inserting dog incorrectly
    inserting bat correctly
    inserting rat correctly
    printing correctly:
    [rat, bat, cat]
    printing incorrectly:
    [rat, bat, cat]
    printing correctly:
    []
    

Slide 6

Trees

  • We have seen trees in this class before, in the form of decision trees: The decision tree for determining if `cart` is reducible. `cart` branches to `art`, `crt`, `cat`, and `car`, `art` branches to `rt`, `at`, and `ar`, etc.
  • Trees can describe hirearchies: A tree with `world` at the root, `China`, `USA` and `Honduras` as children. China has two children, `Hunan` and `Shanghai`. USA has two children, `California` and `Ohio`. Honduras has one child, `Tegucigalpa`. Hunan has one child, `Changsha`. California has two children, `SF` and `LA`. Ohio has one child, `Kent`
  • Trees can describe websites: A tree that describes a website, with `div` tags, `img` tags, `p` tags, `li` tags, `a` tags, etc. Each tag can have tags that are its children
  • Trees can desceribe programs (we often call them abstract syntax trees (the following is a figure in an academic paper written by a recent CS 106 student!)
    // Example student solution
    function run() {
      // move then loop
      move();
      // the condition is fixed
      while (notFinished()) {
        if (isPathClear()) {
          move();
        } else {
          turnLeft();
        }
        // redundant
        move();
      }
    }
    

    A tree that describes a program (the code above). The tree has `run` as its root, with `move` and `while` children. `while` has `if/else` and `move` as its children. `if/else` has `isPathClear`, `move`, and `turnLeft` as its children


Slide 7

Trees are inherently recursive

  • What is a tree in computer science?
    • A tree is a collection of nodes, which can be empty. If it is not empty, there is a root node, r, and zero or more non-empty subtrees, T1, T2, โ€ฆ, Tk, whose roots are connected by a directed edge from r. A tree of letters. See the text tree below
                       (root)
              -----------A--------------
              โ†™๏ธŽ   โ†™๏ธŽ  โ†™๏ธŽ   โ†“    โ†˜๏ธŽ       โ†˜๏ธŽ
             B   C   D   E     F--      G
                    โ†™๏ธŽ   โ†™๏ธŽ โ†˜๏ธŽ   โ†™๏ธŽ โ†˜๏ธŽ โ†˜๏ธŽ      โ†˜๏ธŽ
                   H   I   J K   L M       N
                          โ†™๏ธŽ โ†˜๏ธŽ
                         O   P
  • There are N nodes and N-1 edges in a tree.
  • Tree terminology. In the diagram above:
    • B is a child of A, and F is a child of A and a parent of K, L, M.
    • Nodes with no children are leaves. Nodes B, C, H, I, K, L, M, N, O, and P are all leaves.
    • Nodes with the same parent are siblings. I and J are siblings. K, L, and M are siblings. Nodes O and P are siblings. Nodes H and N do not have any siblings (only children).

Slide 8

More Tree Terminology

Most of the same tree as above, but smaller (see below)

                       (root)
                         A--------------
                         โ†“    โ†˜๏ธŽ        โ†˜๏ธŽ
                         E     F         G
                        โ†™๏ธŽ โ†˜๏ธŽ   โ†™๏ธŽ โ†˜๏ธŽ โ†˜๏ธŽ       โ†˜๏ธŽ
                       I   J K   L M       N
                          โ†™๏ธŽ โ†˜๏ธŽ
                         O   P
  • We can define a path from a parent to its children. The path from A to O is: A-E-J-O.
  • The path A-E-J-O has a length of three (the number of edges).
  • The depth of a node is the length from the root. The depth of node J is 2. The depth of the root is 0.
  • The height of a node is the longest path from the node to a leaf. The height of node F is 1. The height of all leaves is 0.
  • The height of a tree is the height of the root. The height of the above tree is 3.

Slide 9

One Parent, No Cycles

An image of a real parent holding a child's hand, and an image of a bicycle with a red 'no' sign through it

  • Tree nodes can only have one parent, and they cannot have cycles Two images -- one with a non-tree that has two nodes that share a child (nodes S and N both share a child, A). The second image is also a non-tree. Node S has two children, T and A, and node N has a single child, A. So, the A is (again) shared, and this is not a tree.
  • Not a tree (A has two parents):
    S   N
     โ†˜๏ธŽ โ†™๏ธŽ
      A
    
  • Again: not a tree (A has two parents):
      S    N
     โ†™๏ธŽ  โ†˜๏ธŽ โ†™๏ธŽ
    T    A
        โ†™๏ธŽโ†˜๏ธŽ 
       F   O
    
  • The following are not trees, either, because you can cycle through the tree. Nodes cannot have children that are on the same level or above on the tree: Two images -- both have nodes that cycle back to other nodes. The first has S with children T and A, and T has a child, N. N's child is S (the root), which means that it is not a tree. In the second diagram, S has two children, T and A, and T has two children, N, and A. Because one of T's children is its own sibling, this creates a cycle.
  • Not a tree (cycle from N back to S):
    โ†’โ†’โ†’โ†’S
    โ†‘  โ†™๏ธŽ โ†˜๏ธŽ 
    โ†‘ T   A
    โ†‘โ†™๏ธŽ
    N
    
  • Again: not a tree (the arrow from T to A causes a cycle):
        S
       โ†™๏ธŽ โ†˜๏ธŽ 
      Tโ†’โ†’โ†’A
    โ†™๏ธŽ 
    N
    

Slide 10

Building trees programatically

  • To build a tree, we need a new version of our Node. In this case, we want each Node to have a data value (like a linked list), but now we want to pointers, one to the left child, and one to the right child:
    struct TreeNode {
        string data;
        TreeNode* left;
        TreeNode* right; 
    };
    

    A binary tree node image. There is a data element, and two pointers going towards the left and towards the right

  • We don't have to limit ourselves to binary trees, by the way: we could have a ternary tree:
    struct TernaryTreeNode {
        string data;
        TernaryTreeNode* left;
        TernaryTreeNode* middle;
        TernaryTreeNode* right;
    }; 
    

    A ternary tree node image. There is a data element, and three pointers going towards the left, towards the middle, and towards the right

  • In fact, we could have any number of children (we may look at the Trie data structure, which can be built this way). This is an N-ary Tree:
    struct NAryTreeNode {
        string data;
        Vector<NAryTreeNode*> children;
    }; 
    

    An 'n-ary' tree node image. There is a data elemetn, and then a vector which holds pointers to a non-distinct number of children.


Slide 11

Traversing a Tree

  • Often, we will want to do something with each node in a tree. Like linked lists, we can traverse the tree, but it is more involved because of all the branching.
  • There are four different ways to traverse a binary tree:
    1. Pre-order
    2. In-order
    3. Post-order
    4. Level-order
  • Let's write some code for each traversal to see the differences. We'll look at two trees: A sentence tree, with the following words: `this` with children `is` and `written`. `is` has two children, `a` and `correctly`. `written` has a right child only, `sentence` (see the text below)
                this
             โ†™๏ธŽ        โ†˜๏ธŽ
           is         written
          โ†™๏ธŽ  โ†˜๏ธŽ             โ†˜๏ธŽ
         a    correctly  sentence
    
  • We'll also look at: A tree with 4 at the root and two children, 3 and 5. The three has two children, 1 and 2. The five has a right child, 6
               4
             โ†™๏ธŽ  โ†˜๏ธŽ
            2     5 
          โ†™๏ธŽ  โ†˜๏ธŽ     โ†˜๏ธŽ
         1    3      6 
    
  • By the way: trees do not have to be complete, like heaps. Any node can have 0, 1, or 2 children.

Slide 12

Pre-order traversal

  • The algorithm for a pre-order traversal is as follows:
    1. Do something
    2. Go left
    3. Go right
  • This is an inherently recursive algorithm
  • For our tests, let's make the "Do something" part print the data at a particular node.
  • Code:
    void preOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        cout<< tree->data <<" ";
        preOrder(tree->left);
        preOrder(tree->right);
    }
    

Slide 13

In-order traversal

  • The algorithm for a in-order traversal is as follows:
    1. Go left
    2. Do something
    3. Go right
  • This is also an inherently recursive algorithm
  • Code:
    void inOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        inOrder(tree->left);
        cout<< tree->data <<" ";
        inOrder(tree->right);
    }
    

Slide 14

Post-order traversal

  • The algorithm for a post-order traversal is as follows:
    1. Go left
    2. Go right
    3. Do something
  • This is also an inherently recursive algorithm
  • Code:
    void postOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        postOrder(tree->left);
        postOrder(tree->right);
        cout<< tree->data <<" ";
    }
    

Slide 15

Level-order traversal

  • A level order traversal is different than all the others โ€“ we want to go one level at a time and print out each node, then go to the next level.
  • For this tree, it would be:
    • this is written a correctly sentence
  • Can we do this recursively!
  • But, we can use an algorithm you've seen before: breadth first search!
    • This is going to use a queue, just like the BFS you did for your maze solving assignment.
      1. Enqueue the root node
      2. While the queue is not empty
        • dequeue a node
        • do something with the node
        • enqueue the left child if it exists
        • enqueue the right child if it exists
  • Code:
    void levelOrder(TreeNode *tree) {
        Queue<TreeNode *>treeQueue;
        treeQueue.enqueue(tree);
        while (!treeQueue.isEmpty()) {
            TreeNode *node = treeQueue.dequeue();
            cout << node->value << " ";
      
           if (node->left != nullptr) {
                treeQueue.enqueue(node->left);
            }
            if (node->right != nullptr) {
                treeQueue.enqueue(node->right);
            }
        }
    }