Trees

CS 106B: Programming Abstractions

Spring 2021, Stanford University Computer Science Department

Lecturer: Chris Gregg, Head CA: Chase Davis

A real binary tree


Slide 2

Announcements

  • Assignment 5 is due tomorrow (late extension until Sunday), and Assignment 6 is due next Wednesday
    • We will post Assignment 6 today. It is a scaled-back assignment, so should be doable by Wednesday
  • You should start working on your personal project
    • The personal project is your final assessment!

Slide 3

Today's Topics

  • Refresh on what pointer diagrams represent
  • Quick comment on doubly-linked lists
  • Introduction to the tree data structure
    • You've already seen lots of tree structures
    • Tree terminology
    • How do we build trees programatically?
    • Writing code to traverse a tree

Slide 4

Pointer Diagram Refresh

  • The following diagram was put together by a student last Spring:
    • Pointers have values that are memory addresses.
    • All variables and objects have their own memory address where they are located
    • The next pointer is a memory address A diagram that shows that pointers' values are memory addresses, and that even the pointers have addresses themselves.

Slide 5

Doubly-linked lists

  • Last time, we discussed linked lists. Specifically, we discussed singly linked lists, which have a single next pointer in each node, and are strictly one-way.
  • We also can create a doubly linked list, which has pointers in both directions:
    struct ListNode {
        int data;
        ListNode* prev;
        ListNode* next;
    };
    
  • We will keep a _front and a _back pointer in the class, too:
    class DoublyLinkedList {
         ...
    private:
         ListNode* _front;
         ListNode* _back;
    };
    
  • We can draw the nodes sideways for a bit more intuitive pictures: ``` A diagram showing a doubly linked list. Each node has a prev pointer on the left, and a next pointer on the right. The front pointer points to the first node, which has nullptr as its prev, and next points to the second node. The second node has its prev pointer pointing back to the first node, and its next pointer pointing to the third node. The third node has its prev pointer pointing to the second node, and its next pointer pointing to nullptr

  • We can traverse in either direction with a doubly linked list:
    ListNode* curr = _front; // print in forward order 
    while (curr != nullptr) {
        cout << curr->data;
        curr = curr->next; 
    }
    
    ListNode* curr = _back; 
    while (curr != nullptr) {
        // print in reverse order
        cout << curr->data;
        curr = curr->prev;
    }
    

Slide 6

Adding nodes to a doubly linked list

  • If we want to add a new node to a doubly-linked list, we don't have to stop on the previous node, and we could find the node in either direction.
  • However, we have to change four pointers, which can make coding up a doubly lnked list tricky. A diagram showing a doubly linked list with three nodes and the `curr` pointer pointing to the middle node. Each node has a prev pointer on the left, and a next pointer on the right. The front pointer points to the first node, which has nullptr as its prev, and next points to the second node. The second node has its prev pointer pointing back to the first node, and its next pointer pointing to the third node. The third node has its prev pointer pointing to the second node, and its next pointer pointing to nullptr
  • Let's say we want to add the following newNode before the 17, and we have curr on the 17: A node with 3 as the data, and nothing pointed to by prev or next (yet)
  • We would have to change four different pointers. Here's how we would make those changes (and order is important!):
    curr->prev->next = newNode;
    newNode->prev = curr->prev;
    curr->prev = newNode;
    newNode->next = curr;
    

    Removing nodes from a doubly linked list

  • If we want to remove a new node to a doubly-linked list, we don't have to stop on the previous node, and we could find the node in either direction.
  • In this case, we only have to make changes to two pointers, the previous node's next, and the next node's prev.
  • Let's say we want to remove the 17 from the list below, and we've stopped curr on that node: A diagram showing a doubly linked list with three nodes and the `curr` pointer pointing to the middle node. Each node has a prev pointer on the left, and a next pointer on the right. The front pointer points to the first node, which has nullptr as its prev, and next points to the second node. The second node has its prev pointer pointing back to the first node, and its next pointer pointing to the third node. The third node has its prev pointer pointing to the second node, and its next pointer pointing to nullptr
  • Here are the changes we have to make:
    curr->prev->next = curr->next;
    curr->next->prev = curr->prev; 
    delete curr;
    

Slide 7

Trees

  • We have seen trees in this class before, in the form of decision trees: The decision tree for determining if `cart` is reducible. `cart` branches to `art`, `crt`, `cat`, and `car`, `art` branches to `rt`, `at`, and `ar`, etc.
  • Trees can describe hirearchies: A tree with `world` at the root, `China`, `USA` and `Honduras` as children. China has two children, `Hunan` and `Shanghai`. USA has two children, `California` and `Ohio`. Honduras has one child, `Tegucigalpa`. Hunan has one child, `Changsha`. California has two children, `SF` and `LA`. Ohio has one child, `Kent`
  • Trees can describe websites: A tree that describes a website, with `div` tags, `img` tags, `p` tags, `li` tags, `a` tags, etc. Each tag can have tags that are its children
  • Trees can desceribe programs (we often call them abstract syntax trees (the following is a figure in an academic paper written by a recent CS 106 student!)
    // Example student solution
    function run() {
      // move then loop
      move();
      // the condition is fixed
      while (notFinished()) {
        if (isPathClear()) {
          move();
        } else {
          turnLeft();
        }
        // redundant
        move();
      }
    }
    

    A tree that describes a program (the code above). The tree has `run` as its root, with `move` and `while` children. `while` has `if/else` and `move` as its children. `if/else` has `isPathClear`, `move`, and `turnLeft` as its children


Slide 8

Trees are inherently recursive

  • What is a tree in computer science?
    • A tree is a collection of nodes, which can be empty. If it is not empty, there is a root node, r, and zero or more non-empty subtrees, T1, T2, โ€ฆ, Tk, whose roots are connected by a directed edge from r. A tree of letters. See the text tree below
                       (root)
              -----------A--------------
              โ†™๏ธŽ   โ†™๏ธŽ  โ†™๏ธŽ   โ†“    โ†˜๏ธŽ       โ†˜๏ธŽ
             B   C   D   E     F--      G
                    โ†™๏ธŽ   โ†™๏ธŽ โ†˜๏ธŽ   โ†™๏ธŽ โ†˜๏ธŽ โ†˜๏ธŽ      โ†˜๏ธŽ
                   H   I   J K   L M       N
                          โ†™๏ธŽ โ†˜๏ธŽ
                         O   P
  • There are N nodes and N-1 edges in a tree.
  • Tree terminology. In the diagram above:
    • B is a child of A, and F is a child of A and a parent of K, L, M.
    • Nodes with no children are leaves. Nodes B, C, H, I, K, L, M, N, O, and P are all leaves.
    • Nodes with the same parent are siblings. I and J are siblings. K, L, and M are siblings. Nodes O and P are siblings. Nodes H and N do not have any siblings (only children).

Slide 9

More Tree Terminology

Most of the same tree as above, but smaller (see below)

                       (root)
                         A--------------
                         โ†“    โ†˜๏ธŽ        โ†˜๏ธŽ
                         E     F         G
                        โ†™๏ธŽ โ†˜๏ธŽ   โ†™๏ธŽ โ†˜๏ธŽ โ†˜๏ธŽ       โ†˜๏ธŽ
                       I   J K   L M       N
                          โ†™๏ธŽ โ†˜๏ธŽ
                         O   P
  • We can define a path from a parent to its children. The path from A to O is: A-E-J-O.
  • The path A-E-J-O has a length of three (the number of edges).
  • The depth of a node is the length from the root. The depth of node J is 2. The depth of the root is 0.
  • The height of a node is the longest path from the node to a leaf. The height of node F is 1. The height of all leaves is 0.
  • The height of a tree is the height of the root. The height of the above tree is 3.

Slide 10

One Parent, No Cycles

An image of a real parent holding a child's hand, and an image of a bicycle with a red 'no' sign through it

  • Tree nodes can only have one parent, and they cannot have cycles Two images -- one with a non-tree that has two nodes that share a child (nodes S and N both share a child, A). The second image is also a non-tree. Node S has two children, T and A, and node N has a single child, A. So, the A is (again) shared, and this is not a tree.
  • Not a tree (A has two parents):
    S   N
     โ†˜๏ธŽ โ†™๏ธŽ
      A
    
  • Again: not a tree (A has two parents):
      S    N
     โ†™๏ธŽ  โ†˜๏ธŽ โ†™๏ธŽ
    T    A
        โ†™๏ธŽโ†˜๏ธŽ 
       F   O
    
  • The following are not trees, either, because you can cycle through the tree. Nodes cannot have children that are on the same level or above on the tree: Two images -- both have nodes that cycle back to other nodes. The first has S with children T and A, and T has a child, N. N's child is S (the root), which means that it is not a tree. In the second diagram, S has two children, T and A, and T has two children, N, and A. Because one of T's children is its own sibling, this creates a cycle.
  • Not a tree (cycle from N back to S):
    โ†’โ†’โ†’โ†’S
    โ†‘  โ†™๏ธŽ โ†˜๏ธŽ 
    โ†‘ T   A
    โ†‘โ†™๏ธŽ
    N
    
  • Again: not a tree (the arrow from T to A causes a cycle):
        S
       โ†™๏ธŽ โ†˜๏ธŽ 
      Tโ†’โ†’โ†’A
    โ†™๏ธŽ 
    N
    

Slide 11

Building trees programatically

  • To build a tree, we need a new version of our Node. In this case, we want each Node to have a data value (like a linked list), but now we want to pointers, one to the left child, and one to the right child:
    struct TreeNode {
        string data;
        TreeNode* left;
        TreeNode* right; 
    };
    

    A binary tree node image. There is a data element, and two pointers going towards the left and towards the right

  • We don't have to limit ourselves to binary trees, by the way: we could have a ternary tree:
    struct TernaryTreeNode {
        string data;
        TernaryTreeNode* left;
        TernaryTreeNode* middle;
        TernaryTreeNode* right;
    }; 
    

    A ternary tree node image. There is a data element, and three pointers going towards the left, towards the middle, and towards the right

  • In fact, we could have any number of children (we may look at the Trie data structure, which can be built this way). This is an N-ary Tree:
    struct NAryTreeNode {
        string data;
        Vector<NAryTreeNode*> children;
    }; 
    

    An 'n-ary' tree node image. There is a data elemetn, and then a vector which holds pointers to a non-distinct number of children.


Slide 12

Traversing a Tree

  • Often, we will want to do something with each node in a tree. Like linked lists, we can traverse the tree, but it is more involved because of all the branching.
  • There are four different ways to traverse a binary tree:
    1. Pre-order
    2. In-order
    3. Post-order
    4. Level-order
  • Let's write some code for each traversal to see the differences. We'll look at two trees: A sentence tree, with the following words: `this` with children `is` and `written`. `is` has two children, `a` and `correctly`. `written` has a right child only, `sentence` (see the text below)
                this
             โ†™๏ธŽ        โ†˜๏ธŽ
           is         written
          โ†™๏ธŽ  โ†˜๏ธŽ             โ†˜๏ธŽ
         a    correctly  sentence
    
  • We'll also look at: A tree with 4 at the root and two children, 3 and 5. The three has two children, 1 and 2. The five has a right child, 6
               4
             โ†™๏ธŽ  โ†˜๏ธŽ
            2     5 
          โ†™๏ธŽ  โ†˜๏ธŽ     โ†˜๏ธŽ
         1    3      6 
    
  • By the way: trees do not have to be complete, like heaps. Any node can have 0, 1, or 2 children.

Slide 13

Pre-order traversal

  • The algorithm for a pre-order traversal is as follows:
    1. Do something
    2. Go left
    3. Go right
  • This is an inherently recursive algorithm
  • For our tests, let's make the "Do something" part print the data at a particular node.
  • Code:
    void preOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        cout<< tree->data <<" ";
        preOrder(tree->left);
        preOrder(tree->right);
    }
    

Slide 14

In-order traversal

  • The algorithm for a in-order traversal is as follows:
    1. Go left
    2. Do something
    3. Go right
  • This is also an inherently recursive algorithm
  • Code:
    void inOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        inOrder(tree->left);
        cout<< tree->data <<" ";
        inOrder(tree->right);
    }
    

Slide 15

Post-order traversal

  • The algorithm for a post-order traversal is as follows:
    1. Go left
    2. Go right
    3. Do something
  • This is also an inherently recursive algorithm
  • Code:
    void postOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        postOrder(tree->left);
        postOrder(tree->right);
        cout<< tree->data <<" ";
    }
    

Slide 16

Level-order traversal

  • A level order traversal is different than all the others โ€“ we want to go one level at a time and print out each node, then go to the next level.
  • For this tree, it would be:
    • this is written a correctly sentence
  • Can we do this recursively!
  • But, we can use an algorithm you've seen before: breadth first search!
    • This is going to use a queue, just like the BFS you did for your maze solving assignment.
      1. Enqueue the root node
      2. While the queue is not empty
        • dequeue a node
        • do something with the node
        • enqueue the left child if it exists
        • enqueue the right child if it exists
  • Code:
    void levelOrder(TreeNode *tree) {
        Queue<TreeNode *>treeQueue;
        treeQueue.enqueue(tree);
        while (!treeQueue.isEmpty()) {
            TreeNode *node = treeQueue.dequeue();
            cout << node->value << " ";
      
           if (node->left != nullptr) {
                treeQueue.enqueue(node->left);
            }
            if (node->right != nullptr) {
                treeQueue.enqueue(node->right);
            }
        }
    }