Trees

CS 106B: Programming Abstractions

Fall 2025, Stanford University Computer Science Department

Lecturer: Chris Gregg, Head CA: Yasmine Alonso

A real binary tree

Announcements

  • Assignment 6 due Friday

Today’s Topics

  • Refresh on what pointer diagrams represent
  • Quick comment on references to pointers
  • Introduction to the tree data structure
    • You’ve already seen lots of tree structures
    • Tree terminology
    • How do we build trees programatically?
    • Writing code to traverse a tree
  • Code

Pointer Diagram Refresh

  • The following diagram was put together by a former CS106B student:
    • Pointers have values that are memory addresses.
    • All variables and objects have their own memory address where they are located
    • The next pointer is a memory address A diagram that shows that pointers' values are memory addresses, and that even the pointers have addresses themselves.

Trees

  • We have seen trees in this class before, in the form of decision trees: The decision tree for determining if `cart` is reducible. `cart` branches to `art`, `crt`, `cat`, and `car`, `art` branches to `rt`, `at`, and `ar`, etc.

  • Trees can describe hirearchies: A tree with `world` at the root, `China`, `USA` and `Honduras` as children. China has two children, `Hunan` and `Shanghai`. USA has two children, `California` and `Ohio`. Honduras has one child, `Tegucigalpa`. Hunan has one child, `Changsha`. California has two children, `SF` and `LA`. Ohio has one child, `Kent`

  • Trees can describe websites: A tree that describes a website, with `div` tags, `img` tags, `p` tags, `li` tags, `a` tags, etc. Each tag can have tags that are its children

  • Trees can desceribe programs (we often call them abstract syntax trees (the following is a figure in an academic paper written by a recent CS 106 student!)

    // Example student solution
    function run() {
      // move then loop
      move();
      // the condition is fixed
      while (notFinished()) {
        if (isPathClear()) {
          move();
        } else {
          turnLeft();
        }
        // redundant
        move();
      }
    }
    
    A tree that describes a program (the code above). The tree has `run` as its root, with `move` and `while` children. `while` has `if/else` and `move` as its children. `if/else` has `isPathClear`, `move`, and `turnLeft` as its children

Trees are inherently recursive

  • What is a tree in computer science?
    • A tree is a collection of nodes, which can be empty. If it is not empty, there is a root node, r, and zero or more non-empty subtrees, T1, T2, …, Tk, whose roots are connected by a directed edge from r. A tree of letters. See the text tree below
                       (root)
              -----------A--------------
              ↙︎   ↙︎  ↙︎   ↓    ↘︎       ↘︎
             B   C   D   E     F--      G
                    ↙︎   ↙︎ ↘︎   ↙︎ ↘︎ ↘︎      ↘︎
                   H   I   J K   L M       N
                          ↙︎ ↘︎
                         O   P
  • There are N nodes and N-1 edges in a tree.
  • Tree terminology. In the diagram above:
    • B is a child of A, and F is a child of A and a parent of K, L, M.
    • Nodes with no children are leaves. Nodes B, C, H, I, K, L, M, N, O, and P are all leaves.
    • Nodes with the same parent are siblings. I and J are siblings. K, L, and M are siblings. Nodes O and P are siblings. Nodes H and N do not have any siblings (only children).

More Tree Terminology

Most of the same tree as above, but smaller (see below)
                       (root)
                         A--------------
                         ↓    ↘︎        ↘︎
                         E     F         G
                        ↙︎ ↘︎   ↙︎ ↘︎ ↘︎       ↘︎
                       I   J K   L M       N
                          ↙︎ ↘︎
                         O   P
  • We can define a path from a parent to its children. The path from A to O is: A-E-J-O.
  • The path A-E-J-O has a length of three (the number of edges).
  • The depth of a node is the length from the root. The depth of node J is 2. The depth of the root is 0.
  • The height of a node is the longest path from the node to a leaf. The height of node F is 1. The height of all leaves is 0.
  • The height of a tree is the height of the root. The height of the above tree is 3.

One Parent, No Cycles

An image of a real parent holding a child's hand, and an image of a bicycle with a red 'no' sign through it
  • Tree nodes can only have one parent, and they cannot have cycles Two images -- one with a non-tree that has two nodes that share a child (nodes S and N both share a child, A). The second image is also a non-tree. Node S has two children, T and A, and node N has a single child, A. So, the A is (again) shared, and this is not a tree.

  • Not a tree (A has two parents):

    S   N
     ↘︎ ↙︎
      A
    
  • Again: not a tree (A has two parents):

      S    N
     ↙︎  ↘︎ ↙︎
    T    A
        ↙︎↘︎ 
       F   O
    
  • The following are not trees, either, because you can cycle through the tree. Nodes cannot have children that are on the same level or above on the tree: Two images -- both have nodes that cycle back to other nodes. The first has S with children T and A, and T has a child, N. N's child is S (the root), which means that it is not a tree. In the second diagram, S has two children, T and A, and T has two children, N, and A. Because one of T's children is its own sibling, this creates a cycle.

  • Not a tree (cycle from N back to S):

    →→→→S
    ↑  ↙︎ ↘︎ 
    ↑ T   A
    ↑↙︎
    N
    
  • Again: not a tree (the arrow from T to A causes a cycle):

        S
       ↙︎ ↘︎ 
      T→→→A
    ↙︎ 
    N
    

Building trees programatically

  • To build a tree, we need a new version of our Node. In this case, we want each Node to have a data value (like a linked list), but now we want to pointers, one to the left child, and one to the right child:
    struct TreeNode {
        string data;
        TreeNode* left;
        TreeNode* right; 
    };
    
    A binary tree node image. There is a data element, and two pointers going towards the left and towards the right
  • We don’t have to limit ourselves to binary trees, by the way: we could have a ternary tree:
    struct TernaryTreeNode {
        string data;
        TernaryTreeNode* left;
        TernaryTreeNode* middle;
        TernaryTreeNode* right;
    }; 
    
    A ternary tree node image. There is a data element, and three pointers going towards the left, towards the middle, and towards the right
  • In fact, we could have any number of children (we may look at the Trie data structure, which can be built this way). This is an N-ary Tree:
    struct NAryTreeNode {
        string data;
        Vector<NAryTreeNode*> children;
    }; 
    
    An 'n-ary' tree node image. There is a data elemetn, and then a vector which holds pointers to a non-distinct number of children.

Traversing a Tree

  • Often, we will want to do something with each node in a tree. Like linked lists, we can traverse the tree, but it is more involved because of all the branching.
  • There are four different ways to traverse a binary tree:
    1. Pre-order
    2. In-order
    3. Post-order
    4. Level-order
  • Let’s write some code for each traversal to see the differences. We’ll look at two trees: A sentence tree, with the following words: `this` with children `is` and `written`. `is` has two children, `a` and `correctly`. `written` has a right child only, `sentence` (see the text below)
                this
             ↙︎        ↘︎
           is         written
          ↙︎  ↘︎             ↘︎
         a    correctly  sentence
    
  • We’ll also look at: A tree with 4 at the root and two children, 3 and 5. The three has two children, 1 and 2. The five has a right child, 6
               4
             ↙︎  ↘︎
            2     5 
          ↙︎  ↘︎     ↘︎
         1    3      6 
    
  • By the way: trees do not have to be complete, like heaps. Any node can have 0, 1, or 2 children.

Pre-order traversal

  • The algorithm for a pre-order traversal is as follows:
    1. Do something
    2. Go left
    3. Go right
  • This is an inherently recursive algorithm
  • For our tests, let’s make the “Do something” part print the data at a particular node.
  • Code:
    void preOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        cout<< tree->data <<" ";
        preOrder(tree->left);
        preOrder(tree->right);
    }
    

In-order traversal

  • The algorithm for a in-order traversal is as follows:
    1. Go left
    2. Do something
    3. Go right
  • This is also an inherently recursive algorithm
  • Code:
    void inOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        inOrder(tree->left);
        cout<< tree->data <<" ";
        inOrder(tree->right);
    }
    

Post-order traversal

  • The algorithm for a post-order traversal is as follows:
    1. Go left
    2. Go right
    3. Do something
  • This is also an inherently recursive algorithm
  • Code:
    void postOrder(TreeNode* tree) {
        if(tree == nullptr) return;
        postOrder(tree->left);
        postOrder(tree->right);
        cout<< tree->data <<" ";
    }
    

Level-order traversal

  • A level order traversal is different than all the others – we want to go one level at a time and print out each node, then go to the next level.
  • For this tree, it would be:
    • this is written a correctly sentence
  • Can we do this recursively!
  • But, we can use an algorithm you’ve seen before: breadth first search!
    • This is going to use a queue, just like the BFS you did for your maze solving assignment.
      1. Enqueue the root node
      2. While the queue is not empty
        • dequeue a node
        • do something with the node
        • enqueue the left child if it exists
        • enqueue the right child if it exists
  • Code:
    void levelOrder(TreeNode *tree) {
        Queue<TreeNode*> queue;
        queue.enqueue(tree);
        while (!queue.isEmpty()) {
            TreeNode* element = queue.dequeue();
            if (element != nullptr) {
                cout << element->data << " ";
                queue.enqueue(element->left);
                queue.enqueue(element->right);
            }
        }
    }
    

The Trie data structure

  • A trie (pronounced “try”) is a tree-based data structure used for efficiently storing and retrieving strings, where each node represents a single character and paths from the root to nodes spell out strings.
  • Each node contains a collection of child pointers (typically mapped by character) and often a flag indicating whether that node marks the end of a valid word.
  • Tries excel at prefix-based operations like autocomplete and spell-checking because all strings with a common prefix share the same path from the root.
  • The time complexity for search, insert, and delete operations is $O(m)$, where $m$ is the length of the string, making tries very efficient for dictionary-like applications.

The following is a trie after adding the following strings: “apple”, “app”, “bat”, “ball”, “cats”: