Introduction to Linked Lists

CS 106B: Programming Abstractions

Spring 2023, Stanford University Computer Science Department

Lecturer: Chris Gregg, Head CA: Neel Kishnani

A long line of people


Slide 2

Announcements

  • Assignment 5 is due Friday

Slide 3

Today's Topics

  • Structs and pointers
  • Can you architect a queue?
    • Let's do it with a Vector
    • Let's try it with links
  • Introduction to the Linked List data structure
    • The Node struct
  • Lord of the Linked Lists
  • How is the Stack implemented with a linked list?
  • Linked List Queue implementation

Slide 4

Using pointers with class and struct objects

  • Let's define a class (or it could be a struct) as follows:
    class Date {
    public:
        int day;
        int month;
        int year;
        string dayOfWeek; 
        int daysInMonth();
    };
    
  • Pointers can point to a class or struct just like they can to any other variable:
    Date* dPtr = new Date; // we now have a pointer to a Date object
    
  • If we want to access the class variables and functions, we could do this:
    (*dPtr).day = 7;
    int numDays = (*dPtr).daysInMonth();
    cout << (*dPtr).month << endl;
    
  • But, this notation is cumbersome, and the parentheses are necessary becasue the "dot" has a higher precedence than the *.
  • So, there is a different, more intuitive syntax, called the arrow syntax: ->
    dPtr->day = 7;
    int numDays = dPtr->daysInMonth();
    cout << dPtr->month << endl;
    
  • Arrow notation, x->var is equivalent to (*x).var, and we will use it exclusively when using classes and structs.

Slide 5

Can you architect a queue?

  • Let's investigate building a queue from a Vector
  • The following class definition for an integer queue would suffice for the enqueue and dequeue functions, with a Vector holding the data.
    class QueueInt {          // in QueueInt.h
    public:
        QueueInt ();          // constructor
      
        void enqueue(int value); // append a value 
        int dequeue();           // return the first-in value
      
    private:
        Vector<int> data;     // member variables
    };
    
  • Let's assume that we have the back of the queue on the left, and the front of the queue on the right. If we enqueue eight values, the vector would look like this:
    back          front
    ↓             ↓ 
    1 2 3 4 5 6 7 8
    
  • The dequeue operation is relatively straightforward: we just remove and return the element from the end, and the front is now at the previous index:
    back        front
    ↓           ↓ 
    1 2 3 4 5 6 7 _
    
  • But, enqueue is more involved. Because the back is on the left of the vector, we have to move all the elements over one, one at at time before placing our value into the vector. If we enqueue(9) on the vector, this is what happens:
    1 2 3 4 5 6 7 7
    1 2 3 4 5 6 6 7
    1 2 3 4 5 5 6 7
    1 2 3 4 4 5 6 7
    1 2 3 3 4 5 6 7
    1 2 2 3 4 5 6 7
    1 1 2 3 4 5 6 7
    9 1 2 3 4 5 6 7
  • If we look at the complexity of enqueue() and dequeue(), we have:
    • enqueue: O(n)
    • dequeue: O(1)

An image that says 'WE CAN DO BETTER'


Slide 6

And Now for Something Completely Different

  • Let's use pointers to make things a bit more interesting.
  • Let's say that we have an 8 that we want to put into a queue. We can make a variable that holds an 8, and let's say we do this with dynamic memory.

    An image of an 8 sitting on an otherwise blank canvas

  • Now, let's say we have a data pointer that points to an integer. In this case, it will point to the 8. We now have a queue! The front of the queue is the 8, as is the back of the queue. An image of an 'int* data' pointer that points to the 8
  • Now, what if we want to enqueue(7) into the queue? Well, we can create a 7 just like we created the 8. A 7 in a box on its own
  • What if we simply put a 7 somewhere in memory, and then changed data to point to the 7, and made the 7 (somewhow) point to the 8? (The somehow, by the way, is a pointer that is associated with the 7 called its next pointer, in a struct – we will get to that!).

    An image of an 'int* data' pointer that now points to a 7, and the 7 points to the 8

  • Now we have enqueued the 7 into the back of the queue! Our data points to the 7, which we have "pointed" to the 8.
  • Let's enqueue(6), and go through the process a bit slower. A 6 has now appeared on its own in the diagram
  • Now we have a 6 in memory, but not yet connected to anything. We must get data to point to the 6, and then have the 6 point to the 7. We actually first point the 6 to the 7. It turns out that if we didn't do that, It turns out that we have to do it in this order becuase if we changed data first, we would lose access to the 7, becuase data was the only thing pointing to the 7 (we'll cover this soon!)

    Now, the 6 points to the 7 (and so does data)

  • Remember, two pointers can point to the same value (in this case, both data and the 6 pointer).
  • Next, we can change data to point to the 6:

    Now, data points to the 6, which points to the 7, which points to the 8

  • Oh, look, we've enqueued the 6!
  • There are two bits of extra information that we need to make this list a proper queue
    • We need to figure out what 8 points to. In this case, we will point 8 to nullptr, indicating that it is the end of the list. If we check 8's pointer and find that it is nullptr, then we are at the end of the list (or, in this case, the front of the queue). In the diagram, we just represent this with a slash through 8's pointer region.
    • We need to designate the front as such, and the back as such: Now we have data represented as 'back' and 'front' as another pointer to the front of the queue. It turns out that this isn't the best way to build a queue with a linked list!
  • It turns out that this is the wrong way to build a queue with a linked list! We'll see at the end of the lecture a better way to do it by changing things just a bit.

Slide 7

Linked Lists

  • What we've just examined is the beginning of a linked list
  • A linked list is a chain of nodes
  • Each node contains two pieces of information:
    • Some piece of data that is stored in the sequence
    • A link to the next node in the list
  • We can traverse the list by starting at the first node and repeatedly following its link. The list with data pointing to 6 and 6 pointing to 7, with 7's pointer nul-terminated
  • Each element is stored separately from the rest.
  • The elements are then chained together into a sequence. A linked list with a pointer to the first element, 1, and with 1 pointing to 2, and 2 pointing to 3. 3 is nul-terminated
  • To add a node at the end, we chain the last element to the end, by first finding the end (traversing the list), and then adding it: A linked list with a pointer to the first element, 1, and with 1 pointing to 2, and 2 pointing to 3, and now 3 pointing to 4. 4 is now nul-terminated

Slide 8

Adding a node in the middle of a linked list

  • To insert a node into the middle of a linked list, we first need to find the location where we are adding it – this can be a slow process because we have to travers from the beginning!
  • Let's say we want to insert between 2 and 3. First, we need to find 2 by traversing from 1, and then we split the list and insert our new element: A linked list with a pointer to the first element, 1, and with 1 pointing to 2, and 2 pointing to a new element, 100. 100 now points to 3, and 3 points to 4. 4 is nul-terminated

Slide 9

Removing a node in the middle of a linked list

  • To remove a node into the middle of a linked list, we first need to find the location where we are adding it, as with insert.
  • Let's say we want to remove 3. First, we need to find 100 by traversing from 1, then 2, and then to 100. Then, we rewire the list so 100 points to 4: We have removed the 3 from the last list by finding the 100, and then linking it to the 4.

Slide 10

Why linked lists?

  • We can efficiently splice new elements into the list or remove existing elements anywhere in the list
  • We never have to do a massive copy step
  • Linked lists have many tradeoffs, and are not often the best data structure!

Slide 11

Linked lists in C++

  • Let's take a look at building a linked list of strings
  • In C++, we represent the node in the linked list as a struct, with two fields, a data field, and a next field:
    struct Node {
        string data;
        /* ? */ next;
    
  • But, what is the type of next? It must point to another Node, so…it is a Node* type:
    struct Node {
        string data;
        Node* next;
    };
    
  • The structure is defined recursively! The compiler can handle the fact that in the definition of the Node there is a Node*, becuase it knows it is simply a pointer. We could not recursively define the Node with an actual Node object inside the struct, as that would be impossible to realize.

Slide 12

Always!

Always draw pictures when you are building linked lists! This is critical to getting the hang of it.


Slide 13

Lord of the Linked Lists

  • In a scene that was brilliantly captured in Peter Jackson’s film adaptation of The Return of the King, Rohan is alerted to the danger to Gondor by a succession of signal fires moving from mountain top to mountain top. This scene is a perfect illustration of the idea of message passing in a linked list. An image of the mountain fires in the Lord of the Rings -- Minas Tirith started first, then Amon Din, then Eilenach, then Nardol, then Erelas, then Min-Rimmon, then Calenhad, then Halifiren, and finally the signal reached Rohan

Slide 14

Return of the King

An image of Aragorn from Lord of the Rings, Return of the King


Slide 15

Lord of the Linked Lists

  • We are going to make a San Francisco-based linked list, based on CalTrain stops. Step 1, make the linked list: An image of the CalTrain stops: San Francisco, Bayshore, Millbrae, Redwood City, Menlo Park, Palo Alto, Mountain View, Santa Clara, and San Jose
  • Step 2: Light the fires!
    struct Tower {
       string name;  /* The name of this tower    */
       Tower* next;  /* Pointer to the next tower */
    };
    
  • Add the first tower:
    // add the first tower
    Tower * head = new Tower; 
    head->name = "San Jose";
    head->next = nullptr;
    
  • The main function:
    // main
    Tower * head = new Tower; 
    head->name = "San Jose";
    head->next = nullptr;
      
    head = createTower("Santa Clara", head);
    head = createTower("Mountain View", head);
    head = createTower("Palo Alto", head);
    head = createTower("Menlo Park", head);
    head = createTower("Redwood City", head);
    head = createTower("Millbrae", head);
    head = createTower("Bayshore", head);
    head = createTower("San Francisco", head);
    
  • The createTower function:
    Tower* createTower(string name, Tower *next) {
       Tower* tp = new Tower;
       tp->name = name;
       tp->next = next;
       return tp;
    }
    
  • The signal function (which is recursive!):
    void signal(Tower* start) {
       if (start != nullptr) {
          cout << "Lighting " << start->name << endl;
          signal(start->next);
       }
    }
    
  • We call the function with the head:
    signal(head);
    
  • By the way: the head pointer is not a Tower! it is only a pointer to a Tower, and the first Tower is San Francisco. The qt logo

Slide 16

How is the Stack implemented with a linked list?

The qt logo

  • The Node definition we've seen before:
    struct Node {
        int data;
        Node *next;
    };
    

Slide 17

Stack

  • Let's assume we have the following stack already, with 8 at the top, and 9 below 8. We then want to push(7) onto the stack: A Node* head pointer pointing to an 8, which in turn points to a 9
  • Here is our goal: The image from above with 7 inserted before the 8, so that head points to 7 and 7 points to 8
  • Our first attempt at push(7): The image of head->8->9 with an additional Node* temp pointing to 7 on its own
    Node* temp = new Node;
    temp->data = 7;
    
  • If we try to rewire by changing head first…
    head = temp;
    

    We now have the situation where head points to the 7, and temp points to the 7, but nothing is pointing to the 8, which means that we've lost access to it.

  • In other words: a linked list's elements must be pointed to, because we need to keep track of them. In our attempt above, we've lost access to the 8, becuase the only thing pointing at the 8 was head. If we reassign head to point to another object (the 7 in this case), we've broken the chain and lost 8. This is a common bug!
  • Let's try again: A Node* head pointer pointing to an 8, which in turn points to a 9
  • Here is our goal: The image from above with 7 inserted before the 8, so that head points to 7 and 7 points to 8
  • Our next attempt at push(7): The image of head->8->9 with an additional Node* temp pointing to 7 on its own
    Node* temp = new Node;
    temp->data = 7;
    
  • Now we rewire 7's next pointer first:
    temp->next = head;
    

    We now have both 7's next and head pointing to the 8, so we won't lose contact with the 8 when we re-assign head to point to 7

  • You might be thinking, wait – why didn't the arrow go from the 7 to the head? – this is a common misconception of what is happening!
    • Remember, head is not a Node. head is a pointer to a Node. Notice that head does not have a next – it's not an object, just a pointer.
    • The statement temp->next = head; says, "give 7's next pointer the same data as head", which is what we want to do.
  • Now, we are able to reassign head to point to the 7, and we will have a correct linked list with 7 pushed onto the top: We have now reassigned head to point to the 7, and we have a correct linked list with head pointing to 7, 7's next pointing to the 8, and 8's next pointing to 9
  • Our temp pointer actually disappears when the push operation is complete (it goes out of scope), so we are left with the following: The temp pointer has gone away. We have now reassigned head to point to the 7, and we have a correct linked list with head pointing to 7, 7's next pointing to the 8, and 8's next pointing to 9

Slide 18

Stack pop()

  • To pop a data from our stack, we start like this:
    int toReturn = head->data;
    

    The temp pointer has gone away. We have now reassigned head to point to the 7, and we have a correct linked list with head pointing to 7, 7's next pointing to the 8, and 8's next pointing to 9

  • What if we tried this to reassign head?
    head = head->next;
    

    We have lost contact with the 7, because nothing points there. Our linked list is actually fine, but we have a memory leak with the 7, which we cannot delete

  • This is a bug! Our linked list is fine, but we have a memory leak! We have left the 7 in memory and not returned it to the operating system with delete.
  • Instead, we need to do this:
    Node* temp = head;
    

    We create a temp pointer to point to the 7 so we can reassign the head without losing the 7.

  • Now we can reassign head:
    head = head->next;
    

    Now head points to 8.

  • Because we still have access to the 7, we can return the memory to the operating system:
    delete temp;
    

    We have deleted 7. Head points to 8.

  • Finally, we return the data, and temp goes out of scope:
    return toReturn;
    

    We returned data, so we now have a linked list without the 7: head points to 8 and 8's next points to 9


Slide 19

Stack Code

  • Header, intStack.h:
#pragma once

class IntStack {
public:
    IntStack(); // constructor
    ~IntStack();

    bool isEmpty();
    
    void push(int value);
    int top();
    int pop();

private:
    struct Node {
      int data;
      Node* next;
    };
    
    Node* head;
};
  • Class code, intStack.cpp
#include "intStack.h"

IntStack::IntStack()
{
    head = nullptr;
}

IntStack::~IntStack()
{
    while (head != nullptr) {
        Node *temp = head;
        head = head->next;
        delete temp;
    }
}

void IntStack::push(int value)
{
    Node* node = new Node;

    node->data = value;
    node->next = head;

    head = node;
}

int IntStack::pop()
{
    if (isEmpty()) {
        throw "Error! Trying to pop from empty stack!";
    }
   
    int toReturn = head->data;
 
    Node* temp = head;
    head = head->next;

    delete temp;
    return toReturn;
}

Slide 20

A Better Linked List Queue

  • At the beginning of this topic, we discussed how to build a queue with a linked list. We had the back as the first element in the list, and the front as the second element. This led to O(1) behavior for enqueue, and O(n) behavior for dequeue. We can do better!
  • If we hold a pointer to the front and to the back, and we make the front the first element in the list, and the back the last element in the list, we can successfully make a queue with O(1) behavior for both enqueue and dequeue.
  • Here is the code, then we'll walk through some examples:
  • intQueue.h
#pragma once

class IntQueue {
public:
    IntQueue(); // constructor
    ~IntQueue();

    bool isEmpty();

    void enqueue(int value);
    int front();
    int dequeue();

private:
    struct Node {
        int data;
        Node* next;
    };

    Node* _front;
    Node* _back;
};
  • intQueue.cpp:
#include "intQueue.h"

IntQueue::IntQueue()
{
    _front = nullptr;
    _back = nullptr;
}

IntQueue::~IntQueue()
{
    while (_front != nullptr) {
        Node *temp = _front;
        _front = _front->next;
        delete temp;
    }
}

void IntQueue::enqueue(int value)
{
    Node* node = new Node;

    node->data = value;
    node->next = nullptr;

    if (_back == nullptr) { // enqueue on empty queue
        _front = node;
        _back = node;
    } else {
        _back->next = node;
        _back = node;
    }
}

int IntQueue::dequeue()
{
    if (isEmpty()) {
        throw "Error! Trying to dequeue from empty queue!";
    }

    int toReturn = _front->data;

    Node* temp = _front;
    _front = _front->next;

    if (_front == nullptr) {
        _back = nullptr; // empty queue
    }

    delete temp;
    return toReturn;
}

bool IntQueue::isEmpty()
{
    return _front == nullptr;
}

int IntQueue::front()
{
    return _front->data;
}