Lecture 6: The Set and Map Classes

CS 106B: Programming Abstractions

Autumn 2020, Stanford University Computer Science Department

Lecturers: Chris Gregg and Julie Zelenski

An image of a set, with a box around many circles, each with a letter inside, and an image of a map, with many circles, each with an individual arrow to another circle.


Slide 2

Announcements

  • Assignment 1 is due tonight, end-of-day PDT. Assignment 2 coming soon!
  • The recorded section video for this week has been posted on Canvas for anyone that wants extra review on the last week and a half of lecture content.

Slide 3

Sets and Maps

  • Today we are going to discuss two new collections: sets and maps.
  • A set is a collection of elements with no duplicates: An image of a set, with a circle surrounding many words, none of which are duplicates
  • A map is a collection of key / value pairs, and the key is useed to find the value. In python, we call this collection a dict. An image of a map, with two circles, the first holding the keys 'Chris', 'Jenny', 'Julie', and 'Nick', with arrows going to a second circle with phone numbers for each name.

Slide 4

Sets

  • A set is a collection of elements with no duplicates. Sets have (at least) the following operations (and they are fast):
    • add(value): adds a value to a set, and ignores if the set already contains the value
    • contains(value): returns true if the set contains the value, false otherwise.
    • remove(value): removes the value from the set. Does nothing if the value is not in the set.
    • size(): returns the number of elements in the set
    • isEmpty(): returns true if the set is empty, false otherwise.
    • Sets have other useful functions, and you should check the Set documentation to see them.
  • Sets do not have indexes!

A set with many words, including "the", "if", "to", "by", but not "be". The image has set.contains("to"), which returns "true", and set.contains("be"), which returns false


Slide 5

Sets: simple example

Set<string> friends;
friends.add("nick");
friends.add("chris");
friends.add("julie");
cout << boolalpha << friends.contains("voldemort") << 
     << noboolalpha << endl;
for(string person : friends) {
    cout << person << endl;
}

Output:

false
chris
julie
nick
  • Notice – the output from the for loop is alphabetical! Sets keep their values sorted.

Slide 6

Looping over a Set

for(type currElem : set) {
    // process elements one at a time
}
  • You can't use an index-based for loop, becuase Sets do not have indexes!
    for(int i=0; i < set.size(); i++) {
      // does not work, no index!
      cout << set[i];
    }
    

    Homer Simpson setting a bowl of cornflakes on fire


Slide 7

Types of Sets

  • There are actually two types of sets in the Stanford library:
    • Set
      • Iteration over the elements is in sorted order
      • Really fast access times! O(log n) per retrieval – we will learn about this next week!
      • Sets are implemented using a "binary search tree"
    • HashSet
      • Iteration over the elements is not in a useful order (it is jumbled)
      • Really, ridiculously fast!
      • O(1) per retrieval (again, we will learn what this means next time!)

Slide 8

Set Operands

  • Sets can be compared, combined, etc.
    • s1 == s2
      true if the sets contain exactly the same elements
    • s1 != s2
      true if the sets don't contain the exact same elements
    • s1 + s2
      returns the union of s1 and s2 (i.e., all elements in both)
    • s1 * s2
      returns the intersection of s1 and s2 (i.e., only the elements in both sets)
    • s1 - s2
      returns the difference of s1 and s2 (the elements in s1 but not in s2)

Slide 9

Counting Unique Words

  • Let's write a program to count the unique words in a file The Qt Creator logo

Slide 10

Maps

  • A map is a collection of pairs (k, v), sometimes called key/value pairs, where v can be found quickly if you know k.
  • Other terms you may hear for a map are dictionary or associative array.
  • A map is a generalization of an array, where the "indexes" don't need to be integers: Key Value pairs, with keys on the left ("Chris", "Jenny", "Julie", and "Nick") and their corresponding values on the right ("866-2233", "867-5309", "685-6232", and "488-0312" respectively

Slide 11

Using Maps

  • A map allows you to get from one half of a pair to the other.
    • E.g., to store an association from "Jenny" to "867-5309":
      //  key         value
      m["Jenny"] = "867-5309"; // or:
      m.put("Jenny", "867-5309");
      
    • To get Jenny's number:
      string ph = m["Jenny"] // or
      string ph = m.get("Jenny")
      cout << ph << endl;
      

      Output:

      867-5309
      

Slide 12

Maps are Everywhere

  • Wikipedia: the key is the title, the value is the article:

Two wikipedia articles, "Yosemite National Park" and "Mariana Trench"


Slide 13

Creating Maps

  • A Stanford Map requires two parameters: one for keys, one for values:
    // maps from string keys to integer values
    Map<string, int> votes;
    
    // maps from string keys to Vector<string> values
    Map<string, Vector<string>> friendMap; 
    

Slide 14

Map Functions

  • The following functions are part of the Map class:
    • m.clear() : removes all key/value pairs from the map
    • m.containsKey(key) : returns true if the map contains a value for the given key
    • m[key]
      m.get(key) : returns the value mapped to the given key; m[key] only: if key is not in the map, adds it with the default value (e.g., 0 or ""); m.get(key) only: if key is not in the map, returns the default value for the value type, but does not add it to the map.
    • m.isEmpty() : returns true if the map contains no key/value pairs (size 0)
    • m.keys() : returns a Vector copy of all keys in the map
    • m[key] = value
      m.put(key, value) : adds a mapping from the given key to the given value; if the key already exists, replaces its value with the given one
    • m.remove(key) : removes any existing mapping for the given key (ignored if the key doesn't exist in the map)
    • m.size() : returns the number of key/value pairs in the map
    • m.toString() : returns a string such as "{a:90, d:60, c:70}"
    • m.values() : returns a Vector copy of all the values in the map

Slide 15

Map Example

Map<string, string> wiki;

// adds name / text pair to dataset
wiki.put("Neopalpa donaldtrumpi", articleHTML);

// returns corresponding articleHTML
cout << wiki.get("Yosemite National Park");

// removes the article
wiki.remove("Britain in the E.U.");

A Wikipedia article showing a moth named after Donald Trump, called "Neopalpa Donaldtrumpi", because the moth has a shock of yellow hair

A Wikipedia article about Yosemite National Park

A Wikipedia article about the removal of Great Britain from the U.K.


Slide 16

Types of Maps

  • There are actually two types of maps in the Stanford library:
    • Map
      • Iteration over the elements is in sorted order
      • Really fast access times! O(log n) per retrieval – we will learn about this next week!
      • Maps are implemented using a "binary search tree"
    • HashMap
      • Iteration over the elements is not in a useful order (it is jumbled)
      • Really, ridiculously fast!
      • O(1) per retrieval (again, we will learn what this means next time!)

Slide 17

Map Example: Tallying Votes

// tally votes:
// (M)ilk, (S)tokes, (R)ogers
string allVotes = "MMMRMSSMSSMMMMMRRMMMMRRRMMM";

Map<char, int> voteTally;
for (char v : allVotes) {
    voteTally[v]++;
}

// loop over the map
for (char initial : voteTally) {
    int numVotes = voteTally[initial];
    cout << initial << ": " << numVotes << " votes" << endl;
}
  • Why does voteTally[v]++; work? It turns out that when you access a Map's value, you get back a reference! Therefore, updating it here allows it to update inside the map. Cool!
  • It is equivalent to the following:
    int& currentTotal = voteTally[v];
    currentTotal++; // update inside the map
    
  • Notice how we looped over the map – we only get the keys, and have to manually ask for each value from the key.

Slide 18

Tallying Words

  • Let's write a program to tally how many of each word is in a file

The Qt Creator logo