Today: new data type dict and the dict-count algorithm

Dict - Hash Table - Fast

For more details see the chapter in the Guide: Dict

Dict Story Arc

Dict Restaurant Order Story

Suppose you are out ordering dinner at a restaurant, and the order is proceeding in a chaotic way like this:

Alice: I'd like to start with a cup of gazpacho
Bob:   I like beignets for dessert
Alice: Then a Caesar salad
Zoe:   I'll have lasagna
Bob:   Actually two orders of beignets
Alice: Then I'll have tacos
Bob:   And a hot dog
...

People mention the parts of their order piece by piece in no organized order - fine. However, what is needed for the kitchen is to organize each order by person:

Alice: gazpacho, Caesar, tacos
Bob: hot dog, two orders of beignets
...

This is what the dictionary does: data arrives in no particular order, and the dict organizes it by a chosen part of the data, here the name.
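
The reorganizing in the restaurant story can be sketched in code. This is only an illustration: the function name group_orders() and the (name, item) tuple format are assumptions made up for this example, and storing a list as each value goes a little beyond the basics covered below.

```python
def group_orders(orders):
    """Hypothetical helper: group (name, item) pairs by name."""
    by_person = {}
    for name, item in orders:
        # First time this name appears: start it with an empty list
        if name not in by_person:
            by_person[name] = []
        by_person[name].append(item)
    return by_person


# Orders arrive piece by piece, in no organized order
orders = [
    ('Alice', 'gazpacho'),
    ('Bob', 'beignets'),
    ('Alice', 'caesar salad'),
    ('Zoe', 'lasagna'),
    ('Alice', 'tacos'),
    ('Bob', 'hot dog'),
]
print(group_orders(orders))
```

The name plays the role of the key, and everything that person ordered is collected under that key.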

Dict Basics

alt:python dict key/value pairs 'a'/'alpha' 'g'/'gamma' 'b'/'beta'

Dict-1 - Set key:value into Dict

>>> d = {}             # Start with empty dict {}
>>> d['a'] = 'alpha'   # Set key/value
>>> d['g'] = 'gamma'
>>> d['b'] = 'beta'
>>> # Now we have built the picture above
>>> # Python can input/output a dict using
>>> # the literal { .. } syntax.
>>> d
{'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
>>>

Dict-2 - Get value out of Dict

>>> s = d['g']         # Get by key
>>> s
'gamma'
>>> d['b']
'beta'
>>> d['a'] = 'apple'   # Overwrite 'a' key
>>> d['a']
'apple'
>>>
>>> # += modify value
>>> d['a'] += '!!!'
>>> d['a']
'apple!!!'
>>>
>>> d
{'a': 'apple!!!', 'g': 'gamma', 'b': 'beta'}
>>>

Dict-3 - Get Error / "in" Test

>>> # Can initialize dict with literal
>>> d = {'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
>>>
>>> val = d['x']         # Key not in -> Error
Error:KeyError('x',)
>>>
>>> 'a' in d             # "in" key tests
True
>>> 'x' in d
False
>>> 
>>> # Guard pattern (else ..)
>>> if 'x' in d:
...     val = d['x']
...
>>>
>>> # "in" uses key, not value, so this does not work:
>>> 'alpha' in d 
False
>>>

Dict = Memory, Meals Examples

At a high level, the dict is like memory: code can store a piece of information at one time and retrieve it later.

Meals problems: use a dict to remember what food was eaten under the keys 'breakfast', 'lunch', 'dinner'. For example meals['breakfast'] = 'apple'

>>> meals = {}
>>> meals['breakfast'] = 'apple'
>>> meals['lunch'] = 'donut'
>>>
>>> # time passes, other lines run
>>>
>>> # what was lunch again?
>>> meals['lunch']
'donut'
>>> 
>>> # did I have breakfast and dinner yet?
>>> 'breakfast' in meals
True
>>> 'dinner' in meals
False
>>>

Basic Dict Code Examples - Meals

Look at the dict1 "meals" exercises on the experimental server

> dict1 meals exercises

With the "meals" examples, the keys are 'breakfast', 'lunch', 'dinner' and the values are like 'hot dog' and 'bagel'. A key like 'breakfast' may or may not be in the dict, so the code needs an "in" check first.

Inevitable case to think about: dict[key]

The code can only get the value for a key if the key is in the dict. Otherwise it's an error. Therefore, the code needs an "in" guard check, or something similar, to make sure the key is in the dict before trying to get its value.
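
A minimal sketch of the guard pattern, assuming a hypothetical helper named lunch_or_none() (not one of the exercises) that returns None when there is no 'lunch' key:

```python
def lunch_or_none(meals):
    # Hypothetical helper, for illustration only
    if 'lunch' in meals:
        # Safe: the "in" check above proved the key is present
        return meals['lunch']
    # No 'lunch' key: avoid meals['lunch'], which would be a KeyError
    return None
```

The [ ] access sits inside the if-statement, so it only runs when the key is known to be in the dict.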

1. bad_start()

> bad_start()

bad_start(meals): Given a "meals" dict which contains key/value pairs like 'lunch' -> 'hot dog'. The possible keys are 'breakfast', 'lunch', 'dinner'. Return True if there is no 'breakfast' key in meals, or the value for 'breakfast' is 'candy'. Otherwise return False.

bad_start() Solution Code

Question: is the meals['breakfast'] == 'candy' line safe? Yes. The first if-statement returns when 'breakfast' is not in the dict, so the [ ] only runs when the key is present.

def bad_start(meals):
    if 'breakfast' not in meals:
        return True
    if meals['breakfast'] == 'candy':
        return True
    return False
    # Can be written with "or" / short-circuiting
    # if 'breakfast' not in meals or meals['breakfast'] == 'candy':
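
The "or" form mentioned in the comment could be written out as below. The name bad_start_or() is made up here to keep it separate from the solution above; this is a sketch of the same logic, relying on "or" short-circuiting so the [ ] only runs when the key is present.

```python
def bad_start_or(meals):
    # If 'breakfast' is missing, the left test is True and "or"
    # stops there, so meals['breakfast'] is never evaluated.
    if 'breakfast' not in meals or meals['breakfast'] == 'candy':
        return True
    return False
```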

2. enkale()

> enkale()

enkale(meals): Given a "meals" dict which contains key/value pairs like 'lunch' -> 'hot dog'. The possible keys are 'breakfast', 'lunch', 'dinner'. If the key 'dinner' is in the dict with the value 'candy', change the value to 'kale'. Otherwise leave the dict unchanged. Return the dict in all cases.

enkale() Solution Code

Demo: work out the code, see key error

The code cannot access meals['dinner'] when 'dinner' is not in the dict, so it needs logic to avoid that case.

def enkale(meals):
    if 'dinner' in meals and meals['dinner'] == 'candy':
        meals['dinner'] = 'kale'
    return meals

Typical pattern: the "in" check guards the meals['dinner'] access, since "and" short-circuits and only evaluates the second test when the first test is True. The code could also be written in this longer form with two if-statements, which works exactly the same as the and/short-circuit form above:

def enkale(meals):
    if 'dinner' in meals:
        if meals['dinner'] == 'candy':
            meals['dinner'] = 'kale'
    return meals

Exercise: is_boring()

> is_boring()

is_boring(meals): Given a "meals" dict. We'll say the meals dict is boring if lunch and dinner are both present and are the same food. Return True if the meals dict is boring, False otherwise.

Dict Observations

Dict Random Order

"in" Guard Pattern

Key and Value - Different Roles

Dict vs. List - Keys


Dict-Count Algorithm

Dict Count Code Examples

> dict2 Count exercises

Dict-Count Algorithm Steps

Dict-Count abacb

Go through these strs
strs = ['a', 'b', 'a',  'c',  'b']

Sketch out counts dict here:

Counts dict ends up as {'a': 2, 'b': 2, 'c': 1}:

alt: counts a 2 b 2 c 1
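
One way to check a hand-drawn sketch is to print the counts dict after each string is processed. The loop below is a small trace written for this example, not one of the exercise solutions:

```python
strs = ['a', 'b', 'a', 'c', 'b']

counts = {}
for s in strs:
    # Make sure s is in counts before incrementing
    if s not in counts:
        counts[s] = 0
    counts[s] += 1
    print(s, counts)

# Printed trace:
# a {'a': 1}
# b {'a': 1, 'b': 1}
# a {'a': 2, 'b': 1}
# c {'a': 2, 'b': 1, 'c': 1}
# b {'a': 2, 'b': 2, 'c': 1}
```

Each key enters the dict the first time its string is seen, and every later appearance only changes the value.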

1. str-count1() - if/else

> str_count1()

str_count1 demo, canonical dict-count algorithm

str_count1() Solution

def str_count1(strs):
    counts = {}
    for s in strs:
        # s not seen before?
        if s not in counts:
            counts[s] = 1   # first time
        else:
            counts[s] += 1  # every later time
    return counts

2. str-count2() - Unified/Invariant Version, no else

> str_count2()

Standard Dict-Count Code - Unified/Invariant Version

def str_count2(strs):
    counts = {}
    for s in strs:
        # fix counts/s if not seen before
        if s not in counts:
            counts[s] = 0
        # Unified: now s is in counts one way or
        # another, so this works for all cases:
        counts[s] += 1
    return counts

Int Count - Exercise

> int_count()

Apply the dict-count algorithm to a list of int values, return a counts dict, counting how many times each int value appears in the list.

Char Count - Exercise

> char_count()

Apply the dict-count algorithm to chars in a string. Build a counts dict of how many times each char, converted to lowercase, appears in a string, so 'Coffee' returns {'c': 1, 'o': 1, 'f': 2, 'e': 2}.