Today: new data type dict and the dict-count algorithm

Ripped From the Headlines - Chips

Bloomberg - chips are hard to make

Story: cars, computers, phones... there is a worldwide chip shortage in 2021. Chips are in everything. Moore's law describes chips packing in more transistors over time, and chips get cheap when produced at scale. Cheap chips are why they show up in everything. BUT the chip factories have become very expensive, about 15 billion dollars per factory, each running huge volumes of chips. Predictably, very few such factories supply the world, and they happen to be concentrated in Taiwan. It is a fragile situation.

Quiz Mon

Dict - Hash Table - Fast

Dict Story Arc

Dict = Memory

Dict Basics

See Python Guide for more detail: Python Dict

alt:python dict key/value pairs 'a'/'alpha' 'g'/'gamma' 'b'/'beta'

Dict-1. Set key:value into Dict

>>> d = {}             # Start with empty dict {}
>>> d['a'] = 'alpha'   # Set key/value
>>> d['g'] = 'gamma'
>>> d['b'] = 'beta'
>>> # Now we have built the picture above
>>> # Python can display dict with the literal syntax
>>> d
{'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
>>>

Dict-2: Get value out of Dict

>>> d['g']             # Get by key
'gamma'
>>> d['b']
'beta'
>>> d['a'] = 'apple'   # Overwrite 'a' key
>>> d['a']
'apple'
>>>
>>> # += modify value
>>> d['a'] += '!!!'
>>> d['a']
'apple!!!'
>>> # Dict literal format, curly braces, key:value
>>> d
{'a': 'apple!!!', 'g': 'gamma', 'b': 'beta'}
>>>

Dict-3 Get Error / "in" Test

>>> # Can initialize dict with literal
>>> d = {'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
>>>
>>> d['x']             # Key not in -> Error
Error:KeyError('x',)
>>>
>>> 'a' in d           # "in" key tests
True
>>> 'x' in d
False
>>> 
>>> # Guard pattern
>>> if 'a' in d:
...     val = d['a']
...
>>>
>>> # "in" checks keys, not values, so this is False:
>>> 'alpha' in d 
False
>>>
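
Since "in" looks at keys only, searching for a value takes a different step. A minimal sketch, assuming the standard dict method .values(), which is not covered above. The variable names here are just for illustration:

```python
d = {'a': 'alpha', 'g': 'gamma', 'b': 'beta'}

# "in" on the dict itself looks at keys only
key_found = 'a' in d                 # True, 'a' is a key
value_as_key = 'alpha' in d          # False, 'alpha' is a value, not a key

# d.values() gives the values, so "in" can search those instead
value_found = 'alpha' in d.values()  # True

print(key_found, value_as_key, value_found)
```

The key test is the fast one the dict is built for. Searching the values looks through them one by one, so it is a less common operation.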

Dict Memory Example - Meals

Use a dict to remember that 'breakfast' is 'apple' and 'lunch' is 'donut', using 'breakfast' and 'lunch' as the keys.

>>> meals = {}
>>> meals['breakfast'] = 'apple'
>>> meals['lunch'] = 'donut'
>>>
>>> # time passes, other lines run
>>>
>>> # what was lunch again?
>>> meals['lunch']
'donut'
>>> 
>>> # did I have breakfast or dinner?
>>> 'breakfast' in meals
True
>>> 'dinner' in meals
False
>>>

Basic Dict Code Examples - Meals

Look at the dict1 "meals" exercises on the experimental server

> dict1 meals exercises

With the "meals" examples, the keys are 'breakfast', 'lunch', 'dinner' and the values are like 'hot dog' and 'bagel'. A key like 'breakfast' may or may not be in the dict, so the code needs to do an "in" check first.

Case to think about: dict[key]

The code can only get the value for a key if the key is in the dict. Otherwise it's a KeyError. Therefore, structure the code with an "in" guard check, or something similar, to make sure the key is in the dict before trying to get its value.

bad_start()

> bad_start()

bad_start(meals): Given a "meals" dict which contains key/value pairs like 'lunch' -> 'hot dog'. The possible keys are 'breakfast', 'lunch', 'dinner'. Return True if there is no 'breakfast' key in meals, or the value for 'breakfast' is 'candy'. Otherwise return False.

bad_start() Solution Code

Question: is the meals['breakfast'] == 'candy' line safe? Yes. The first if-statement returns when 'breakfast' is not in the dict, so the [ ] only runs when the key is present.

def bad_start(meals):
    if 'breakfast' not in meals:
        return True
    if meals['breakfast'] == 'candy':
        return True
    return False
    # Can be written with "or" / short-circuiting avoids key-error
    # if 'breakfast' not in meals or meals['breakfast'] == 'candy':
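
The "or" form mentioned in the comment above could be sketched like this. The name bad_start_or is hypothetical, used only to keep this version separate from bad_start():

```python
def bad_start_or(meals):
    # Hypothetical name for this sketch; same logic as bad_start().
    # "or" short-circuits: if the first test is True, the second
    # test (with the [ ] access) is never evaluated, so no KeyError.
    if 'breakfast' not in meals or meals['breakfast'] == 'candy':
        return True
    return False


print(bad_start_or({'lunch': 'hot dog'}))     # True, no breakfast key
print(bad_start_or({'breakfast': 'candy'}))   # True
print(bad_start_or({'breakfast': 'bagel'}))   # False
```

The order of the two tests matters. With the tests swapped, the [ ] access would run first and raise a KeyError when 'breakfast' is missing.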

enkale()

> enkale()

enkale(meals): Given a "meals" dict which contains key/value pairs like 'lunch' -> 'hot dog'. The possible keys are 'breakfast', 'lunch', 'dinner'. If the key 'dinner' is in the dict with the value 'candy', change the value to 'kale'. Otherwise leave the dict unchanged. Return the dict in all cases.

enkale() Solution Code

Demo: work out the code, see key error

Cannot access meals['dinner'] in the case that dinner is not in the dict, so need logic to avoid that case.

def enkale(meals):
    if 'dinner' in meals and meals['dinner'] == 'candy':
        meals['dinner'] = 'kale'
    return meals

Typical pattern: the "in" check guards the meals['dinner'] access, since "and" short-circuits and only evaluates the second test when the first test is True. Could write it out in this longer form, which is ok. It works exactly the same as the and/short-circuit form above:

def enkale(meals):
    if 'dinner' in meals:
        if meals['dinner'] == 'candy':
            meals['dinner'] = 'kale'
    return meals
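
To check the three cases (dinner is 'candy', dinner is something else, no dinner key), a quick run might look like this. The input dicts are made up for illustration:

```python
def enkale(meals):
    # "in" check first, so the [ ] access is safe
    if 'dinner' in meals and meals['dinner'] == 'candy':
        meals['dinner'] = 'kale'
    return meals


print(enkale({'lunch': 'hot dog', 'dinner': 'candy'}))  # {'lunch': 'hot dog', 'dinner': 'kale'}
print(enkale({'dinner': 'bagel'}))                      # {'dinner': 'bagel'}
print(enkale({'lunch': 'hot dog'}))                     # {'lunch': 'hot dog'}
```

The last case is the one that raises a KeyError if the "in" guard is left out.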

Exercise: is_boring()

> is_boring()

is_boring(meals): Given a "meals" dict. We'll say the meals are boring if lunch and dinner are both present and are the same thing. Return True if the meals are boring, False otherwise.

Dict Observations

Dict Random Order

"in" Guard Pattern

Key and Value - Different Roles

Dict vs. List - Keys


Dict-Count Algorithm

Dict Count Code Examples

> dict2 Count exercises

Dict-Count Algorithm Steps

Dict-Count abacb

Go through these strs
strs = ['a', 'b', 'a',  'c',  'b']

Sketch out counts dict here:

Counts dict ends up as {'a': 2, 'b': 2, 'c': 1}:

alt: counts a 2 b 2 c 1
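
One way to watch the counts dict build up is to print it after each str. This is a rough sketch of the steps, not necessarily the exact form used in the exercises:

```python
strs = ['a', 'b', 'a', 'c', 'b']

counts = {}
for s in strs:
    if s not in counts:   # first time seeing s
        counts[s] = 0
    counts[s] += 1
    print(s, counts)

# Printed lines, one per str:
# a {'a': 1}
# b {'a': 1, 'b': 1}
# a {'a': 2, 'b': 1}
# c {'a': 2, 'b': 1, 'c': 1}
# b {'a': 2, 'b': 2, 'c': 1}
```

Each str either adds a new key or bumps an existing count, so the dict ends with one key per distinct str.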

1. str-count1() - if/else

> str_count1()

str_count1 demo, canonical dict-count algorithm

str_count1() Solution

def str_count1(strs):
    counts = {}
    for s in strs:
        # s not seen before?
        if s not in counts:
            counts[s] = 1   # first time
        else:
            counts[s] += 1  # every later time
    return counts

2. str-count2() - Unified/Invariant Version, no else

> str_count2()

Standard Dict-Count Code - Unified/Invariant Version

def str_count2(strs):
    counts = {}
    for s in strs:
        if s not in counts:  # fix counts/s if not seen before
            counts[s] = 0
        # Invariant: now s is in counts one way or
        # another, so can do next step unconditionally
        counts[s] += 1
    return counts
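
The two versions should build the same counts dict. A quick check, repeating both functions so the block runs on its own:

```python
def str_count1(strs):
    counts = {}
    for s in strs:
        if s not in counts:
            counts[s] = 1    # first time
        else:
            counts[s] += 1   # every later time
    return counts


def str_count2(strs):
    counts = {}
    for s in strs:
        if s not in counts:  # fix counts/s if not seen before
            counts[s] = 0
        counts[s] += 1       # runs for every s
    return counts


strs = ['a', 'b', 'a', 'c', 'b']
print(str_count1(strs))                      # {'a': 2, 'b': 2, 'c': 1}
print(str_count1(strs) == str_count2(strs))  # True
print(str_count2([]))                        # {}
```

The empty list case returns {} in both versions, since the loop body never runs.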

Int Count - Exercise

> int_count()

Apply the dict-count algorithm to a list of int values, return a counts dict, counting how many times each int value appears in the list.

Char Count - Exercise

> char_count()

Apply the dict-count algorithm to chars in a string. Build a counts dict of how many times each char, converted to lowercase, appears in a string, so 'Coffee' returns {'c': 1, 'o': 1, 'f': 2, 'e': 2}.