Today: talk about midterm a tiny bit, more next week. Main topic: Dictionaries, dict-count algorithm

## Dict - Hash Table - Fast

• Python "dict" type
• A key/value "dictionary"
• Generic term: "hash table"
Sounds like a real hacker thing
CS106B!
• Defining feature: powerful and fast

• String and list and int are crucial
• The dict type has a unique power in it
• Many advanced algorithms leverage that power
• Job interview pattern:
Interview question has some messed up data
Best answer inevitably uses a dict to organize the data
Because the dict is advanced and fast, its appearance is sort of inevitable

## Python Dict

• Organize data around a key
• For each key, store one value
• Get/set by key is fast
• Key type is typically a str or int (immutable)
• Value type can be anything (str, list, ...)
• Set: `d[key] = value`
• Get: `d[key]`
• Get with `d[key]` when key is not in the dict = KeyError
• Check if key is present: `key in d`
or not present: `key not in d`
• Danger: before accessing `d[key]`, check that key is in the dict first
• Note: the keys appear in the order they were added (guaranteed since Python 3.7)
That order is not sorted or otherwise meaningful
Simplest to think of it as random

## First Dict Code Example

```
>>> d = {}             # start as empty dict {}
>>> d['a'] = 'alpha'   # store key/values into d
>>> d['g'] = 'gamma'
>>> d['b'] = 'beta'
>>> d
{'a': 'alpha', 'g': 'gamma', 'b': 'beta'}  # curly-brace syntax
# keys appear in the order they were added
>>> d['b']
'beta'
>>> d['a'] = 'apple'   # overwrite 'a' key
>>> d['a']
'apple'
>>> d['x']
KeyError: 'x'
>>> 'a' in d
True
>>> 'x' in d
False
>>> # Use += to modify
>>> d['a'] += '!!!'
>>> d['a']
'apple!!!'
>>>
>>> # Can write dict literal with { } syntax
>>> # style: 1 space after colon and comma
>>> d = {'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
```
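The "check that the key is in first" advice can be sketched as a small guarded lookup. This is only an illustration: the helper name `lookup_or_default` and the `'unknown'` fallback are hypothetical, not part of the lecture code.

```python
def lookup_or_default(d, key, default):
    """Return d[key] if key is present, otherwise default (hypothetical helper)."""
    if key in d:          # check first, so d[key] cannot raise KeyError
        return d[key]
    return default


d = {'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
found = lookup_or_default(d, 'b', 'unknown')    # key is present
missing = lookup_or_default(d, 'x', 'unknown')  # key is absent, no error
print(found, missing)
```

The `in` test and the `d[key]` access are both fast, so guarding every access this way costs very little.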

## Dict = Memory

• At a high level, dict is memory
• Organized by key
• Suppose key is 'a'
• Your code can store at one time: `d['a'] = 12`
• Later, code can lookup: `d['a']`
• Get back the 12 stored earlier
• Speed: even if d contains 10 million keys, get/set of any one key is still fast, it does not search through the other keys
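A rough sketch of the speed point, assuming a dict of one million int keys (a smaller size than the 10 million mentioned above, chosen only so the example runs quickly). The names `big`, `value`, `present`, and `absent` are made up for this illustration.

```python
# Build a dict with one million int keys (size chosen only for illustration)
big = {}
for i in range(1_000_000):
    big[i] = i * 2

# Get by key and "in" go straight to the entry,
# without looping over the other keys
value = big[765_432]
present = 999_999 in big
absent = 1_000_000 in big
print(value, present, absent)
```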

## Dict Memory Example

Use dict to remember that 'snack1' is 'apple' and 'snack2' is 'donut'. Using 'snack1' and 'snack2' as keys.

```
>>> d = {}
>>> d['snack1'] = 'apple'
>>> d['snack2'] = 'donut'
>>>
>>> # time passes, other lines run
>>>
>>> # what was snack2 again?
>>> d['snack2']
'donut'
>>>
```

## Dict-Count Algorithm

• Important class of dict algorithms
• (Read: we'll use it a lot)
• Counts dict:
one key for each distinct value in the input
value for each key is the count of how many times that key appears
• e.g. strs: `'a', 'c', 'a', 'b'`
• creates "counts" dict: `{'a': 2, 'c': 1, 'b': 1}`

## Dict-Count Steps

• 1. Start with an empty counts dict: `counts = {}`
• 2. For each string:
• 3. First time str seen? store key=str, value = 1
• 4. Seen before? key=str, value += 1
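To see the steps in motion, here is a small trace on the example strs from above. The `trace` list is not part of the algorithm; it is added here only to record what counts looks like after each string.

```python
strs = ['a', 'c', 'a', 'b']
counts = {}
trace = []  # snapshots of counts, for illustration only

for s in strs:
    if s not in counts:
        counts[s] = 1     # first time s is seen
    else:
        counts[s] += 1    # s was seen before
    trace.append(dict(counts))  # copy, so later changes do not alter it
    print(s, counts)
```

Each printed line shows the string just processed followed by the counts dict at that moment, so the dict can be watched filling in one key at a time.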

## 1. str-count1() - if/else

str_count1 demo, canonical dict-count algorithm

• For each s, the key question: is this the first time seeing it?
• if/else solution
• if test: is this the first time?
Do one line if first time: counts[s] = 1
Do other line for all later times: counts[s] += 1
• This approach is fine

Solution code

```
def str_count1(strs):
    counts = {}
    for s in strs:
        # s first time?
        if s not in counts:
            counts[s] = 1   # first time
        else:
            counts[s] += 1  # every later time
    return counts
```

## 2. str-count2() - "Invariant" Version, no else

> 2. str-count2
• Same problem: is this the first time seeing s?
• Invariant approach:
• Run this line for all cases:
counts[s] += 1
• Precede it with if logic to "fix" counts if necessary
• "Invariant" means something which is true in all cases at some line
• Invariant means programmer can count on that being true - simpler
• I weakly prefer this version. It is one line shorter and does not use else.
• All counting goes through that one += 1 line

## Standard Dict-Count Code - "invariant" Version

```
def str_count2(strs):
    counts = {}
    for s in strs:
        if s not in counts:  # make s be in there
            counts[s] = 0
        # Invariant: now s is in counts one way or
        # another, so can do next step unconditionally
        counts[s] += 1
    return counts
```
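As a quick check that the if/else version and the invariant version compute the same thing, both can be run on the same input. The two functions are repeated here only so the block runs on its own; the sample lists are made up for illustration.

```python
def str_count1(strs):
    """if/else version."""
    counts = {}
    for s in strs:
        if s not in counts:
            counts[s] = 1
        else:
            counts[s] += 1
    return counts


def str_count2(strs):
    """Invariant version: all counting goes through the one += 1 line."""
    counts = {}
    for s in strs:
        if s not in counts:
            counts[s] = 0
        counts[s] += 1
    return counts


strs = ['a', 'c', 'a', 'b']
result1 = str_count1(strs)
result2 = str_count2(strs)
print(result1 == result2, result2)
empty = str_count2([])  # no strings in, so no keys out
```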

## Int Count - You Try It

Apply the dict-count algorithm to a list of int values, return a counts dict, counting how many times each int value appears in the list.

> 3. int-count
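For checking your own attempt afterwards, here is one possible sketch. It assumes the function is named `int_count` and takes a single list parameter, here called `nums`; both names are guesses based on the exercise title. It reuses the invariant form of the dict-count algorithm unchanged, since int values work as keys just like strs.

```python
def int_count(nums):
    """Return a counts dict for a list of ints (assumed signature)."""
    counts = {}
    for num in nums:
        if num not in counts:  # make num be in there
            counts[num] = 0
        counts[num] += 1       # runs for every num
    return counts


result = int_count([1, 3, 1, 2, 1])
print(result)
```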