L22

Today: lambda output, lambda/def, custom sorting with lambda, wordcount sorting, introduction to modules

Lambda - You The Power Hacker

Lambda is a powerful feature, letting you express a lot of computation in very little space. As a result, it looks weird at first, but when it clicks, you should feel like a Power Hacker when you wield it. Kind of a superpower.

These are well suited to little in-class exercises: each is just one line long. Not easy, but they are short!

Recall: Lambda 1-2-3 Steps

1. The word "lambda"

2. What type of element? - choose a good name for the parameter: n:, s:, ...

3. Write expression to produce, no "return"
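
As a quick illustration of the three steps, here is a minimal sketch that doubles each int in a list. The list nums and its values are made up for this example.

```python
nums = [1, 2, 3]

# 1. The word "lambda"
# 2. Each element is an int, so name the parameter n
# 3. Write the expression to produce, no "return"
doubled = list(map(lambda n: n * 2, nums))

print(doubled)  # [2, 4, 6]
```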

Lambda-2 Exercises

> lambda-2 section

The first of these work on a list of (x, y) tuples. These are a little more complicated, but they pack even more power into the one line.

int_to_str()

> int_to_str()

Given a list of ints nums, return a list of strings, where each int has parentheses added, so 3 becomes the string '(3)'. Note that str(n) converts an int to a str.

[-13, 42, 0] -> ['(-13)', '(42)', '(0)']

Solution

def int_to_str(nums):
    return map(lambda n: '(' + str(n) + ')', nums)
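
In plain Python 3, map() returns a lazy map object rather than a list, so to see the results in the interpreter it helps to wrap the call in list(). A small check of the solution above, with the input list chosen just for illustration:

```python
def int_to_str(nums):
    return map(lambda n: '(' + str(n) + ')', nums)


# list() pulls all the values out of the map result
result = list(int_to_str([-13, 42, 0]))
print(result)  # ['(-13)', '(42)', '(0)']
```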

min_x()

> min_x()

Given a non-empty list of len-2 (x, y) tuples. What is the leftmost x among the tuples? Return the smallest x value among all the tuples, e.g. [(4, 2), (1, 2), (2, 3)] returns the value 1. Solve with a map/lambda and the builtin min(). Recall: min([4, 1, 2]) returns 1

[(4, 2), (1, 2), (2, 3)]  -> 1

Solution

def min_x(points):
    return min(map(lambda point: point[0], points))
    # Use map/lambda to form a list of
    # just the x coords. Feed that into min()
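
To see the two stages separately, here is a small sketch that splits the one-liner into steps. The variable name xs is made up for this illustration.

```python
points = [(4, 2), (1, 2), (2, 3)]

# Step 1: map/lambda projects out just the x coords
xs = list(map(lambda point: point[0], points))
print(xs)       # [4, 1, 2]

# Step 2: min() picks the smallest of those
print(min(xs))  # 1
```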

Recall: What Is Def?

Previously, we saw def:

def double(n):
    return 2 * n

The def sets up the name of the function, and associates it with that body of code. Like this drawing:

[Drawing: the name double points to a black box of code]

What is Lambda?

Lambda is code, just like what the def creates, but without the name pointing to it. You can kind of see this in the interpreter.

>>> def double(n):
...   return 2 * n
... 
>>> double
<function double at 0x7fb944ab6ee0>
>>> 
>>> lambda n: 2 * n
<function <lambda> at 0x7fb944ad03a0>

Thus far we've let map() call the lambda code for us. Is there another way to call the lambda code?

Replicate a Def With a Lambda?

Just to show how Python works, you can actually make your own def using lambda and an equal sign. A def has code and a name. Here we use = to make the name fn point to the lambda code. Then we can call it like any other function.

>>> fn = lambda n: 2 * n
>>> 
>>> fn       # fn points to code
<function <lambda> at 0x7fb944ad0700>
>>>
>>>
>>>
>>> fn(10)   # function call works
20
>>> fn(12)
24
>>>

That is not something you need to do to get work done. That's a peek behind the curtain, showing what def is doing under the hood. Python is in a way very simple. A variable means that name has a pointer to that value. We see here that functions work the same way - a name pointing to a value which happens to be code.
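
The same idea works in the other direction: a function made with def is also just a value that a name points to, so a second name can point to the same code. A minimal sketch, where also_double is a made-up name for this illustration:

```python
def double(n):
    return 2 * n


# Make a second name point to the same code as double
also_double = double

print(also_double(10))        # 20
print(also_double is double)  # True - two names, one function
```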

Lambda vs. Def

Lambda and def are similar:

def double(n):
    return 2 * n

Equivalent lambda

lambda n: 2 * n

Use Lambda For Everything?

Should you just use lambda for everything? Not at all! Lambda is good for cases where the code is really short. Your program will have situations like that sometimes, and lambda is great for that. But def can do many things lambda cannot.

Def Features

Not everything needs to be a lambda
def introduces a name for the code
Def has room for real code features:
Multiple lines
If statements
Variables
Loops
Doctests
Inline comments
Lambda: best without any of that, just short, 1-line

Def vs. Lambda

What to do if computation does not fit in 1 line?
Just write a def
map() can use the def

map/def Example - map_parens()

> map_parens()

In lambda1, see the map_parens() problem.

['xx(hi)xx', 'abc(there)xyz', 'fish'] ->
  ['hi', 'there', 'fish']

map_parens() Solution

Solution Code. map() works fine with "parens" by name

def parens(s):
    left = s.find('(')
    right = s.find(')', left)
    
    if left == -1 or right == -1:
        return s
    return s[left + 1:right]


def map_parens(strs):
    return map(parens, strs)
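
Here is a quick run of that solution on the example input. As before, list() is used only to display what map() produces.

```python
def parens(s):
    left = s.find('(')
    right = s.find(')', left)

    if left == -1 or right == -1:
        return s
    return s[left + 1:right]


def map_parens(strs):
    # Pass the function by name - no parentheses after parens
    return map(parens, strs)


result = list(map_parens(['xx(hi)xx', 'abc(there)xyz', 'fish']))
print(result)  # ['hi', 'there', 'fish']
```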

Custom Sort - Power Feature

Python sorting has a lot of power in it
Use lambda to guide the sorting
This code feels powerful and dense
More examples in section!

Python Custom Sort - Food Examples

Lamest food I could think of - Radish
Suppose have list of foods
Each food is a len-3 food tuple:
food = (name, tasty 1-10, healthy 1-10)
food[0] = its name
food[1] = how tasty it is 1-10
food[2] = how healthy it is 1-10

We'll try these food examples in the interpreter.

Default sorted()

By default, sorted() works on a list of tuples: it compares [0] first, then [1] to break ties, and so on

>>> foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]
>>> 
>>> # By default, sorts food tuples by [0]
>>> sorted(foods)
[('apple', 7, 9), ('broccoli', 6, 10), ('donut', 10, 1), ('radish', 2, 8)]
>>>

Sort By Tastiness

Say I want to sort by tastiness
e.g. the radish vs. donut dimension
Control how sorted() looks at the data
Like drawing a circle around tasty values - sort by these!
How can we get the code to do this?

Project Out Sort-By Values

How to code sort-by-tasty
For each element in list
"Project out" a sort-by value to be used in sorting comparisons
Here, for each food, project out its tasty int
aka "Proxy" strategy
Each element, proxy value is used for sorting comparisons

Project Out With Lambda

Q: how to project out these sort-by proxy values?
A: lambda

Custom Sort Lambda - Plan

1. Call sorted() as usual
2. provide key=lambda to control sorting
Lambda here takes one parameter - an elem from the list
The lambda projects out the sort-by value to use for comparisons
e.g. sort by tasty
lambda food: food[1]
e.g. sort by healthy
lambda food: food[2]

Q: What is the parameter to the lambda?

A: One elem from the list (similar to map() function)
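
The key= parameter accepts any function of one parameter, so a def works there just as a lambda does. The sketch below uses a hypothetical helper named tasty() to show that the two forms give the same result, and prints the proxy values that are projected out for the comparisons.

```python
foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]


def tasty(food):
    # Project out the sort-by value for one food tuple
    return food[1]


by_def = sorted(foods, key=tasty)
by_lambda = sorted(foods, key=lambda food: food[1])
print(by_def == by_lambda)  # True

# The proxy values, one per element, in the original list order
proxies = list(map(tasty, foods))
print(proxies)  # [2, 10, 7, 6]
```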

Sort By Tasty

>>> foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]
>>> 
>>> sorted(foods, key=lambda food: food[1])
[('radish', 2, 8), ('broccoli', 6, 10), ('apple', 7, 9), ('donut', 10, 1)]

Most Tasty (reverse=True)

>>> sorted(foods, key=lambda food: food[1], reverse=True)  # most tasty
[('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10), ('radish', 2, 8)]

Most Healthy

>>> sorted(foods, key=lambda food: food[2], reverse=True)  # most healthy
[('broccoli', 6, 10), ('apple', 7, 9), ('radish', 2, 8), ('donut', 10, 1)]

Most tasty * healthy

Not limited to just projecting out existing values. We can project out a computed value. Here we compute tasty * healthy and sort on that. So apple is first, 7 * 9 = 63, broccoli is second with 6 * 10 = 60. Donut is last :(

>>> sorted(foods, key=lambda food: food[1] * food[2], reverse=True)
[('apple', 7, 9), ('broccoli', 6, 10), ('radish', 2, 8), ('donut', 10, 1)]
>>>

Sorted vs. Min Max

What code gives us the most tasty food?
Or the least tasty?
Sorting n things is kind of expensive
Could sort, take the last item - overly expensive approach
Use max(), max takes a key=lambda just like sorted()
e.g. pull out most or least tasty food - change "sorted" to "max" or "min"

>>> foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]
>>> max(foods)     # uses [0] by default - tragic!
('radish', 2, 8)
>>> 
>>> sorted(foods, key=lambda food: food[1])
[('radish', 2, 8), ('broccoli', 6, 10), ('apple', 7, 9), ('donut', 10, 1)]
>>> 
>>> max(foods, key=lambda food: food[1])  # most tasty
('donut', 10, 1)
>>> min(foods, key=lambda food: food[1])  # least tasty
('radish', 2, 8)

Key performance point: computing one max/min element takes a single pass over the n elements, which is much faster than sorting all n elements.
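
As a small comparison, the sketch below gets the most tasty food both ways. The variable names are made up for this illustration. With the values here the two agree; if several foods tied for the top tasty value, the two approaches could pick different ones among the tied foods.

```python
foods = [('radish', 2, 8), ('donut', 10, 1), ('apple', 7, 9), ('broccoli', 6, 10)]

# Expensive way: sort everything, then take the last item
via_sorted = sorted(foods, key=lambda food: food[1])[-1]

# Cheaper way: one pass with max() and the same key
via_max = max(foods, key=lambda food: food[1])

print(via_sorted)  # ('donut', 10, 1)
print(via_max)     # ('donut', 10, 1)
```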

Python Custom Sort String Examples

Default sorted() uses "<"
With strings, < places uppercase before lowercase, rarely what we want

>>> # The default sorting is not good with upper/lower case
>>> strs = ['coffee', 'Donut', 'Zebra', 'apple', 'Banana']
>>> sorted(strs)
['Banana', 'Donut', 'Zebra', 'apple', 'coffee']

String Sort Lambda

Fix: project out lowercase version of string as sort-by
The lambda takes in one elem from list - in this case 1 string
e.g. lambda s: s.lower()
Examples: sort not case-sensitive, sort by last char

>>> strs = ['coffee', 'Donut', 'Zebra', 'apple', 'Banana']
>>> 
>>> sorted(strs, key=lambda s: s.lower())    # not case sensitive
['apple', 'Banana', 'coffee', 'Donut', 'Zebra']
>>> 
>>> sorted(strs, key=lambda s: s[len(s)-1])  # by last char
['Zebra', 'Banana', 'coffee', 'apple', 'Donut']
>>>

Movie Examples

Given a list of movie tuples, (name, score, date-score), e.g.

[('alien', 8, 1), ('titanic', 6, 9), ('parasite', 10, 6), ('caddyshack', 4, 5)]

sort_score(movies)

> sort_score()

Given a list of movie tuples, (name, score, date-score), where score is a rating 1-10, and date-score 1-10 is a rating as a "date" movie. Return a list sorted in increasing order by score.

sort_date(movies)

> sort_date()

Given a list of movie tuples, (name, score, date-score), where score is a rating 1-10, and date-score 1-10 is a rating as a "date" movie. Return the list sorted in decreasing order by date score.
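
One possible way to write these two, following the same key=lambda pattern as the food examples. This is a sketch, not necessarily the official solution; the parameter name movie is a choice made here.

```python
def sort_score(movies):
    # Project out score at [1], increasing order
    return sorted(movies, key=lambda movie: movie[1])


def sort_date(movies):
    # Project out date-score at [2], largest first
    return sorted(movies, key=lambda movie: movie[2], reverse=True)


movies = [('alien', 8, 1), ('titanic', 6, 9), ('parasite', 10, 6), ('caddyshack', 4, 5)]

print(sort_score(movies))
# [('caddyshack', 4, 5), ('titanic', 6, 9), ('alien', 8, 1), ('parasite', 10, 6)]
print(sort_date(movies))
# [('titanic', 6, 9), ('parasite', 10, 6), ('caddyshack', 4, 5), ('alien', 8, 1)]
```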

Put It All Together - WordCount + Sorted

Look at wordcount project, apply custom sorting to the output stage.

Sorted vs. Dict Count Items

Wordcount has a "counts" dict, key is a word, value is its count
Use counts.items()
Gives us an "items" list of pairs: (char, count)
I'll use "items" as the var name here, echoing the "d.items()" function name
Use "pair" as lambda parameter, (char, count)
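
To show where such an items list comes from, here is a minimal sketch that builds a counts dict from the characters of a made-up string and then takes its items. The string 'banana' is chosen just for illustration.

```python
counts = {}
for ch in 'banana':
    if ch not in counts:
        counts[ch] = 0
    counts[ch] += 1

# list() shows the (char, count) pairs as a plain list
items = list(counts.items())
print(items)  # [('b', 1), ('a', 3), ('n', 2)]
```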

>>> items = [('z', 1), ('a', 3), ('e', 11), ('b', 3), ('c', 2)]

Copy that items list into interpreter, try these code challenges
Questions we could ask of the items - demo or you-try-it
1. How to sort items in increasing order by char (easy!)
2. How to sort items in increasing order by count?
3. How to sort items in decreasing order by count?
4. How to access the pair with the largest count?

>>> items = [('z', 1), ('a', 3), ('e', 11), ('b', 3), ('c', 2)]
>>> 
>>> # sort by [0]=char is the default
>>> sorted(items)
[('a', 3), ('b', 3), ('c', 2), ('e', 11), ('z', 1)]
>>> 
>>> sorted(items, key=lambda pair: pair[1])   # sort by count
[('z', 1), ('c', 2), ('a', 3), ('b', 3), ('e', 11)]
>>> 
>>> sorted(items, key=lambda pair: pair[1], reverse=True)
[('e', 11), ('a', 3), ('b', 3), ('c', 2), ('z', 1)]
>>> 
>>> max(items, key=lambda pair: pair[1])      # largest count
('e', 11)

Wordcount - Top-Count - Lambda

Here is the WordCount project we had before. This time look at the print_counts() and print_top() functions.

> wordcount.zip

print_counts() - Alphabetic Output

Here is the output of the regular print_counts() function, which prints out in alphabetic order. Output looks like:

$ python3 wordcount.py poem.txt 
are 2
blue 2
red 2
roses 1
violets 1
$

print_counts() Solution

This is the standard dict-output sorted loop.

def print_counts(counts):
    """
    Given counts dict, print out each word and count
    one per line in alphabetical order, like this
    aardvark 1
    apple 13
    ...
    """
    for word in sorted(counts.keys()):
        print(word, counts[word])
    # Alternately use .items() to access all the key/value data
    # for key, value in sorted(counts.items()):
    #    print(key, value)

print_top()

The print_top(counts, n) function - print the n most common words in decreasing order by count.

$ python3 wordcount-solution.py -top 10 alice-book.txt 
the 1639
and 866
to 725
a 631
she 541
it 530
of 511
said 462
i 410
alice 386

Look at print_top() function
Print the items list to see what we have (Python debug technique)
Recall: dict.items() - word/count pairs in no useful order (insertion order, not sorted by count)
[('sister', 12), ('rabbit', 5), ...]
Need to sort the pairs: decreasing order by count
Use sorted/lambda
This code is incredibly short and powerful

print_top() Exercise

def print_top(counts, n):
    """
    Given counts dict and int N, print the N most common words
    in decreasing order of count
    the 1045
    a 672
    ...
    """
    items = counts.items()
    # Could print the items in raw form, just to see what we have
    # print(items)
    pass
    # Your code - my solution is 3 lines long, but it's dense!
    # Sort the items with a lambda so the most common words are first.
    # Then print just the first N word,count pairs with a slice

print_top() Solution

Here are the lines - sort by count in decreasing order. Then slice to take the top n.

    # 1. Sort largest count first
    items = sorted(items, key=lambda pair: pair[1], reverse=True)
    # 2. Slice to grab first N
    for word, count in items[:n]:
        print(word, count)
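
Put together as a complete function, the solution can be tried on a tiny dict. The counts values below are made up for this illustration and are not from the Alice text.

```python
def print_top(counts, n):
    items = counts.items()
    # 1. Sort largest count first
    items = sorted(items, key=lambda pair: pair[1], reverse=True)
    # 2. Slice to grab first n
    for word, count in items[:n]:
        print(word, count)


counts = {'sister': 12, 'rabbit': 5, 'the': 40, 'alice': 30}
print_top(counts, 2)
# the 40
# alice 30
```

If n is larger than the number of words, the slice simply yields all of the pairs, so no extra check is needed.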