L24

Today: advanced lambda sort, whole program example - pylibs

Advanced Sorting Exercises

More advanced uses of sorted/lambda.

Demo/Exercise: close10()

Given a list of 3 or more int numbers. Return the 3 numbers closest to 10, sorted into increasing order. For example the input [1, 10, 2, 9, 3, 12] returns [9, 10, 12]. Note: abs(n) is the absolute value function. Use sorted/lambda.

[1, 10, 2, 9, 3, 12] ->    [9, 10, 12]
[100, 5, 200, 15, 8, 4] -> [5, 8, 15]

Idea: For each number, consider the distance between that number and 10. Compute the distance with abs(). Sort the numbers according to this distance.
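One possible way to turn this idea into code is sketched below. It is only one approach, assuming two sorting passes are acceptable: one by distance to pick the closest 3, then a plain sort to put them in increasing order. The name by_distance is made up for this sketch.

```python
def close10(nums):
    """Return the 3 numbers in nums closest to 10, in increasing order."""
    # Sort by distance from 10, so the closest numbers come first.
    by_distance = sorted(nums, key=lambda n: abs(n - 10))
    # Keep the first 3, then sort those into increasing order.
    return sorted(by_distance[:3])


print(close10([1, 10, 2, 9, 3, 12]))     # [9, 10, 12]
print(close10([100, 5, 200, 15, 8, 4]))  # [5, 8, 15]
```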

Exercise: midpointy()

> midpointy()

[9, 4, 6, 5, 1, 7]
# midpoint is (9 + 1) / 2 -> 5.0
# result ->  [4, 5, 6]

Given a list of 3 or more int numbers. We'll say the midpoint is the float average of the min and max numbers in the list. Return a list of the 3 numbers closest to the midpoint, sorted into increasing order. Use sorted/lambda.

Idea: Compute the midpoint. Sort the numbers by their distance from the midpoint with abs().
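A sketch of this idea, following the same sort-by-distance pattern as close10(). The variable names mid and by_distance are chosen for this sketch only.

```python
def midpointy(nums):
    """Return the 3 numbers in nums closest to the midpoint, increasing."""
    # Midpoint is the float average of the min and max values.
    mid = (min(nums) + max(nums)) / 2
    # Sort by distance from the midpoint, closest first.
    by_distance = sorted(nums, key=lambda n: abs(n - mid))
    return sorted(by_distance[:3])


print(midpointy([9, 4, 6, 5, 1, 7]))  # [4, 5, 6]
```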

Writing a Whole Program

Start with big problem
Divide and Conquer
Helper functions for sub-problems
Solve them individually
Black box: parameters in, return value out
Doctests per function
Fit the functions together, call from main()
Profit!
Top down decomposition
Start with main()
Think up helpers that would be useful
Then go write them

Pylibs Exercise

Today we'll do a whole program in class to walk through the whole process.

Download pylibs.zip to get started. We'll work through this together.

Starter file is mostly blank
Start with main()
Think up helper functions as we go
At each step - imagine what a useful helper function would be

Pylibs Problem

First, look at what problem we want to solve - like Madlibs.

Say we have two files, a "terms" file and a "template" file. (It's handy to have terminology for the parts of your abstract problem to then use in your docs, var names, etc.):

1. Terms file

The "terms" file provides a list of words for each category, such as 'noun'. One category per line, separated by commas, like this:

noun,cat,donut,velociraptor
verb,nap,run

2. Template file

The "template" file has text, and within it are markers like "[noun]" where a random substitution should be done.

I had a [noun]
and it liked to [verb] all day

Command Line

We want to run this program, giving it the terms and template files, and get output like this:

$ python3 pylibs.py test-terms.txt test-template.txt 
I had a velociraptor 
and it liked to nap all day

Let's do it. Write code in pylibs.py

1. Look at main() - Think of Useful Helper

Have terms and template filenames
What would be a useful helper here?
Helper idea: read_terms()
in: terms filename
out: terms dict

2. Write code: read_terms(filename)

Read terms file, build and return dict
First word on each line is like 'noun'
Use split(',')
Look at inputs and outputs below to get started

Looking at the input and desired output data is a nice way to get started on the code. Input line from terms file like this:

noun,cat,rabbit,velociraptor

Use line = line.strip() to remove newline. Use parts = line.split(',') to separate on the commas.

Create entry in terms dict like:

'noun': ['cat', 'rabbit', 'velociraptor']
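To see these steps on a single line before writing the file loop, here is a small sketch. The string assigned to line stands in for one line read from the terms file, including its trailing newline.

```python
line = 'noun,cat,rabbit,velociraptor\n'  # one line as read from the file

line = line.strip()       # remove the trailing \n
parts = line.split(',')   # ['noun', 'cat', 'rabbit', 'velociraptor']

terms = {}
terms[parts[0]] = parts[1:]  # first part is the key, the rest is the list

print(terms)  # {'noun': ['cat', 'rabbit', 'velociraptor']}
```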

File 'test-terms.txt' - write a Doctest

noun,cat,donut,velociraptor
verb,nap,run

Write a Doctest so we know this code is working before proceeding: read_terms('test-terms.txt')

Doctest trick: you could just run the Doctest, look at what the function returns, and paste that into the Doctest as the desired output if it looks right.

read_terms() Solution

Here is our solution complete with docs and doctest - in class, anything that works is doing pretty well.

def read_terms(filename):
    """
    Given the filename of the terms file, read
    it into a dict with each 'noun' word as a key
    and its value is its list of subs ['apple', 'donut', 'unicorn'].
    Return the terms dict.
    >>> read_terms('test-terms.txt')
    {'noun': ['cat', 'donut', 'velociraptor'], 'verb': ['nap', 'run']}
    """
    terms = {}
    with open(filename) as f:
        for line in f:
            # line is: noun,apple,rabbit,velociraptor,balloon
            line = line.strip()  # remove \n
            parts = line.split(',')
            term = parts[0]    # 'noun'
            words = parts[1:]  # ['apple', 'rabbit' ..]
            terms[term] = words
    return terms

3. main() Again

Call: terms = read_terms(args[0])
What is next helper to call from here?
How about: process_template(terms-dict, filename)
Reads through template file, prints out text with substitutions

main() - calls two helpers, just need to write them

    # command line: terms-file template-file
    if len(args) == 2:
        terms = read_terms(args[0])
        process_template(terms, args[1])
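For context, here is a sketch of how that fragment might sit inside a complete main(). It assumes args is built from sys.argv[1:], which the fragment above does not show, and the two helpers are placeholder stubs just so the sketch runs on its own.

```python
import sys


def read_terms(filename):
    """Placeholder stub for this sketch; returns an empty terms dict."""
    return {}


def process_template(terms, filename):
    """Placeholder stub for this sketch; just reports what it was given."""
    print('template:', filename, 'terms:', terms)


def main():
    # Assumption: args is the list of command line words after the script name.
    args = sys.argv[1:]

    # command line: terms-file template-file
    if len(args) == 2:
        terms = read_terms(args[0])
        process_template(terms, args[1])


if __name__ == '__main__':
    main()
```

With stubs in place, main() can be run early to check the wiring before the helpers are written for real.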

4. Write code: process_template(terms, filename)

Here is the beginning code for process_template() which starts with the standard file for/line/f loop.

Handy trick - use line.split() (no parameters) to get the list of words that make up each line. This also takes care of the \n at the end.

line.split() -> ['I', 'had', 'a', '[noun]']
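A quick sketch of why the no-parameter form is handy: it splits on any run of whitespace and drops the trailing newline, while split(' ') keeps the newline attached to the last word. The sample strings are made up for illustration.

```python
line = 'I had a [noun]\n'

print(line.split())     # ['I', 'had', 'a', '[noun]']
print(line.split(' '))  # ['I', 'had', 'a', '[noun]\n']

# Extra spaces are also handled by the no-parameter form.
print('  I   had a [noun]  \n'.split())  # ['I', 'had', 'a', '[noun]']
```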

You can paste the starter code below in to get started.

def process_template(terms, filename):
    with open(filename) as f:
        for line in f:
            words = line.split()  # ['I', 'had', 'a', '[noun]']
            # Print each word with substitution done

Q: What would be a useful helper to have here?

A: A function that does the substitution for one word would be handy here, so calling it with '[noun]' returns something like 'apple'. Decompose that out.

5. Write code: substitute(terms, word)

If word is of the form '[noun]' return a random substitute for it from the terms dict. Otherwise return the word unchanged.

Note 1: s.startswith() / s.endswith() very handy here to look for square brackets

Note 2: random.choice(lst) returns a random element from list.
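The pieces from the two notes can be tried separately before combining them, as in this sketch. The sample word and list are made up for illustration.

```python
import random

word = '[noun]'
print(word.startswith('['))     # True
print(word.endswith(']'))       # True
print(word[1:len(word) - 1])    # noun  (slice trims off the brackets)
print('donut'.startswith('['))  # False

lst = ['cat', 'donut', 'velociraptor']
print(random.choice(lst))  # one element of lst, varies from run to run
```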

Here our solution has all the Doctests added, but for in-class anything that works is fine.

This is a nice example of a helper function: (1) it isolates some complexity within this function, where we can solve and test it, and (2) it makes its caller function more tractable.

substitute() Solution

import random  # needed for random.choice(); goes at the top of pylibs.py


def substitute(terms, word):
    """
    Given terms dict and a word from the template.
    Return the substituted form of that word.
    If it is of the form '[noun]' return a random
    word from the terms dict. Otherwise
    return the word unchanged.
    >>> substitute({'noun': ['apple']}, '[noun]')
    'apple'
    >>> substitute({'noun': ['apple']}, 'donut')
    'donut'
    """
    if word.startswith('[') and word.endswith(']'):
        key = word[1:len(word) - 1]  # trim off [ ] -> 'noun'
        if key in terms:
            subs = terms[key]  # list of ['apple', 'donut', ..]
            return random.choice(subs)
    return word  # not a known marker: return the word unchanged

6. Complete process_template(), calling substitute()

Note: print a word followed by one space and no newline:
print(word + ' ', end='')

The end='' option for print() suppresses the printed newline. Then have a single print() after the inner loop to print one newline at the end of each line.

            ...
            words = line.split()
            # Print each word with substitution done
            for word in words:
                sub = substitute(terms, word)
                print(sub + ' ', end='')
            print()
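Putting the fragments together, the finished function might look like the sketch below. It includes a compact copy of substitute() so the block runs on its own, with the assumption that an unknown marker such as '[foo]' is returned unchanged.

```python
import random


def substitute(terms, word):
    """Return a random substitute for a '[noun]' style word, else the word."""
    if word.startswith('[') and word.endswith(']'):
        key = word[1:len(word) - 1]  # trim off [ ]
        if key in terms:
            return random.choice(terms[key])
    return word  # assumption: unknown markers are left unchanged


def process_template(terms, filename):
    """Print the text of the template file with substitutions done."""
    with open(filename) as f:
        for line in f:
            words = line.split()  # ['I', 'had', 'a', '[noun]']
            # Print each word with substitution done
            for word in words:
                sub = substitute(terms, word)
                print(sub + ' ', end='')
            print()  # one newline at the end of each template line
```

Because each word is printed followed by a space, every output line ends with one trailing space.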

7. Run from main()

We have main(), read_terms(), process_template() and substitute() wired together. Try it from the command line, with the files 'terms.txt' and 'template.txt'

$ cat terms.txt 
noun,velociraptor,donut,ray of sunshine
verb,run,nap,eat the bad guy
adjective,blue,happy,flat,shiny
$
$ cat template.txt 
I had a [noun] and
it was very [adjective]
when it would [verb]
$ 
$ python3 pylibs.py terms.txt template.txt 
I had a ray of sunshine and 
it was very shiny 
when it would nap
$
$ python3 pylibs.py terms.txt template.txt 
I had a velociraptor and 
it was very shiny 
when it would eat the bad guy 
$

Demo HW7a Ghost

Handout is out now, don't need to start right away
Very algorithmic project
Leverage sorted()/lambda
Look at image series - think about outlier
clock tower
monster