Today: part-1: coding style and design. part-2: string foreach, lists

Large Code Projects - Deceptively Difficult

Readable

Readable 1.0 - Good Function Names

if is_url_sketchy(url):
  ...


delete_files(files)


if distance(loc1, loc2) < 1.0:
  ...


# Is "compute_distance" a better name?
# In this case the one word reads fine IMHO.
if compute_distance(loc1, loc2) < 1.0:
  ...

Readable 2.0 - Good Variable Names

Variable Names Pay Off Right Now

You are writing a 10-line function. Data flows through it and changes from line to line, and you need to keep track of that data in your own mind as you write each line. Good variable and function names are a big help here.

brackets() Code - Good Var Names

Previous lecture example - "left" is a fine variable name in there. "x" or "i" would not be good choices.

brackets(s): Look for a pair of brackets '[...]' within s, and return the text between the brackets, so the string 'cat[dog]bird' returns 'dog'. If there are no brackets, return the empty string. If the brackets are present, there will be only one of each, and the right bracket will come after the left bracket.

def brackets(s):
    left = s.find('[')
    if left == -1:
        return ''
    right = s.find(']')
    return s[left + 1: right]

brackets() with Bad Var Names

Here is a buggy version of brackets() with bad variable names. Look at the last line. Is that line correct? For each variable, you have to look back up to remind yourself what value it holds. That's a bad sign! It is better when the name of the variable tells the story right there.

def brackets(x):
    z = x.find('[')
    if z == -1:
        return ''
    y = x.find(']')
    return x[y + 1: z]  # buggy?

Variable Name Choices for "left"

int_index_of_left_bracket # Too long.
                          # Do not spell out
                          # every true thing.
index_of_left_bracket     # Too long.

left_index            # fine
left                  # fine
li                    # too short/cryptic
l                     # too short, and don't use "l"

Exceptions: Idiomatic 1 Letter / Short Var Names

"idiomatic" - a common practice by many programmers, so it becomes a readable, recognizable shorthand.

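As an illustration, here is a sketch using short names that appear elsewhere in these notes: "s" for a string, "ch" for a single char, "i" for an index, "n" for a count. The function count_char() is a hypothetical example made up for this purpose, not part of the lecture code.

```python
def count_char(s, ch):
    """Return how many times the char ch appears in string s."""
    n = 0                        # n: a count
    for i in range(len(s)):      # i: an index into s
        if s[i] == ch:
            n += 1
    return n


print(count_char('banana', 'a'))  # prints 3
```
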
Decomp By Var Strategy

Decomp By Var Example Problem 'x3412y'

This is a classic make-a-drawing index problem. Getting this perfect is not so easy.

Function: Given a string s of even length, if the string length is 2 or less, return it unchanged. Otherwise take off the first and last chars. Consider the remaining middle piece. Split the middle into front and back halves. Swap the order of these two halves, and return the whole thing with the first and last chars restored. So 'x1234y' returns 'x3412y'.

Decomp By Var Solution

The variable names here help us keep the various parts clear through the narrative, even at the moment we are working out each line. The variable names are naturally similar to those in the specification.

def foo(s):
    if len(s) <= 2:
        return s
    first = s[0]
    last = s[len(s) - 1]
    mid = s[1:len(s) - 1]
    halfway = len(mid) // 2
    return first + mid[halfway:] + mid[:halfway] + last

The variable names don't have to be super detailed. They need just enough to label the concepts through this narrative. Note that the one letter "s" is fine - there is nothing semantic about s that we need to keep track of beyond the fact that it is a string. In contrast, "first", "last", etc. have specific roles in the algorithm.

The point here: you are writing this function starting from a blank screen. Use good variable names to pick off and name parts of the problem as you work ahead.

The variables are sort of divide-and-conquer within the function - separate out and name individual steps of the algorithm vs. doing it in 1 big jump.

Bad Solution - No Decomp By Var

Here is the above function written without any good variables. Just because something is 1 line does not make it better. I believe it's correct, but it's hard to tell!

This is a good example of not readable.

def foo(s):
    if len(s) <= 2:
        return s
    return (s[0] + s[1:len(s) - 1][(len(s) - 2) // 2:] +
            s[1:len(s) - 1][:(len(s) - 2) // 2] + s[len(s) - 1])

The bad code also repeats computations, like (len(s) - 2) // 2. The good solution computes that value once and stores it in the variable halfway for use by later lines.
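
One way to gain some confidence that the dense version is correct is to run both versions on the same inputs and compare. The sketch below does that. The names swap_clear() and swap_dense() are hypothetical, chosen here only so both versions can live in one file, and the test strings are made up.

```python
def swap_clear(s):
    """Readable version, using variables to name each part."""
    if len(s) <= 2:
        return s
    first = s[0]
    last = s[len(s) - 1]
    mid = s[1:len(s) - 1]
    halfway = len(mid) // 2
    return first + mid[halfway:] + mid[:halfway] + last


def swap_dense(s):
    """Same computation written as one big expression."""
    if len(s) <= 2:
        return s
    return (s[0] + s[1:len(s) - 1][(len(s) - 2) // 2:] +
            s[1:len(s) - 1][:(len(s) - 2) // 2] + s[len(s) - 1])


# Even-length inputs, per the problem statement.
for s in ['', 'ab', 'abcd', 'x1234y', 'x123456y']:
    assert swap_clear(s) == swap_dense(s)

print(swap_clear('x1234y'))  # prints x3412y
```

The two agree on these inputs, but only the first version lets a reader confirm that by reading.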

Trick: If You Cannot Get A Line Working

Avoid Needless Computation in Loop - Store in Var

Suppose we have this loop - n copies of the lowercase form of s. This code is fine, we will just point out a slight improvement.

def n_copies(s, n):
    result = ''
    for i in range(n):
        result += s.lower()
    return result

Notice that s.lower() computes the lowercase form of s inside the loop. The readability is fine, but the code computes that lowercase form again and again and again. The lowercase of 'Hello' is the same 'hello' every time through the loop. This is a little wasteful. We could compute it once, store it in a variable, and use the variable in the loop:

def n_copies(s, n):
    result = ''
    low = s.lower()
    for i in range(n):
        result += low
    return result

This is a slight improvement. It would be especially important if the s.lower() computation were costly. This issue appears in HW4. The first job is calling the helper function to get the right data in hand. A lesser question is - does this value need to be computed every time through the loop, or can we just compute it once?


Big Picture Software Costs - N-Squared

N Squared Trap

alt: hours to finish is proportional to the number of lines squared

Decomposition - Escape N-Squared Trap
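
A rough arithmetic sketch of why decomposition helps, taking the "cost grows with lines squared" rule of thumb above at face value. The line counts (1000 and 100) are made-up numbers for illustration, and relative_cost() is a hypothetical helper, not a real measurement of anything.

```python
def relative_cost(lines):
    """Rule-of-thumb cost of one undivided chunk of code: lines squared."""
    return lines ** 2


# One undivided 1000-line program.
one_big = relative_cost(1000)

# The same 1000 lines written as 10 independent 100-line functions.
ten_small = 10 * relative_cost(100)

print(one_big)               # prints 1000000
print(ten_small)             # prints 100000
print(one_big // ten_small)  # prints 10
```

Under this simple model, splitting the work into 10 separate pieces cuts the total cost by a factor of 10.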

Black Box Model - 1. Abstraction

Black Box - 2. Implementation

Ride To Airport Abstraction vs. Implementation

How To Write a Program - Avoiding N-Squared Trap

Abstraction in CS

import os
from datetime import datetime

# Get the list of filenames in the named directory
filenames = os.listdir('Downloads')

# Get the current date and time
now = datetime.now()

Mechanics: Fn name, PyDoc, Doctests

def del_chars(s, target):
    """
    Given string s and a "target" string,
    return a version of s with all chars that
    appear in target removed, e.g. s 'abc'
    with target 'bx', returns 'ac'.
    (Not case sensitive)
    >>> del_chars('abC', 'acx')
    'b'
    >>> del_chars('ABc', 'aCx')
    'B'
    >>> del_chars('', 'a')
    ''
    """
    result = ''
    target = target.lower()
    for i in range(len(s)):
        if s[i].lower() not in target:
            result += s[i]
    # could use "for char in s" form, since not using index
    return result
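
The comment in the code above mentions the "for char in s" form. Here is a sketch of that variant, with the same Doctests, plus one possible way to run the Doctests from code using the standard doctest module. The loop variable name "ch" is a choice made here; the course tools may run Doctests in a different way.

```python
import doctest


def del_chars(s, target):
    """
    Given string s and a "target" string,
    return a version of s with all chars that
    appear in target removed. (Not case sensitive)
    >>> del_chars('abC', 'acx')
    'b'
    >>> del_chars('ABc', 'aCx')
    'B'
    >>> del_chars('', 'a')
    ''
    """
    result = ''
    target = target.lower()
    for ch in s:  # foreach form - no index number needed
        if ch.lower() not in target:
            result += ch
    return result


# Run the Doctests in the docstring above.
# With verbose=False, output appears only if a test fails.
doctest.run_docstring_examples(del_chars, globals(), verbose=False)
```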

How Not To Write a Program

How To Write a Program


string foreach

String Foreach Examples

> String Foreach examples

double_char2() example with foreach

def double_char2(s):
    result = ''
    for ch in s:
        result = result + ch + ch
    return result
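
For comparison, here is a sketch of the same computation written both ways: with a for/i/range loop and with foreach. The names double_char_index() and double_char_foreach() are hypothetical, used here only to keep the two forms side by side.

```python
def double_char_index(s):
    """Double each char, using index numbers."""
    result = ''
    for i in range(len(s)):
        result = result + s[i] + s[i]
    return result


def double_char_foreach(s):
    """Double each char, using foreach - no index needed."""
    result = ''
    for ch in s:
        result = result + ch + ch
    return result


print(double_char_index('xyz'))    # prints xxyyzz
print(double_char_foreach('xyz'))  # prints xxyyzz
```

The foreach form is the simpler choice when the code needs each char but not its index number.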

Python Lists

See guide: Python List for more details about lists

1. List Literal: [1, 2, 3]

Use square brackets [..] to write a list in code (a "literal" list value), separating elements with commas.

>>> lst = ['a', 'b', 'c']

"empty list" is just 2 square brackets with nothing within: []

2. Length of list: len(lst)

Use len() function, just like string

>>> len(lst)
3

3. Square Brackets to access element

Use square brackets to access an element in a list, like a string again (an out-of-range index is an error). Valid index numbers are 0..len-1.

>>> lst[0]
'a'
>>> lst[2]
'c'
>>> lst[9]
IndexError: list index out of range

List Mutable

The big difference from strings is that lists are mutable - lists can be changed. Elements can be added, removed, changed over time.
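
A short sketch of what "mutable" means in practice, contrasted with a string. The particular values are made up for illustration.

```python
lst = ['a', 'b', 'c']
lst[0] = 'z'        # change an element in place
lst.append('d')     # add an element at the end
lst.remove('b')     # remove the first 'b' in the list
print(lst)          # prints ['z', 'c', 'd']

s = 'abc'
try:
    s[0] = 'z'      # strings are immutable, so this raises TypeError
except TypeError:
    print('strings cannot be changed in place')
```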

1. List append()

# 1. make empty list, then call .append() on it
>>> lst = []         
>>> lst.append('a')
>>> lst.append('b')
>>> lst.append('c')
>>> 
>>> lst
['a', 'b', 'c']
>>> len(lst)
3
>>> lst[0]
'a'
>>> lst[2]
'c'
>>>
# 2. Similar, using loop/range to call .append()
>>> lst = []
>>> for i in range(6):
...     lst.append(i * 10)
... 
>>> lst
[0, 10, 20, 30, 40, 50]
>>> len(lst)
6
>>> lst[5]
50

2. List "in" / "not in" Tests

>>> lst = ['a', 'b', 'c']
>>> 'c' in lst
True
>>> 'x' in lst
False
>>> 'x' not in lst  # preferred form to check not-in
True
>>> not 'x' in lst  # not preferred equivalent
True

3. Foreach On List

>>> lst = ['a', 'b', 'c']
>>> for s in lst:
...   # use s in here
...   print(s)
... 
a
b
c

4. list.index(target) - Find Index of Target

>>> lst = ['a', 'b', 'c']
>>> lst.index('c')
2
>>> lst.index('d')
ValueError: 'd' is not in list
>>> 'd' in lst
False
>>> 'c' in lst
True
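
Since lst.index() raises an error when the target is not in the list, one common pattern is to check with "in" first. The sketch below shows this. The function find_index() and its -1 result for not-found are assumptions made here, echoing how s.find() behaves for strings.

```python
def find_index(lst, target):
    """Return the index of target in lst, or -1 if it is not there."""
    if target in lst:
        return lst.index(target)
    return -1


lst = ['a', 'b', 'c']
print(find_index(lst, 'c'))  # prints 2
print(find_index(lst, 'd'))  # prints -1
```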

List Code Examples

> list1 examples

Constants in Python

STATES = ['CA', 'NY', 'NV', 'KY', 'OK']

e.g. HW4 Crypto

# provided ALPHABET constant - list of the regular alphabet
# in lowercase. Refer to this simply as ALPHABET in your code.
# This list should not be modified.
ALPHABET = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

...

def foo():
    for ch in ALPHABET:  # this works
        print(ch)
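
Besides foreach, a constant list works with "in" and .index() like any other list, so long as the code only reads it. A minimal sketch follows. The function is_letter() is hypothetical and not part of HW4, and ALPHABET is built with list() here only to keep the example short.

```python
# Same 26 lowercase letters as the provided ALPHABET constant,
# built with list() here to keep the sketch short.
ALPHABET = list('abcdefghijklmnopqrstuvwxyz')


def is_letter(ch):
    """Return True if ch is a regular letter, upper or lower case."""
    return ch.lower() in ALPHABET  # reads the constant, never modifies it


print(is_letter('Q'))       # prints True
print(is_letter('?'))       # prints False
print(ALPHABET.index('c'))  # prints 2
```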

main() - Monday