L15

Today: better/shorter code techniques, boolean precedence

Note: we looked at keyboard shortcuts today for fun - see the Keyboard Shortcuts chapter

PEP8 Style Guide - Revisit Missing Pieces

Look at Style1 PEP8 page, mention a few things we skipped or did very briefly earlier.

1. Prefer the specific "not-equal" operators:

Write it this way: a != b

Not this way: not a == b

Likewise

Prefer this way a not in b

Not this way: not a in b

2. Do not write == True or == False

Not this way

# NO not this way
if a == True:
    do_something()

This way

# YES this way
if a:
    do_something()
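
As a quick check of the two rules above, here is a minimal sketch. The values a, b, letters and is_ready are made up for illustration only. It suggests the preferred forms compute the same results as the longer forms, so the change is purely about readability.

```python
# Hypothetical values, chosen only for illustration
a = 3
b = 4
letters = ['x', 'y']
is_ready = True

# Rule 1: each pair gives the same answer; PEP8 prefers the first form
same_ne = (a != b) == (not a == b)
same_not_in = ('z' not in letters) == (not 'z' in letters)

# Rule 2: a boolean works directly as the if-test, no "== True" needed
if is_ready:
    message = 'ready'
else:
    message = 'not ready'

print(same_ne, same_not_in, message)
```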

Inline Comments `#`

Use comments to fill in information that is useful or interesting and not clear from the code itself. These are not needed too often, as frequently the code is readable enough on its own (with good function and variable names).

1. Avoid repeating what the code says. Don't do this:

    i = i + 1   # Add 1 to i

2. Could mention what a line accomplishes - not so obvious sometimes - fill in the motivating goal:

    # Increase i to the next multiple of 10
    i = 10 * (i // 10 + 1)

3. Could mention the goal of a few lines, kind of framing what goal the next 8 lines accomplish. Can also use blank lines to separate logical stages from each other.

    # Now figure out where the email address ends
    end = at + 1
    while end < len(s) and ....
        ...
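
To see why the comment in item 2 earns its place, here is a small sketch that wraps that line in a function. The name next_multiple_of_10 is hypothetical, used only for this demonstration. Without the comment, a reader would have to work through the arithmetic to recover the goal.

```python
def next_multiple_of_10(i):
    # Hypothetical helper wrapping the commented line from item 2.
    # Increase i to the next multiple of 10
    i = 10 * (i // 10 + 1)
    return i


print(next_multiple_of_10(23))
print(next_multiple_of_10(30))
```

Note that a value already on a multiple of 10, such as 30, moves up to 40.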

Is None Rule PEP8

Say we have a word variable. The following if-statement works perfectly for CS106A. You have probably written it this way before, and that is fine; we will never mark off for it.

if word == None:             # Works fine, not PEP8
    print('Nada')

However, there is a rule in PEP8 that comparisons to the value None should be written with the is operator. This is an awkward rule, but you may have noticed PyCharm complaining about the above form, so you can write it as follows and it works correctly and is PEP8:

if word is None:             # PEP8 "is None" form
    print('Nada')


if word is not None:         # or PEP8 "is not None"
    print('Not Nada')

Very important limitation:

The is operator is different from == for most non-None values like strings and ints and lists. Therefore: never use the is operator for values other than None.

Only use is with None as above. If you use it with other values, it will lead to horrible, terrible bugs.

There's a longer explanation about this awkward "is" rule in the guide page above.
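
A minimal sketch of the difference, using two made-up lists a and b. The == operator compares values, while is asks whether both sides are the very same object in memory, which is rarely what you want except with None.

```python
# Two separate lists that happen to hold the same values
a = [1, 2, 3]
b = [1, 2, 3]

equal_values = (a == b)   # compares the contents of the lists
same_object = (a is b)    # asks if a and b are one and the same object

word = None
word_is_none = (word is None)   # the one case where "is" is the right tool

print(equal_values, same_object, word_is_none)
```

Here equal_values is True but same_object is False, since a and b are two distinct list objects.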

Protips - Better/Shorter Code Strategies (Variables)

Not showing any new Python, but showing moves and patterns you may not have thought of to shorten and clean up your code.

Many of these techniques involve leveraging variables to clean up the code.

Nobody Feel Bad About This v1 Code

I'm going to show you a v1 form of the code that is suboptimal and then how to write it better. Nobody should feel bad for writing the suboptimal form, as it's an easy enough path to go down by accident.

first_n() v1

> first_n()

first_n(s, n): Given a string s and an int n. The n parameter will be one of 1, 2, 3, 4. Return a string of the first n chars of s. If s is not long enough to provide n chars, return None.

'Python', 2 -> 'Py'
'Python', 3 -> 'Pyt'
'Py', 3 -> None

Here is v1 which works perfectly, but is not the best.

def first_n(s, n):
    if len(s) < n:
        return None
    if n == 1:
        return s[:1]
    if n == 2:
        return s[:2]
    if n == 3:
        return s[:3]
    if n == 4:
        return s[:4]

Strategy: Use The Var

When n is 1, we want a 1 in the slice. When n is 2, we want a 2 in the slice. How can we get that effect?

The variable n points to one of the values 1, 2, 3, 4 when this function runs.

Therefore, in your code, if there is a spot where you need a value that appears in a variable or parameter, you can just literally write that variable there.

def first_n(s, n):
    if len(s) < n:
        return None
    return s[:n]

The last line is the key.

If n is 1, it uses 1 in the slice. If n is 2, it uses 2, and so on. Use the variable itself, knowing that at run-time, Python will evaluate the variable, pasting in whatever value it points to.
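
As a rough check, the sketch below runs the short version on the three cases from the problem statement. The cases list is just a convenience for this demonstration.

```python
def first_n(s, n):
    if len(s) < n:
        return None
    return s[:n]    # n is evaluated at run-time: 1, 2, 3, or 4


# The three examples listed in the problem statement
cases = [('Python', 2), ('Python', 3), ('Py', 3)]
for s, n in cases:
    print(s, n, '->', first_n(s, n))
```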

Strategy: Add Var

Your algorithm has a few values flowing through it. Pick a value out and put it in a well-named variable. Use the variable on the later lines. We talked about this a bit before, also known as "decomp by var".

1. Helps readability - the named variable helps all the later lines read out what they do.

2. In the style of divide-and-conquer: by computing a value that is part of the solution and storing it in a variable, you have picked off and solved a smaller part of the problem, and can put that behind you to work on the next thing.

Add Var Example

> low_high()

[0, 7, -2, 12, 3, -3, 5] -> [0, 5, 0, 5, 3, 0, 5]

Given nums, a list of 1 or more numbers. We'll say the first number in the list is the lower bound, and the last number is the upper bound. Change the list so any number less than the lower bound is increased to the lower bound, and likewise for any number greater than the upper bound. Return the changed list. Use for/i/range to loop over the elements in the list. Use an add-var strategy to introduce variables for the bounds.

low_high() v1

Here is a solution that does not use add-var. This works, but I would never write it this way.

def low_high(nums):
    for i in range(len(nums)):
        if nums[i] < nums[0]:
            nums[i] = nums[0]
        if nums[i] > nums[len(nums) - 1]:
            nums[i] = nums[len(nums) - 1]
    return nums

Not especially readable: nums[i] > nums[len(nums) - 1]

low_high() v2 Add Var

Introduce variables low and high to store those intermediate values, using the variables on the later lines. This is much more readable.

def low_high(nums):
    low = nums[0]
    high = nums[len(nums) - 1]
    for i in range(len(nums)):
        if nums[i] < low:
            nums[i] = low
        if nums[i] > high:
            nums[i] = high
    return nums

Which is Easier To Write Correctly?

Which is easier to write correctly, knowing that you might sometimes forget a -1 or get < and > backwards?

v1:

if nums[i] > nums[len(nums) - 1]:
    nums[i] = nums[len(nums) - 1]

v2:

if nums[i] > high:
    nums[i] = high

The version building on the variable is (a) readable and (b) easier to write correctly the first time. Readability is not just about reading, it's about writing it with fewer bugs as you are typing.

On our lecture examples we do this constantly - pulling some intermediate value into a well-named variable to use on later lines.

Debugging - Vertical Stretch-Out Advice

Say your code is not working, and you are trying to fix it. There is some long line of code, and you are not sure if it is right. Try pulling a part of that computation out, looking at just that bit carefully, and storing it in a variable.

What was one long horizontal line, you stretch out vertically into a larger number of shorter lines, so you can concentrate on one part at a time.

Notice that v1 is short but wide. In contrast, v2 is longer because of the 2 added lines to set the variables, but more narrow since the later lines have less on them.
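
Here is a minimal sketch of the stretch-out move on a made-up example. The string s and the names at, end, and host are hypothetical, not from the lecture code. The wide line and the stretched-out lines compute the same thing, but each short line can be checked on its own.

```python
# Hypothetical input: pull out the host part after the '@'
s = 'xx alice@example.com yy'

# Wide form: everything on one line, hard to check piece by piece
host_wide = s[s.find('@') + 1:s.find(' ', s.find('@'))]

# Stretched-out form: one small, checkable step per line
at = s.find('@')          # index of the '@'
end = s.find(' ', at)     # index of the first space after the '@'
host = s[at + 1:end]

print(host_wide)
print(host)
```

While debugging, you could print at and end separately to see which step is off.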

Preface: Variable Set Default Pattern

This is a handy pattern to set a variable according to a boolean. (1) Initialize (set) the variable to its default, common value first. (2) Then an if-statement detects if we need to change it to a different value.

alarm = '8:00 am'
if is_weekend:
    alarm = ''
# Alarm is now set, one way or another.
# Lines below just use it.

Equally, the following form is fine, but the above is a little shorter and is commonly used.

if not is_weekend:
    alarm = '8:00 am'
else:
    alarm = ''
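
A small sketch wrapping the default pattern in a function, so both cases can be tried. The function name alarm_time is hypothetical, introduced only for this demonstration.

```python
def alarm_time(is_weekend):
    # Hypothetical wrapper around the set-default pattern
    alarm = '8:00 am'     # (1) default, common value first
    if is_weekend:
        alarm = ''        # (2) change it only for the special case
    return alarm


print(repr(alarm_time(False)))
print(repr(alarm_time(True)))
```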

Strategy: Better/Shorter - Unify Lines

if case-1:
    lines-A
    ...
    ...

if case-2:
    lines-B
    ...
    ...

Suppose we have
case-1 .. solved by lines-A
case-2 .. solved by lines-B
But lines-A and lines-B are similar
Maybe did copy/paste lines-A and lines-B
Unify - re-structure the code so one set of lines works for both cases
Move the difference between lines-A and lines-B into a variable
Hard to describe in the abstract, let's look at an example

Aside: Copy / Paste

Copy/paste of some lines can be ok
Sometimes lines of code are very similar
e.g. Doctests
Common bug: paste lines in, but forget to update in some spot

What's better than having two sets of lines to solve something? One set of lines that solves both cases!

speeding() Example

> speeding()

speeding(speed, is_birthday): Compute speeding ticket fine as function of speed and is_birthday boolean. Rule: speed under 50, fine is 100, otherwise 200. If it's your birthday, the allowed speed is 5 mph more. Challenge: change this code to be shorter, not have so many distinct paths.

The code below works correctly. You can see there is one set of lines for the birthday case, and another set of similar lines for the not-birthday case. What exactly is the difference between these two sets of lines?

def speeding(speed, is_birthday):
    if not is_birthday:
        if speed < 50:
            return 100
        return 200
    
    # is birthday
    if speed < 55:
        return 100
    return 200

Unify Cases Solution

The 2 "speed" if-statements look really similar
They differ by the value in the if-test 50 vs. 55
Solution: introduce variable: limit
1. If/logic sets limit for the various cases
2. Unified code below just uses limit, works for all cases

speeding() Better Unified Solution

1. Set limit first. 2. Then unified lines below use limit, work for all cases.

def speeding(speed, is_birthday):
    limit = 50
    if is_birthday:
        limit = 55
    
    if speed < limit:
        return 100
    return 200
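
One way to gain confidence that a rewrite like this preserves behavior is to run both versions over a range of inputs and compare. The sketch below does that. The name speeding_v1 and the range of speeds tested are choices made for this demonstration only.

```python
def speeding_v1(speed, is_birthday):
    # Original form: separate lines for each case
    if not is_birthday:
        if speed < 50:
            return 100
        return 200

    # is birthday
    if speed < 55:
        return 100
    return 200


def speeding(speed, is_birthday):
    # Unified form: logic sets limit, then one set of lines uses it
    limit = 50
    if is_birthday:
        limit = 55

    if speed < limit:
        return 100
    return 200


# Collect any inputs where the two versions disagree
mismatches = []
for speed in range(40, 61):
    for is_birthday in [False, True]:
        if speeding_v1(speed, is_birthday) != speeding(speed, is_birthday):
            mismatches.append((speed, is_birthday))

print(mismatches)
```

An empty list means the two versions agreed on every input tried.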

Example ncopies()

> ncopies()

ncopies: word='bleh' n=4 suffix='@@' ->

   'bleh@@bleh@@bleh@@bleh@@'

Change this code to be better / shorter. Look at lines that are similar - make a unified version of those lines.

ncopies(word, n, suffix): Given word string, int n, suffix string, return n copies of word + suffix. If suffix is the empty string, use '!' as the suffix. Challenge: change this code to be shorter, not have so many distinct paths.

Before:

def ncopies(word, n, suffix):
    result = ''
    
    if suffix == '':
        for i in range(n):
            result += word + '!'
    else:
        for i in range(n):
            result += word + suffix
    return result

ncopies() Unified Solution

Solution: use logic to set an ending variable to hold what goes on the end for all cases. Later, unified code uses that variable vs. separate if-stmt for each case. Alternately, could use the suffix parameter as the variable, changing it to '!' if it's the empty string.

def ncopies(word, n, suffix):
    result = ''
    ending = suffix
    if ending == '':
        ending = '!'
    
    for i in range(n):
        result += word + ending
    return result
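
The alternate approach mentioned above, re-using the suffix parameter as the variable, might look like the following sketch. It behaves the same as the ending version and saves one line.

```python
def ncopies(word, n, suffix):
    result = ''
    # Re-use the parameter itself as the variable
    if suffix == '':
        suffix = '!'

    for i in range(n):
        result += word + suffix
    return result


print(ncopies('bleh', 4, '@@'))
print(ncopies('bleh', 2, ''))
```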

(optional) match()

> match()

match(a, b): Given two strings a and b. Compare the chars of the strings at index 0, index 1 and so on. Return a string of all the chars where the strings have the same char at the same position. So for 'abcd' and 'adddd' return 'ad'. The strings may be of any length. Use a for/i/range loop. The starter code works correctly. Re-write the code to be shorter.

Before:

def match(a, b):
    result = ''
    if len(a) < len(b):
        for i in range(len(a)):
            if a[i] == b[i]:
                result += a[i]
    else:
        for i in range(len(b)):
            if a[i] == b[i]:
                result += a[i]
    return result

match() Unified Solution

def match(a, b):
    result = ''
    # Set length to whichever is shorter
    length = len(a)
    if len(b) < len(a):
        length = len(b)

    for i in range(length):
        if a[i] == b[i]:
            result += a[i]

    return result
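
As an aside, Python's built-in min() function can compute the shorter length in one line. This is a variation on the solution above, not the form used in lecture, sketched here only to show that the length variable can be set in different ways while the unified loop stays the same.

```python
def match(a, b):
    result = ''
    # Set length to whichever is shorter, using built-in min()
    length = min(len(a), len(b))

    for i in range(length):
        if a[i] == b[i]:
            result += a[i]

    return result


print(match('abcd', 'adddd'))
```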

Reminder Avoid: `== True, == False`

# have a "is_raining" boolean var


if is_raining == True:  # NO do not write this
    ...


if is_raining:          # YES, this way
    ...

Boolean Expressions

See the guide for details: Boolean Expression

Boolean operators: and or not
Mixture of these, can add parentheses to force order of operation
"precedence" in CS parlance
Say we have three variables
age - an int, say age is good if less than 30
is_raining - boolean, True if raining
is_weekend - boolean, True if it's the weekend
Define: to be a good day, need two things:
1. it must not be raining
2. then either age is under 30 or it's the weekend

The code below looks reasonable, but doesn't quite work right

def good_day(age, is_weekend, is_raining):
    if not is_raining and age < 30 or is_weekend:
        print('good day')

Boolean Precedence:

not = highest (like - in -7)
and = next highest (like *)
or = lowest (like +)

What The Above Does

Because and has higher precedence than or, the code as written above acts like the following (the and evaluates before the or):

   if (not is_raining and age < 30) or is_weekend:

You can tell the above does not work right, because any time is_weekend is True, the whole thing is True, regardless of age or rain. This does not match the good-day definition above, which requires that it not be raining.
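
To make the problem concrete, here is a small sketch with made-up values: it is raining, age is 40, and it is the weekend. By the good-day definition this should not be a good day, yet the expression as written comes out True.

```python
# Hypothetical values: raining, age 40, weekend
is_raining = True
age = 40
is_weekend = True

# The expression exactly as written in good_day()
as_written = not is_raining and age < 30 or is_weekend

# The same expression with the grouping Python applies made explicit
explicit_grouping = (not is_raining and age < 30) or is_weekend

print(as_written, explicit_grouping)
```

Both come out True, since is_weekend alone is enough to satisfy the or.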

Boolean Precedence Solution

The solution we will spell out is not difficult.

Many programmers do not have boolean precedence memorized .. fine
Do remember that "not" is the highest precedence
Solution: note when you have a mixture of and + or
When there is a mixture, the precedence will matter
Put in parentheses to set the order you want
We will never complain about extra parentheses, so add them to spell out the order you want
In this case, put parens to group the or part, separating from not-raining
BTW similar logic applies to math - if there's a mixture of * and +, add parentheses

Solution

def good_day(age, is_weekend, is_raining):
    if not is_raining and (age < 30 or is_weekend):
        print('good day')
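
To try the parenthesized test on a few inputs, the sketch below uses a variant that returns the boolean instead of printing. The name is_good_day is hypothetical, introduced only so the result can be checked.

```python
def is_good_day(age, is_weekend, is_raining):
    # Hypothetical variant of good_day() that returns the boolean.
    # The parentheses group the or part, separate from not-raining.
    return not is_raining and (age < 30 or is_weekend)


print(is_good_day(40, True, True))     # raining on a weekend
print(is_good_day(40, True, False))    # dry weekend
print(is_good_day(25, False, False))   # dry weekday, age under 30
print(is_good_day(40, False, False))   # dry weekday, age 40
```

With the parentheses in place, a rainy weekend is no longer counted as a good day.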

Boolean Exercise oh_no()

(Got this far in lecture - exercise TBD)

> oh_no()