Mystery Of Code
- Code seems like the easiest thing in the world
- But it's deceptively difficult
- It's easy for an organization to create a mass of code that never works right
It keeps absorbing programmer hours but is never debugged
- We need to bring discipline, good style to create real working code
- CS106A has this in mind from the start
- Readable Code
- Eye sees the code text
- What the code does is apparent
- The code "reads"
Read each line
Follow the narrative idea
Helped by good var names, e.g.
Helped by good function names
- Why do we care?
- Fewer bugs!
- What is a bug?
- The code does something different from our intent
- i.e. looked at code, did not see what it actually did
- Techniques: good variable names, good function names, decomposition, spacing, comments
Good Variable Names - Readable 1.0
- Variable name = what value does this hold?
- The code is a story
- Do this, do this, do this
- Variable names label the values progressing through the story
- The payoff of readable code is right now
- e.g. left and right in the parens example below
- Tension: shorter var name, less space, easy to type
- Longer var names: better spell out what is in the var
- Do not: spell out every true thing about the value
- Do: label concept sufficiently to distinguish from others in this function
Variable Names Pay Off Right Now
You are writing a 10 line function. You have data that moves and changes from line to line. You need to keep track of these values in your own mind as you go from line to line. Good variable and function names are a big help here.
Parens Example Code
Here is some example code from a few lectures ago:
def parens(s):
    """
    Look for a '(.....)' within s -
    look for the first '(' in s, then the
    first ')' after it. If both are found,
    return the chars between them, so
    'xxx(abc)xxx' yields 'abc'. If
    no such pair of parens is found, return the
    empty string.
    """
    left = s.find('(')
    if left == -1:
        return ''
    right = s.find(')', left + 1)
    if right == -1:
        return ''
    return s[left + 1:right]
Variable Names Could Have Used for "left"
- Identify the noun/role within this function
- Distinguish from the other nouns here
- Do not record every true thing about it .. too long
int_index_of_left_paren # Too long.
# Do not spell out
# every true thing.
index_of_left_paren # Too long.
left_index # fine
left # fine
li # too short/cryptic
l # too short, and don't use "l"
Exceptions: Idiomatic 1 Letter / Short Var Names
- There are a few idiomatic 1 letter / short names:
s - idiomatic generic string
ch - idiomatic for single char in string
i, j - idiomatic index loop: 0, 1, 2, ... max-1
n - idiomatic generic int value
x, y - idiomatic x, y 2-d coordinates
f - idiomatic opened file
lst - idiomatic list variable (soon)
d - idiomatic dict variable (soon)
- Never name a variable lowercase l or uppercase O - they look like the digits 1 and 0
- Notice that the 1-letter name "s" is fine in above fn
There is nothing semantic about s we are trying to keep track of
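As a small illustration of these idiomatic names in use, here is a sketch. The function names count_digits and first_digit_index are made up for this example.

```python
def count_digits(s):
    """Return the number of digit chars in string s."""
    n = 0
    for ch in s:              # ch: idiomatic name for a single char
        if ch.isdigit():
            n += 1
    return n


def first_digit_index(s):
    """Return the index of the first digit in s, or -1 if there is none."""
    for i in range(len(s)):   # i: idiomatic index 0, 1, ... len(s)-1
        if s[i].isdigit():
            return i
    return -1


print(count_digits('a1b22'))        # 3
print(first_digit_index('ab7c'))    # 2
```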
Decomp By Var Strategy
- You have something complicated to compute
- Could write it as one big line
- Instead, break it into separate lines
- Store partial results in variables as you go
- This is a form of divide and conquer!
- Use variables to take on the problem piece by piece
- Breaking a long horizontal line into vertical steps
- Lecture examples very frequently decomp by var like this
- Remember this strategy if you have a long line and cannot get it working
Decomp By Var Example Problem 'x3412y'
This is a classic make-a-drawing index problem. Getting this perfect is not so easy.
Function: Given a string s of even length, if the string length is 2 or less,
return it unchanged.
Otherwise take off the first and last chars.
Consider the remaining middle piece.
Split the middle into front and back halves.
Swap the order of these two halves, and return
the whole thing with the first and last chars restored.
So 'x1234y' returns 'x3412y'.
Decomp By Var Solution
The variable names here help us keep the various parts clear through the narrative, even at the moment we are working out each line. The variable names are naturally similar to those in the specification.
if len(s) <= 2:
    return s
first = s[0]
last = s[len(s) - 1]
mid = s[1:len(s) - 1]
halfway = len(mid) // 2
return first + mid[halfway:] + mid[:halfway] + last
The variable names don't have to be super detailed. Just enough to label the concepts through this narrative. Note that the one letter "s" is fine - there is nothing semantic about s that we need to keep track of beyond it's a string. In contrast, "first" "last" etc. have specific roles in the algorithm.
Point here: writing this function with a blank screen. Use good variable names to pick off and name parts of the problem as you work ahead.
The variables are sort of divide-and-conquer within the function - separate out and name individual steps of the algorithm vs. doing it in 1 big jump.
Bad Variable Name Solution
Here is the above function written without any good variables. Just because something is 1 line does not make it better. I believe it's correct, but it's hard to tell!
This is a good example of not readable.
if len(s) <= 2:
    return s
return (s[0] + s[1:len(s) - 1][(len(s) - 2) // 2:] +
        s[1:len(s) - 1][:(len(s) - 2) // 2] + s[len(s) - 1])
The bad code also repeats computations, like (len(s) - 2) // 2. The good solution computes that value once and stores it in the variable halfway for use by later lines.
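One way to check the claim that the dense version computes the same thing as the readable one is to run both on the same inputs. The sketch below wraps each version in a function. The names swap_clear and swap_dense are made up here, and both versions assume the first char is taken as s[0].

```python
def swap_clear(s):
    """Decomp-by-var version: each part gets its own name."""
    if len(s) <= 2:
        return s
    first = s[0]
    last = s[len(s) - 1]
    mid = s[1:len(s) - 1]
    halfway = len(mid) // 2
    return first + mid[halfway:] + mid[:halfway] + last


def swap_dense(s):
    """Same computation crammed into one expression."""
    if len(s) <= 2:
        return s
    return (s[0] + s[1:len(s) - 1][(len(s) - 2) // 2:] +
            s[1:len(s) - 1][:(len(s) - 2) // 2] + s[len(s) - 1])


# Both versions should agree on every even-length input.
for s in ['', 'ab', 'abcd', 'x1234y', 'x123456y']:
    assert swap_clear(s) == swap_dense(s)

print(swap_clear('x1234y'))   # x3412y
```

The two functions agree, but only the first one can be checked by reading it.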
Big Picture Software Costs - N-Squared
- A software project might be planned to take 2 months
- And 2 years later, it still doesn't really work
- How is that possible?
- It all comes down to n-squared
N Squared Trap
- The central insight that drives program design
- Decomposition is a result of this fact
- Question: how much work is 500 line program vs. a 1000 line program?
- How many hours does it take as the number of lines goes up?
- Intuitive but wrong guess - it goes up linearly
- CS experience: it's much worse than that
- Difficulty goes up as the square of number of lines
- It's a concave-up curve
What Does N-Squared Mean?
- Number of lines too large .. never enough time to debug it
- Do not write a 1000 line program
- We can write a series of 20 line programs
- Decomposition is about getting to the left in that picture
- A series of functions, each with just a few lines
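To put rough numbers on this, here is a toy calculation. It assumes effort is exactly proportional to the square of the line count, which is a simplification for illustration, not a measured result. It also ignores the work of fitting the small functions together.

```python
def effort(lines):
    """Toy model: effort grows as the square of the number of lines."""
    return lines ** 2


# 1000 lines vs. 500 lines: twice the lines, four times the effort
print(effort(1000) // effort(500))   # 4

one_big = effort(1000)               # one 1000-line program
many_small = 50 * effort(20)         # fifty 20-line functions, 1000 lines total

print(one_big)                  # 1000000
print(many_small)               # 20000
print(one_big // many_small)    # 50
```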
Decomposition - Escape N-Squared Trap
- Decompose program into separate functions
- Never have all the lines in your head at one time
- Escape the N-Squared trap
- "Abstraction" - the contract a function presents to callers - simple
AKA the in/out contract of the function
- "Implementation" - the internal code of a function - complicated
Black Box Model - Abstraction vs. Implementation
- "Black Box" model of a function
- 1. "Abstraction"
- External contract what this function does
- What goes in? (the params)
- What comes out? (the return value)
- What does it compute given its params?
- """what is in the triple-quote string"""
- 2. Implementation Details
All the code inside the function, complicated
The word "detail" is used here
- Q: Does the caller need to know the internal details of the function?
- A: No!
- The strategy is to hide "implementation detail" inside the function
- Calling a function, just need to know what it accomplishes
- Calling a function is simple relative to its internal details
Abstraction vs. Implementation - Ride To Airport
- What does a function accomplish for its caller?
- Aka "the contract" of the function
- Aka the "abstraction" of the function
- Lots of details of implementation are not in the contract:
- How it works internally, how many loops ..
- The caller does not need to know about that
- A nice simplification
- Ride to airport abstraction:
Pick up time and place
Drop off time and place
Ride shared with others
- Ride to airport implementation details, don't care about details:
Car has LED headlights?
Color of the seats?
Is the driver wearing a hat?
Is the gas tank more than 1/2 full?
.. we care about drop off, which covers the detail about having enough gas to get there
- The word "detail" is often used for the implementation
- The point: abstraction is much simpler than implementation
- Calling a function should be easy
Strategy - Within Each Function
- Remember big picture
Avoid having all N lines in your head at once
- 1. Work on function1()
Look at function1 params
Work on function1 implementation to return result
Here, must understand function1 implementation
- 2. Work on function2()
Look at function2 params
Work on function2 implementation
Call function1, get back result (abstraction)
Do not think about function1 implementation
- With each function, concentrate on just its implementation
- Build on other function abstractions
- Shielded from other function implementation complexity
- aka work on 1-function-at-a-time
- Escape the n-squared trap, never think about all the lines at once
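The 1-function-at-a-time strategy might look like the sketch below. The functions is_vowel and count_vowels are hypothetical examples standing in for function1 and function2.

```python
def is_vowel(ch):
    """Given a single char ch, return True if it is a vowel, False otherwise."""
    return ch.lower() in 'aeiou'


def count_vowels(s):
    """Given string s, return the number of vowels in it."""
    count = 0
    for ch in s:
        # Rely only on the is_vowel() contract, not on how it is coded.
        if is_vowel(ch):
            count += 1
    return count


print(is_vowel('A'))              # True
print(count_vowels('Stanford'))   # 2
```

While writing count_vowels(), only the one-line contract of is_vowel() needs to be in mind. Its implementation could change without count_vowels() changing at all.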
Abstraction in CS
- Working on bigger problems
- You will constantly call some function you did not write
- Depend on its abstraction, not worrying about its implementation
- It is hard to overstate how much we depend on this pattern to build computer systems
import os
from datetime import datetime

# Get list of filenames in named directory
filenames = os.listdir('Downloads')
# Get the current date and time
now = datetime.now()
Mechanics: Fn name, PyDoc, Doctests
- 1. Have a good verb function name
capture what it does, so calling code "reads" nicely
e.g. is_url_sketchy(), delete_files()
- 2. List of params with good names - the inputs
- Given these inputs, computes and returns what?
- What does this function do?
- i.e. the "contract" the function provides to its callers
- Summarize the contract within Pydoc """triple quotes"""
Given params X Y Z
- We've seen this many times
- Can delete the ":param s: " stuff PyCharm puts in, not needed at this level
- The Doctests are another way to express the contract
def del_chars(s, target):
    """
    Given string s and a "target" string,
    return a version of s with all chars that
    appear in target removed, e.g. s 'abc'
    with target 'bx', returns 'ac'.
    (Not case sensitive)
    >>> del_chars('abC', 'acx')
    'b'
    >>> del_chars('ABc', 'aCx')
    'B'
    >>> del_chars('', 'a')
    ''
    """
    result = ''
    target = target.lower()
    # could use "for char in s" form, since not using index
    for i in range(len(s)):
        if s[i].lower() not in target:
            result += s[i]
    return result
Coding Advice - vs. Anti Pattern
Divide this program into separate functions (black box). Test the helper functions first. When they are working, then test the bigger function that calls them. Not the other order!
Anti-pattern: type in all the code. Debug nothing. When it's all typed in, run it and try to debug all the code concurrently. Never do this!
- We have used for/range to index into string
- This highlighted the role of index numbers in string algorithms
Which I wanted to do!
- But to be honest, there's an easier way to iterate over all the chars in a string
for ch in s:
Loops over all the chars in s, left to right
You do not get the index number here, just the char
No need to use square brackets in this form [ ]
- The variable name ch is idiomatic for a variable holding a single char
- Use this form if you do not need access to index numbers
- Use the for/range form if you need access to index numbers
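A short sketch contrasting the two loop forms. The function names count_char and has_double are made up for this example.

```python
def count_char(s, target):
    """Return how many times the char target appears in s."""
    count = 0
    for ch in s:                   # foreach form: only the chars are needed
        if ch == target:
            count += 1
    return count


def has_double(s):
    """Return True if some char in s is immediately followed by the same char."""
    for i in range(len(s) - 1):    # index form: needs both s[i] and s[i + 1]
        if s[i] == s[i + 1]:
            return True
    return False


print(count_char('banana', 'a'))   # 3
print(has_double('hello'))         # True
```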
String Foreach Examples
double_char2() example with foreach
result = ''
for ch in s:
    result = result + ch + ch
return result
- "list" type stores a linear collection of any type of python value
- Use list to store many of something
e.g. a thousand urls - a list of url strings
e.g. a million temperature readings - a list of float values
- Things in a list called "elements"
- Theme: python tries to be uniform: len(), square brackets, list works the same as string
- "lst" is a generic list variable name
1. Use square brackets [..] to write a list in code (a "literal" list value), separating elements with commas
>>> lst = ['a', 'b', 'c']
"empty list" is just 2 square brackets with nothing within: []
2. Use len() - number of elements
3. Use square brackets to access an element in a list (bad index err possible)
IndexError: list index out of range
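The three basics above can be tried out in a few lines. This is a minimal sketch with made-up values.

```python
lst = ['a', 'b', 'c']     # literal list
empty = []                # empty list

print(len(lst))     # 3
print(len(empty))   # 0
print(lst[0])       # a
print(lst[2])       # c

try:
    lst[3]          # valid indexes are 0..2, so this is an error
except IndexError as err:
    print(err)      # list index out of range
```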
The big difference from strings is that lists are mutable - lists can be changed. Elements can be added, removed, changed over time.
1. List Append
- Lists can contain any type (today int, str)
lst.append('something') - adds an elem to end of list
- Modifies the list, returns nothing
- Common list-build pattern:
# 1. make empty list, then call .append() on it
>>> lst = []
>>> lst.append('a')
>>> lst.append('b')
>>> lst.append('c')
>>> lst
['a', 'b', 'c']
# 2. Similar, using loop/range to call .append()
>>> lst = []
>>> for i in range(6):
...     lst.append(i * 10)
...
>>> lst
[0, 10, 20, 30, 40, 50]
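Since .append() modifies the list and returns nothing, assigning its result back to the variable is a common mistake. A small sketch of what happens:

```python
lst = []
result = lst.append('a')   # append changes lst in place

print(lst)      # ['a']
print(result)   # None

# Mistake: assigning the return value of append() back to the variable
nums = [1, 2]
nums = nums.append(3)
print(nums)     # None - the list has been lost
```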
2. List "in" / "not in" Tests
- The in operator tests if a value is in a list
- not in works too, reads nicely
- (again, very analogous to string)
>>> lst = ['a', 'b', 'c']
>>> 'c' in lst
True
>>> 'x' in lst
False
>>> 'x' not in lst  # this form is preferred
True
>>> not 'x' in lst  # equivalent to above
True
3. Foreach On List
- The for "foreach" loop works to loop over the elements in a list
- This is a common code pattern, since looking at all the elements is a common problem
- No change: do not add, remove, or change list elements during iteration
Kind of reasonable rule: how would iteration work if elements left and appeared in the midst of iteration?
>>> lst = ['a', 'b', 'c']
>>> for s in lst:
... # use s in here
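One way to respect the no-change rule is to build a new list during the loop instead of modifying the one being iterated over. A sketch, with the hypothetical function name remove_empty:

```python
def remove_empty(strs):
    """Given a list of strings, return a new list without the empty strings."""
    result = []
    for s in strs:          # strs itself is never changed inside the loop
        if s != '':
            result.append(s)
    return result


words = ['a', '', 'b', '']
print(remove_empty(words))   # ['a', 'b']
print(words)                 # ['a', '', 'b', ''] - original unchanged
```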
4. list.index(target) - Find Index of Target
- Somewhat similar to str.find()
list.index(target) - returns int index of target if found
- Problem: only works if target is in the list
- Code should check with in first, only call .index() if in is True
- This design is annoying
It would be easier if .index() just returned -1 like str.find(), but it doesn't
list.index(target, start_index) - begin search at start_index instead of 0
>>> lst = ['a', 'b', 'c']
>>> lst.index('d')
ValueError: 'd' is not in list
>>> 'd' in lst
False
>>> 'c' in lst
True
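The check-with-in-first pattern can be wrapped up as below. The helper name index_or_minus_one is made up for this sketch. It is not a built-in.

```python
def index_or_minus_one(lst, target):
    """Return the index of target in lst, or -1 if target is not in lst."""
    if target in lst:
        return lst.index(target)
    return -1


lst = ['a', 'b', 'c', 'b']
print(index_or_minus_one(lst, 'c'))   # 2
print(index_or_minus_one(lst, 'd'))   # -1
print(lst.index('b'))                 # 1
print(lst.index('b', 2))              # 3 - search starts at index 2
```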
List Code Examples
- 1. list1() - create list [1, 2, 3, ..n] - use range and append
- 2: list100() - create [100, 101, 102 ..]
- 3: list_censor() - similar but filter out values in "censor" list, use in
- 4. post_donut() - use .index() to find something
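As a rough idea of what the first and third problems involve, here is one possible sketch. The parameter lists (n, and n plus a censor list) are assumptions made for this example, not the official problem statements.

```python
def list1(n):
    """Return the list [1, 2, 3, ..n], built with range and append."""
    result = []
    for i in range(n):
        result.append(i + 1)
    return result


def list_censor(n, censor):
    """Return [1, 2, 3, ..n], leaving out any value that is in the censor list."""
    result = []
    for i in range(n):
        num = i + 1
        if num not in censor:
            result.append(num)
    return result


print(list1(3))                  # [1, 2, 3]
print(list_censor(5, [2, 4]))    # [1, 3, 5]
```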
Constants in Python
STATES = ['CA', 'NY', 'NV', 'KY', 'OK']
- Simple form name=value at far left, not within a def
- This is a type of "global" variable
- A variable not inside a function
- In this case it's in effect a constant
- Functions can just refer to STATES to get its value
- Convention: upper case means it's a de facto constant
- Best style: a read-only value, don't modify
- Python does not enforce this for us
- Modified global variables are iffy style, we don't do it
- Can have "global" declaration
We'll never do this, enables read/write that we do not do
# provided ALPHABET constant - list of the regular alphabet
# in lowercase. Refer to this simply as ALPHABET in your code.
# This list should not be modified.
ALPHABET = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
for ch in ALPHABET: # this works
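A function reading a constant might look like the sketch below. The function is_state is a made-up example. It only reads STATES and never modifies it.

```python
STATES = ['CA', 'NY', 'NV', 'KY', 'OK']   # constant: read, never modified


def is_state(code):
    """Return True if code is one of the codes in STATES, ignoring case."""
    return code.upper() in STATES


print(is_state('ny'))   # True
print(is_state('TX'))   # False
```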
main() - Monday
- Need to show you how to write a main()
- Uses lists
- Last bit of HW4