Today: find()/slice practice, text files, file-reading, crazycat example program

find() + slice exercises, no loops

> find() + slice exercises

Aside: Mental Model Skill

Aside: Off By One Error

Int indexing into something is very common in computer code, so doing it slightly wrong is very common as well. It is so common that there is a phrase for it, "off by one error" or OBO, which even has its own Wikipedia page. You can feel some kinship with other programmers each time you stumble on one of these.
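A small sketch of where off-by-one errors tend to show up with strings. The string 'Python' and the variable names here are made up for illustration.

```python
s = 'Python'

# Valid indexes run from 0 through len(s) - 1.
last = s[len(s) - 1]      # 'n'

# Classic off-by-one: index len(s) is one past the end.
try:
    s[len(s)]
    went_past_end = False
except IndexError:
    went_past_end = True  # this branch runs

# A slice end index is "up to but not including".
first_three = s[0:3]      # 'Pyt'
```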

"This code is perfect! Why is this not working. Why is this not work ... oh, off by one error. We meet again."

right_left()

'aabb' -> 'bbbbaaaa'

We'll say the midpoint of a string is index len // 2, dividing the string into a left half before the midpoint and a right half starting at the midpoint. Given string s, return a new string made of 2 copies of right followed by 2 copies of left. So 'aabb' returns 'bbbbaaaa'.

Solution

def right_left(s):
    mid = len(s) // 2
    left = s[:mid]
    right = s[mid:]
    return right + right + left + left
    # Style comparison:
    # Without using any variables, the solution
    # is longer and not so readable:
    # return s[len(s) // 2:] + s[len(s) // 2:] + s[:len(s) // 2] + s[:len(s) // 2]

Notice the decomp-by-var strategy: break the computation into smaller, named parts. A big improvement.

Lesson: if you have a line you cannot get working, can you divide it into a few lines, each a sub-part? This is our old divide-and-conquer strategy, working at the level of lines.

at_3()

This looks simple, but the details are tricky. Make a drawing.

Given string s. Find the first '@' within s. Return the len-3 substring immediately following the '@'. Except, if there is no '@' or there are not 3 chars after the @, return ''.

Suggestion: what is the index of the last char we want to pull out? If that index is beyond the valid chars in s, then the string is not long enough and we return the empty string.

Solution

def at_3(s):
    at = s.find('@')
    if at == -1:
        return ''
    # Is at + 3 past end of string?
    # Could "or" combine with above
    if at + 3 >= len(s):
        return ''
    return s[at + 1:at + 4]
    # Working out >= above ... drawing!
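The comment in the solution mentions that the two checks could be combined with "or". Here is one way that might look, with a few boundary cases to check the >= against. The name at_3_combined is made up for this sketch.

```python
def at_3_combined(s):
    at = s.find('@')
    # The last char we want is at index at + 3.
    # It must be a valid index, i.e. less than len(s).
    if at == -1 or at + 3 >= len(s):
        return ''
    return s[at + 1:at + 4]


exactly_enough = at_3_combined('xx@abc')   # 'abc'
one_short = at_3_combined('xx@ab')         # ''
no_at = at_3_combined('abc')               # ''
```

Because "or" stops as soon as its left side is True, the at + 3 test only matters when an '@' was actually found.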

parens

This is a nice, realistic string problem with a little logic in it.

s.find() variant with 2 params: s.find(target, start_index) - start search at start_index vs. starting search at index 0.

Given string s. Look for a '(.....)' within s - look for the first '(' in s, then the first ')' after the '(', using the second start_index parameter of .find(). If both parens are found, return the chars between them, so 'xxx(abc)xxx' returns 'abc'. If no such pair of parens is found, return the empty string. Think about the input '))(abc)'

Solution

def parens(s):
    left = s.find('(')
    if left == -1:
        return ''
    # Start search at left + 1:
    right = s.find(')', left + 1)
    if right == -1:
        return ''
    # Use slice to pull out chars between left/right
    return s[left + 1:right]
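To see why the start_index parameter matters, here is a trace of the tricky input '))(abc)' as a short sketch. The variable names wrong_right and between are just for illustration.

```python
s = '))(abc)'

# Without a start index, find() reports the very first ')'.
wrong_right = s.find(')')        # 0, which is before the '('

left = s.find('(')               # 2
# Starting the search at left + 1 skips the earlier ')' chars.
right = s.find(')', left + 1)    # 6

between = s[left + 1:right]      # 'abc'
```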

print(), Files, Standard Out

See guide: Python Print for details about print() and standard out.

See guide: Python File for details about file reading and writing.

Standard Out

crazycat example

We'll use the crazycat example to demonstrate files, file-processing, and printing.

crazycat.zip

What is a Text File?

Backslash Chars in a String

Use backslash \ to include special chars in a string

s = 'isn\'t'
# or use double quotes
# s = "isn't"

\n  newline char
\\  backslash char
\'  single quote
\"  double quote

hibye.txt Text Example

2 lines, each line has a '\n' at the end. The first line has a space, aka ' ', between the two words.

Hi and
bye

Here is what that file looks like in an editor that shows little gray marks for the space and \n

alt: hibye.txt chars, showing \n ending each line

In fact, the contents of that file can be expressed as a Python string:
'Hi and\nbye\n'

How many chars are in that file (each \n is one char)? Roman alphabet A-Z chars like this take up 1 byte per char. This comes to 11 chars. Look in your file-system explorer on your computer, get-info on the file. See if it's 11 bytes in size.
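The char count can be checked in Python. This sketch assumes the file holds exactly the string shown above, and uses the UTF-8 encoding to count bytes, in which plain English letters, spaces, and \n each take 1 byte.

```python
text = 'Hi and\nbye\n'

# Each backslash sequence such as \n is a single char in the string.
num_chars = len(text)                    # 11
num_bytes = len(text.encode('utf-8'))    # 11

# The same one-char rule holds for the other backslash sequences.
one_backslash = len('\\')                # 1
same_string = ('isn\'t' == "isn't")      # True
```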

So when you send a 50 char text message, that's about 50 bytes sent on the network plus some overhead. Text uses very few bytes compared to sound, images, or video.

print() function

>>> print('hello', 'there', '!')
hello there !
>>> print('hello', 123, '!')
hello 123 !
>>> print('hello', 123, '!', sep=':')
hello:123:!
>>> print(1, 2, 3)               # end='\n' the default
1 2 3
>>> print(1, 2, 3, end='xxx\n')  # end= what goes at end
1 2 3xxx
>>> print(1, 2, 3, end='')       # suppress the \n
1 2 3>>>

Data out of function: return vs. print

Return and print() are both ways to get data out of a function, so they are easily confused with each other. When specifying a function, we will be careful to say whether it should "return" a value or "print" something to standard output. Return is the most common way to communicate data out of a function, but below are some print examples.
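A minimal sketch of the difference, using two made-up functions, double_return() and double_print(). The caller can store and reuse a returned value, while a printed value only goes to standard output.

```python
def double_return(n):
    # Hands the value back to the caller.
    return n * 2


def double_print(n):
    # Writes the value to standard output; nothing is returned.
    print(n * 2)


x = double_return(5)    # x is 10, nothing is printed
y = double_print(5)     # prints 10, but y is None
```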

Crazycat Program example

crazycat.zip

1. Try "ls" and "cat" in terminal

Open a terminal in the crazycat directory (see the Command Line guide for more information on running in the terminal). These terminal commands work on both Mac and Windows. When you type a command in the terminal, you are typing it directly to the operating system that runs your computer: Mac OS, Windows, or Linux.

pwd - print out what directory we are in

ls - see list of filenames ("dir" on older Windows)

cat *filename* - see file contents ("type" on older Windows)

$ ls
alice.txt       crazycat.py     hibye.txt       poem.txt        quotes
$ cat poem.txt 
Roses Are Red
Violets Are Blue
This Does Not Rhyme
$

2. Run crazycat.py with filename

$ python3 crazycat.py poem.txt 
Roses Are Red
Violets Are Blue
This Does Not Rhyme
$ python3 crazycat.py hibye.txt 
Hi and
bye
$

3. Canonical File-Read Code

Here is the canonical file-reading code:

with open(filename) as f:
    for line in f:
        # use line in here

Visualization of how the variable "line" behaves for each iteration of the loop:

alt:file read loop, gets one line at a time from file

print_file_plain()

Here is the working function to print the contents of a file. Why do we need end='' here? The line already has \n at its end, so we would get double spacing if print() added its standard \n.

def print_file_plain(filename):
    with open(filename) as f:
        for line in f:
            # use line in here
            print(line, end='')
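To see that each line really does carry its own \n, here is a sketch that collects the lines into a list instead of printing them. The helper read_lines() is hypothetical, and the sketch writes its own temporary copy of hibye.txt so it can run anywhere.

```python
import os
import tempfile


def read_lines(filename):
    """Return a list of the lines in the file, newline chars included."""
    lines = []
    with open(filename) as f:
        for line in f:
            lines.append(line)
    return lines


# Write a small file with the same contents as hibye.txt.
path = os.path.join(tempfile.mkdtemp(), 'hibye.txt')
with open(path, 'w') as f:
    f.write('Hi and\nbye\n')

lines = read_lines(path)    # ['Hi and\n', 'bye\n']
```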

Optional: Run With -crazy Command Line Option

def crazy_line(line):
    """
    Given a line of text, returns a "crazy" version of that line,
    where upper/lower case have all been swapped, so 'Hello'
    returns 'hELLO'.
    >>> crazy_line('Hello')
    'hELLO'
    >>> crazy_line('@xYz!')
    '@XyZ!'
    >>> crazy_line('')
    ''
    """
    result = ''
    for i in range(len(line)):
        char = line[i]
        if char.islower():
            result += char.upper()
        else:
            result += char.lower()
    return result
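As an aside, Python strings also have a built-in s.swapcase() method. The sketch below copies crazy_line() from above and checks that the two agree on the doctest inputs. This is only a comparison on these examples, not a claim that they match for every possible character.

```python
def crazy_line(line):
    # Copy of the loop version from the notes above.
    result = ''
    for i in range(len(line)):
        char = line[i]
        if char.islower():
            result += char.upper()
        else:
            result += char.lower()
    return result


# Compare against the built-in on the doctest inputs.
examples = ['Hello', '@xYz!', '']
all_match = True
for s in examples:
    if crazy_line(s) != s.swapcase():
        all_match = False
# all_match is True for these three inputs
```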

Here is the command line to run with the -crazy option:

$ python3 crazycat.py -crazy poem.txt 
rOSES aRE rED
vIOLETS aRE bLUE
tHIS dOES nOT rHYME

Here is print_file_crazy(), similar to print_file_plain() but passes each line through the crazy_line() function before printing.

def print_file_crazy(filename):
    """
    Given a filename, read all its lines and print them out
    in crazy form.
    """
    with open(filename) as f:
        for line in f:
            print(crazy_line(line), end='')

int str Types and Conversion

A loose end we'll need cleared up soon. See guide: Python String

What is the difference between 123 and '123'? How do they work with the + operator?

>>> a = 123
>>> b = 5
>>> a + b
128
>>> 
>>> a = 'hi'
>>> b = 'there'
>>> a + b
'hithere'
>>> 
>>> # e.g. line is out of a file - a string
>>> # convert str form to int
>>> line = '123\n'
>>> int(line)
123
>>> 
>>> # works the other way too
>>> str(123)
'123'
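Putting the pieces together, here is a short sketch of why the conversion matters. The + operator adds two ints and concatenates two strings, but mixing a str and an int is an error. The variable names are made up for illustration.

```python
line = '123\n'          # e.g. a line read from a file

n = int(line)           # 123, int() tolerates the trailing \n
total = n + 5           # 128, + on two ints adds
text = str(n) + '5'     # '1235', + on two strings concatenates

# Mixing the two types with + raises TypeError.
try:
    bad = '123' + 5
    mixed_ok = True
except TypeError:
    mixed_ok = False    # this branch runs
```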