Today: find()/slice practice, text files, file-reading, crazycat example program

## find() + slice exercises, no loops

• No Loops: here .find() does the looping for us
• Sketch diagram, plot out index numbers, slices
• Work 1 or 2 of these in lecture, can try them on your own
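As a warm-up, here is a minimal sketch of the find-then-slice pattern with no loops. The function name `after_colon` and its inputs are made up for illustration and are not one of the lecture exercises.

```python
def after_colon(s):
    """
    Return the chars after the first ':' in s, or '' if there is no ':'.
    Hypothetical warm-up function, not one of the lecture exercises.
    """
    colon = s.find(':')   # index of the first ':', or -1 if not found
    if colon == -1:
        return ''
    return s[colon + 1:]  # slice from just after the ':' to the end


print(after_colon('name:alice'))  # alice
print(after_colon('no colon'))    # prints an empty line
```

Note that .find() does the searching, so the function body needs no loop of its own.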

> find() + slice exercises

## Aside: Mental Model Skill

• We're living on this spectrum:
program-goal .. algorithm .. code .. run-output
• Normal situation: run-output is wrong, stare at code
• Aka debugging
• Not the best:
Randomly add +1 -1 in code, in hope it works
The "paramecium" strategy .. trying every random direction
• Better:
Look at the code
Think through a mental model of what lines do
e.g. "here x must point to the left bracket in s"
Make a diagram
• Looking at a line and thinking through what it does is a skill
• Practice that skill
• Demonstrated on the problems below

## Aside: Off By One Error

Int indexing into something is very common in computer code. So of course doing it slightly wrong is very common as well. So common, there is a phrase for it: "off by one error" or OBO. It even has its own Wikipedia page. You can feel some kinship with other programmers each time you stumble on one of these.
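A small illustration of how an off by one error can look with slices. The string `'x@abc'` is just a made-up example input.

```python
s = 'x@abc'
at = s.find('@')          # at is 1

# Off by one: this slice starts ON the '@' instead of just after it
wrong = s[at:at + 3]      # '@ab'

# Correct: start at at + 1, end at at + 4 (the end index is not included)
right = s[at + 1:at + 4]  # 'abc'

print(wrong, right)
```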

"This code is perfect! Why is this not working. Why is this not work ... oh, off by one error. We meet again."

## right_left()

`'aabb' -> 'bbbbaaaa'`

We'll say the midpoint of a string is index len // 2, dividing the string into a left half before the midpoint and a right half starting at the midpoint. Given string s, return a new string made of 2 copies of right followed by 2 copies of left. So 'aabb' returns 'bbbbaaaa'.

Solution

```
def right_left(s):
    mid = len(s) // 2
    left = s[:mid]
    right = s[mid:]
    return right + right + left + left
    # Style comparison:
    # Without using any variables, the solution
    # is longer and not so readable:
    # return s[len(s) // 2:] + s[len(s) // 2:] + s[:len(s) // 2] + s[:len(s) // 2]
```

Notice the decomp-by-var strategy: break the computation into smaller, named parts. A big improvement.

Lesson: if you have a line you cannot get working, can you divide it into a few lines, each a sub-part? This is our old divide-and-conquer strategy, working at the level of lines.
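To see how the midpoint definition plays out, here is a quick check of right_left() on a few inputs, including odd-length and empty strings. The function is repeated so the snippet runs on its own; the test inputs beyond `'aabb'` are made up for illustration.

```python
def right_left(s):
    mid = len(s) // 2
    left = s[:mid]
    right = s[mid:]
    return right + right + left + left


print(right_left('aabb'))  # bbbbaaaa
# Odd length: mid is 3 // 2 = 1, so left is 'a' and right is 'bc'
print(right_left('abc'))   # bcbcaa
# Length 1: mid is 0, so left is '' and right is 'a'
print(right_left('a'))     # aa
```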

## at_3()

This looks simple, but the details are tricky. Make a drawing.

Given string s. Find the first '@' within s. Return the len-3 substring immediately following the '@'. Except, if there is no '@' or there are not 3 chars after the @, return ''.

Suggestion: what is the index of the last char we want to pull out? If that index is beyond the valid chars in s, then the string is not long enough and we return the empty string.

Solution

```
def at_3(s):
    at = s.find('@')
    if at == -1:
        return ''
    # Is at + 3 past end of string?
    # Could "or" combine with above
    if at + 3 >= len(s):
        return ''
    return s[at + 1:at + 4]
    # Working out >= above ... drawing!
```
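In place of a drawing, the boundary cases can be worked through in code. This sketch repeats at_3() so it runs on its own; the inputs are made up for illustration. It also shows why the length check is needed: a slice quietly stops at the end of the string rather than raising an error.

```python
def at_3(s):
    at = s.find('@')
    if at == -1:
        return ''
    if at + 3 >= len(s):
        return ''
    return s[at + 1:at + 4]


# 'x@abc': at is 1, last wanted index is 4, len is 5 -> index 4 is valid
print(at_3('x@abc'))  # abc

# 'x@ab': at is 1, last wanted index is 4, len is 4 -> 4 >= 4, too short
print(at_3('x@ab'))   # prints an empty line

# Without the check, the slice would return a too-short result
too_short = 'x@ab'[2:5]
print(too_short)      # ab
```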

## parens

This is a nice, realistic string problem with a little logic in it.

s.find() variant with 2 params: `s.find(target, start_index)` - start search at start_index vs. starting search at index 0.

Given string s. Look for a '(.....)' within s - look for the first '(' in s, then the first ')' after the '(', using the second start_index parameter of .find(). If both parens are found, return the chars between them, so `'xxx(abc)xxx'` returns `'abc'`. If no such pair of parens is found, return the empty string. Think about the input `'))(abc)'`

Solution

```
def parens(s):
    left = s.find('(')
    if left == -1:
        return ''
    # Start search at left + 1:
    right = s.find(')', left + 1)
    if right == -1:
        return ''
    # Use slice to pull out chars between left/right
    return s[left + 1:right]
```
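The input `'))(abc)'` shows why the start_index parameter matters. This sketch repeats parens() so it runs on its own, then compares a search for ')' from index 0 against a search starting after the '('. The variable names `naive_right` and `good_right` are made up for illustration.

```python
def parens(s):
    left = s.find('(')
    if left == -1:
        return ''
    right = s.find(')', left + 1)
    if right == -1:
        return ''
    return s[left + 1:right]


s = '))(abc)'
left = s.find('(')                  # 2
naive_right = s.find(')')           # 0, a ')' that comes before the '('
good_right = s.find(')', left + 1)  # 6, the first ')' after the '('

naive_result = s[left + 1:naive_right]  # s[3:0] is the empty string
good_result = s[left + 1:good_right]    # s[3:6] is 'abc'

print(parens(s))  # abc
```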

## print(), Files, Standard Out

See guide: Python Print for details about print() and standard out.

## Standard Out

• Text area associated with run of a program
• print() function sends data to it
• Standard out appears in the terminal directly
• See Python Print

## crazycat example

We'll use the crazycat example to demonstrate files, file-processing, and printing.

## What is a Text File?

• "text file", aka "plain text file"
• Extremely common way to store/exchange data on computers
• Very old (teletype) .. and used up through today
• Text file is a series of lines
• Each line is a series of chars ending with a `'\n'` char
• Special char: `'\n'` is called the "newline" char
• `'\n'` is like hitting the "return" or "enter" key on your keyboard
• Aside: a few other chars can appear instead of `'\n'`
The sequence `'\r\n'` (DOS/Windows) or `'\r'` (very old Mac)
Mostly Python insulates you from the exact line-ending of a text file

## Backslash Chars in a String

Use backslash \ to include special chars in a string

```
s = 'isn\'t'
# or use double quotes
# s = "isn't"

# \n  newline char
# \\  backslash char
# \'  single quote
# \"  double quote
```
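One detail worth checking in the interpreter: each backslash sequence is written with 2 characters in the source code, but it stands for a single char in the string. A short sketch, with made-up example strings:

```python
s = 'isn\'t'
print(s)               # isn't
print(len(s))          # 5, the \' counts as one char

newline_len = len('a\nb')   # 3 chars: 'a', newline, 'b'
backslash_len = len('\\')   # 1 char: a single backslash

print(newline_len, backslash_len)
```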

## hibye.txt Text Example

2 lines, each line has a `'\n'` at the end. The first line has a space, aka `' '`, between the two words.

```
Hi and
bye
```

Here is what that file looks like in an editor that shows little gray marks for the space and \n

In fact, the contents of that file can be expressed as a Python string:
`'Hi and\nbye\n'`

How many chars are in that file (each \n is one char)? Roman alphabet A-Z chars like this take up 1 byte per char. This comes to 11 chars. Look in your file-system explorer on your computer, get-info on the file. See if it's 11 bytes in size.

So when you send a 50 char text message .. that's about 50 bytes sent on the network + some overhead. Text uses very few bytes compared to sound or images or video.
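The char and byte counts above can be checked in Python. This is a small sketch that assumes the file is stored in UTF-8, where each of these plain A-Z chars, the space, and the newline take 1 byte each.

```python
text = 'Hi and\nbye\n'

num_chars = len(text)                  # 6 + 1 + 3 + 1 = 11 chars
num_bytes = len(text.encode('utf-8'))  # also 11, 1 byte per char here

# Split into lines, keeping the '\n' at the end of each line
lines = text.splitlines(keepends=True)

print(num_chars, num_bytes)
print(lines)
```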

## print() function

• Python print() function
• Prints text lines to the standard output area
• Takes a number of items, separated by commas
• Converts each item to string form
• Places a `'\n'` at the end of the line
• sep='xx' option - use to separate items
• end='xx' option - put this at end instead of '\n'
• use end='' to put nothing at the end
• Try print() in the interpreter, see its output there
```
>>> print('hello', 'there', '!')
hello there !
>>> print('hello', 123, '!')
hello 123 !
>>> print('hello', 123, '!', sep=':')
hello:123:!
>>> print(1, 2, 3)               # end='\n' the default
1 2 3
>>> print(1, 2, 3, end='xxx\n')  # end= what goes at end
1 2 3xxx
>>> print(1, 2, 3, end='')       # suppress the \n
1 2 3>>>
```
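To see exactly which chars print() produces, including the `'\n'`, one option is to send its output into a string instead of standard out. The sketch below uses the `file=` option of print() with an `io.StringIO` object; the helper name `printed_text` is made up for illustration.

```python
import io


def printed_text(*items, sep=' ', end='\n'):
    """
    Return the exact text that print() would produce for these items.
    Hypothetical helper for illustration only.
    """
    buf = io.StringIO()
    print(*items, sep=sep, end=end, file=buf)
    return buf.getvalue()


print(repr(printed_text('hello', 123, '!')))           # 'hello 123 !\n'
print(repr(printed_text('hello', 123, '!', sep=':')))  # 'hello:123:!\n'
print(repr(printed_text(1, 2, 3, end='')))             # '1 2 3'
```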

## Data out of function: return vs. print

Return and print() are both ways to get data out of a function, so they can be confused with each other. We will be careful when specifying a function to say that it should "return" a value (most common), or it should "print" something to standard output. Return is the most common way to communicate data out of a function, but below are some print examples.
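A minimal sketch of the difference, using two made-up functions `double_return` and `double_print`. The returned value can be stored and used by the caller, while the printed value only appears on standard out.

```python
def double_return(n):
    # Communicates the result back to the caller
    return n * 2


def double_print(n):
    # Sends the result to standard out, returns nothing (None)
    print(n * 2)


a = double_return(5)  # a is 10, nothing is printed
b = double_print(5)   # prints 10, but b is None

print(a, b)           # 10 None
```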

## 1. Try "ls" and "cat" in terminal

Open a terminal in the crazycat directory (see the Command Line guide for more information on running in the terminal). These terminal commands work on both Mac and Windows. When you type a command in the terminal, you are typing the command directly to the operating system that runs your computer: Mac OS, Windows, or Linux.

`pwd` - print out what directory we are in

`ls` - see list of filenames ("dir" on older Windows)

`cat *filename*` - see file contents ("type" on older Windows)

```
$ ls
alice.txt       crazycat.py     hibye.txt       poem.txt        quotes
$ cat poem.txt
Roses Are Red
Violets Are Blue
This Does Not Rhyme
$
```

## 2. Run crazycat.py with filename

• It does "cat" but implemented in Python
Demonstrating how to read lines of a text file and print them out
• Use "tab" to autocomplete filenames
• The standard out of a program is printed to the terminal
Each print() just shows up here
```
$ python3 crazycat.py poem.txt
Roses Are Red
Violets Are Blue
This Does Not Rhyme
$ python3 crazycat.py hibye.txt
Hi and
bye
$
```

Here is the canonical file-reading code:

```
with open(filename) as f:
    for line in f:
        # use line in here
```

Visualization of how the variable "line" behaves for each iteration of the loop:

• Read series of lines of a file
• for loop - treats file like a collection of line strings
Each run of the loop body gets the next line of text
e.g. 4 line file = loop body runs 4 times
• Memory efficient
Only holds one line in memory at a time
• Each line string has the `'\n'` at its end
• other forms of open():
`open(filename)` - open for reading
`open(filename, 'r')` - same as above, 'r' denotes reading
`open(filename, 'w')` - open for writing
`open(filename, encoding='utf-8')` - specify unicode encoding (later)
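The points above can be tried out with a self-contained sketch. It assumes nothing about the crazycat directory: it writes its own small 2-line file into a temporary directory using `open(filename, 'w')`, then reads it back with the standard loop. The temporary directory and the `lines` list are only there for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'hibye.txt')

    # 'w' opens the file for writing
    with open(filename, 'w') as f:
        f.write('Hi and\nbye\n')

    # Default mode opens the file for reading
    lines = []
    with open(filename) as f:
        for line in f:
            # Loop body runs once per line, here 2 times
            lines.append(line)

# Each line string has the '\n' at its end
print(lines)  # ['Hi and\n', 'bye\n']
```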

## print_file_plain()

Here is the working function to print the contents of a file. Why do we need `end=''` here? The line already has `\n` at its end, so we would get double spacing if print() added its standard `\n`.

```
def print_file_plain(filename):
    with open(filename) as f:
        for line in f:
            # use line in here
            print(line, end='')
```
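The double spacing can be seen by capturing what print() sends to standard out. This sketch does not read a real file: the list `lines` stands in for what the file loop yields for hibye.txt, and the helper name `capture` is made up for illustration.

```python
import contextlib
import io

lines = ['Hi and\n', 'bye\n']  # stand-in for the lines of hibye.txt


def capture(end):
    """Print each line with the given end=, return all the printed text."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        for line in lines:
            print(line, end=end)
    return buf.getvalue()


plain = capture('')      # matches the file contents exactly
doubled = capture('\n')  # extra '\n' after each line, so double spaced

print(repr(plain))    # 'Hi and\nbye\n'
print(repr(doubled))  # 'Hi and\n\nbye\n\n'
```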

## Optional: Run With -crazy Command Line Option

• Note: main() function looks for '-crazy' string
We will learn how that works soon
• Look at the code for the crazy_line() function
• Swaps upper/lower
• String/loop type code we've done before
• Returns computed result
Normal black-box design
Uses return (not print)
Has Doctests
```
def crazy_line(line):
    """
    Given a line of text, returns a "crazy" version of that line,
    where upper/lower case have all been swapped, so 'Hello'
    returns 'hELLO'.
    >>> crazy_line('Hello')
    'hELLO'
    >>> crazy_line('@xYz!')
    '@XyZ!'
    >>> crazy_line('')
    ''
    """
    result = ''
    for i in range(len(line)):
        char = line[i]
        if char.islower():
            result += char.upper()
        else:
            result += char.lower()
    return result
```
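One detail to notice: a line read from a file still has its `'\n'` at the end, and crazy_line() passes that char through unchanged, since `'\n'` has no upper or lower case form. The sketch below repeats the function so it runs on its own. The comparison with the built-in `str.swapcase()` is only checked for this one plain A-Z input, not claimed for all text.

```python
def crazy_line(line):
    result = ''
    for i in range(len(line)):
        char = line[i]
        if char.islower():
            result += char.upper()
        else:
            result += char.lower()
    return result


line = 'Roses Are Red\n'   # as it would come out of the file loop
crazy = crazy_line(line)

print(repr(crazy))         # 'rOSES aRE rED\n'

# For this plain A-Z input, the built-in swapcase() gives the same result
same_as_builtin = (crazy == line.swapcase())
```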

Here is the command line to run with the -crazy option

```
$ python3 crazycat.py -crazy poem.txt
rOSES aRE rED
vIOLETS aRE bLUE
tHIS dOES nOT rHYME
```
• Provided main() has if-logic that looks for '-crazy'
If found, it calls print_file_crazy() below
• Key line in there:
`print(crazy_line(line), end='')`

Here is print_file_crazy(), similar to print_file_plain() but passes each line through the crazy_line() function before printing.

```
def print_file_crazy(filename):
    """
    Given a filename, read all its lines and print them out
    in crazy form.
    """
    with open(filename) as f:
        for line in f:
            print(crazy_line(line), end='')
```

## int str Types and Conversion

A loose end we'll need cleared up soon. See guide: Python String

What is the difference between 123 and '123'? How do they work with the `+` operator?

• "type" of a value is its formal category, e.g. int or string
• `123` is an integer, type is `int`
• `'123'` is a string, a series of chars, type is `str`
• Values in Python are tagged by their type
• Operators like `+` use this type information
• Type name also works for conversion
`int()` and `str()`
```
>>> a = 123
>>> b = 5
>>> a + b
128
>>>
>>> a = 'hi'
>>> b = 'there'
>>> a + b
'hithere'
>>>
>>> # e.g. line is out of a file - a string
>>> # convert str form to int
>>> line = '123\n'
>>> int(line)
123
>>>
>>> # works the other way too
>>> str(123)
'123'
```
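A short sketch pulling these points together: `+` behaves differently for int and str, mixing the two types is an error, and int() is the bridge when numbers arrive as text. The `lines` list is made-up data standing in for lines read from a file.

```python
str_sum = '123' + '5'  # str + str puts the chars together: '1235'
int_sum = 123 + 5      # int + int does arithmetic: 128

# Mixing str and int with + is an error
try:
    bad = '123' + 5
    mixed_ok = True
except TypeError:
    mixed_ok = False

# int() accepts the trailing '\n' that file lines have
lines = ['10\n', '20\n', '12\n']  # made-up lines, as if read from a file
total = 0
for line in lines:
    total += int(line)

print(str_sum, int_sum, mixed_ok, total)  # 1235 128 False 42
```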