July 7th, 2021
Today: string in, string upper/lower, if/elif, string .find(), slices
See chapters in Guide: String - If
in
Test>>> 'c' in 'abcd' True >>> 'bc' in 'abcd' # works for multiple chars True >>> 'bx' in 'abcd' False >>> 'A' in 'abcd' # upper/lower are different False
Python has many built-in functions, and we will see all the important ones in CS106A. You want to know the common built-in functions, since using a built-in is far preferable to writing code for it yourself - "in" is a nice example. The "in" test works for several data structure to see if a value is in there. This is our first example with strings.
has_pi(s): Given a string s, return True if it contains the substrings '3' and '14' somewhere within it, but not necessarily together. Use "in".
> has_pi()
Note these functions are in the string-3 section on the Experimental server
This form works in English but not in Python:
if '3' and '14' in s: # NO does not work ...
The and should connect two fully formed boolean tests, such as you would write with "in" or "==", so this works
if '3' in s and '14' in s: ...
>>> 'a'.isalpha() True >>> 'cat'.isalpha() True >>> '5'.isalpha() False >>> '5'.isdigit() True >>> '@'.isalpha() False
>>> 'Kitten123'.upper() # return with all chars in upper form 'KITTEN123' >>> 'Kitten123'.lower() 'kitten123' >>> >>> 'a'.islower() True >>> 'A'.islower() False >>> 'A'.isupper() True >>> '@'.islower() False >>> 'a'.upper() 'A' >>> 'A'.upper() 'A' >>> '@'.upper() '@' >>> s = 'Hello' >>> s.upper() # Returns uppercase form of s 'HELLO' >>> >>> s # Original s unchanged 'Hello' >>>
'12abc34' -> 'ABC'
Given string s. Return a string made of all the alphabetic chars in s, converted to uppercase form.
Use string functions .isalpha() and .upper()
Solution
def alpha_up(s): result = '' for i in range(len(s)): if s[i].isalpha(): result += s[i].upper() return result
> catty()
'xCtxxxAax' -> 'CtAa'
Return a string made of the chars from the original string, whenever the chars are one of 'c' 'a' 't'
, (either lower or upper case). So the string 'xaCxxxTx'
returns 'aCT'
. (Had an earlier version of this function that was case-sensitive.)
Here is a natural way to think of the code, but it does not work:
def catty(s): result = '' for i in range(len(s)): if s[i] == 'c' or s[i] == 'a' or s[i] == 't': result += s[i] return result
What is the problem? Upper vs. lower case. We are not getting any uppercase chars 'C'
for example.
Solution: convert each char to lowercase form .lower()
, then test.
Solution - this works, but that if-test is ugly
def catty(s): result = '' for i in range(len(s)): if s[i].lower() == 'c' or s[i].lower() == 'a' or s[i].lower() == 't': result += s[i] return result
Create variable to hold the lengthy computation - shorter code and "reads" better.
low = s[i].lower()
The complete solution
def catty(s): result = '' for i in range(len(s)): low = s[i].lower() # decomp by var if low == 'c' or low == 'a' or low == 't': result += s[i] return result
Style aside: the name of a variable should remind us of the role of that data in the local code. Other than that, the name can be short. The name does not need to repeat every true thing about the value. Just enough to distinguish it from other values in this algorithm.
Good names, short but including essential details: low
, low_char
Names with more detail, probably too long: low_char_i
, low_char_in_s
The V2 code above is acceptable, but V3 is shorter and nicer. The V3 code also runs faster, as it does not compute the lowercase form three times per char.
if test1: action-1 elif test2: action-2 else: action-3
str_adx(s): Given string s. Return a string of the same length. For every alphabetic char in s, the result has an 'a', for every digit a 'd', and for every other type of char the result has an 'x'. So 'Hi4!x3' returns 'aadxad'. Use an if/elif structure.
Solution
def str_adx(s): result = '' for i in range(len(s)): if s[i].isalpha(): result += 'a' elif s[i].isdigit(): result += 'd' else: result += 'x' return result
>>> s = 'Python' >>> s.find('t') 2 >>> s.find('th') 2 >>> s.find('n') 5 >>> s.find('x') -1 >>> s.find('N') -1
>>> s = 'Python' >>> s[1:3] # 1 .. UBNI 'yt' >>> s[1:5] 'ytho' >>> s[4:5] 'o' >>> s[4:4] # "not including" dominates ''
>>> s[:3] # omit = from/to end 'Pyt' >>> s[4:] 'on' >>> s[4:999] # too big = through the end 'on' >>> s[:4] # "perfect split" on 4 'Pyth' >>> s[4:] 'on' >>> s[:] # the whole thing 'Python'
A first venture into using index numbers and slices. Many problems work in this domain - e.g. extracting all the hashtags from your text messages.
'cat[dog]bird' -> 'dog'
> brackets
Solution
def brackets(s): left = s.find('[') if left == -1: return '' right = s.find(']') # Use slice to pull out chars between left/right # make a drawing! return s[left + 1: right]
The variables left
and right
make this code more readable. They work naturally in the drawing and in the code, naming an important intermediate value that runs through the computation.
Below is what the code looks like without the variable. It works fine, and it's one line shorter, but the readability is a worse. It also likely runs a little slower, as it computes the left-bracket index twice.
def brackets(s): if s.find('[') == -1: return '' return s[s.find('[') + 1: s.find(']')]
Our brackets solution with its variables looks better.
Int indexing into something is extremely common in computer code. So of course doing it slightly wrong is very common as well. So common, there is a phrase for it - "off by one error" or OBO — it even has its own wikipedia page. You can feel some kinship with other programmers each time you stumble on one of these.
"My code is perfect! Why is this not working? Why is this not work ... oh, off by one error. We meet again!"
Here is a problem similar to brackets for you to try. If we have enough time in lecture, we'll do it in lecture. A drawing really helps the OBO on this one.
> at_3
>>> s = 'Python' >>> s[len(s)-1] 'n' >>> s[-1] # -1 is the last char 'n' >>> s[-2] 'o' >>> s[-3] 'h' >>> s[1:-3] # works in slices too 'yt' >>> s[-3:] 'hon'