Today: realistic challenging parse/loop examples, while-True if-break

Iowa Caucus App Debacle

Ripped from the headlines: Iowa App Debacle. Lots of press about a failed software project. What is the lesson here? Have more time to finish? That is not the real insight.

Iron Triangle of Project Planning

A project has 3 dimensions. You specify 2, and then the third is determined by the iron triangle. Or in short, "pick 2". There are various phrasings, but here is a version with terms for a software project: features, time, quality.

Suppose you have a long, neat list of features, and not enough time. What happens? Quality comes in terrible.

Same scenario, but you are the decision maker, and it's vital that quality is high. What do you do? Cut down the feature list. Make the list of features more lame, but with a chance of good quality. Put another way: there is a tradeoff between features and quality when time is fixed.

In the Iowa case, I've read that it was supposed to OCR pictures of the voting sheets. Well, if the schedule slips and there's only a month left, drop that feature and just do the critical pieces. Process the pictures by hand or something later, or maybe there are no pictures at all. Triage down what needs to get done with a clear eye on the schedule.

Leadership is owning the decision to make a less-featured, more lame app in order to hit the time/quality benchmarks. It's a trap to keep the feature list up and pray that the quality will break your way this time. That way lie the rocky shoals of software project disaster!

Strings and Return - Q1

We have written many black-box functions: input = parameters, output = return. Return is how all functions work, even built-in ones like str.upper(). How to use the return value? Consider these lines of code. What is s after this runs?

s = 'Hello'
s.upper()
# what is s now?

Solution

s is still 'Hello', it was never changed

# s.upper() **returns** the new value, but the code does not use it
# so it is dropped. That return values are dropped if not used is an
# unintuitive aspect of code

Change s - Q2

The function call, e.g. s.upper(), returns a result. The calling code needs to catch or use that result somehow.

Q2: How to write the above code to change s to be the upper case form?

Solution

s = s.upper()

# or could use a different variable to hold the changed form, leaving
# s with its original value
up = s.upper()

Last Time: all_brackets

From last time: try the all_brackets() function.

> list/parse problems

Let's work out the code for all_brackets

Input case: 'xx[abc]xx[42]xx' -> ['abc', '42']

What is the code for all_brackets, starting with the following code? Standard CS106A steps - diagram an input case, introduce vars left + right on the diagram to work out the details. Run it.

def all_brackets(s):
    search = 0
    result = []
    while search < len(s):
        left = s.find('[', search)
        if left == -1:
            break

        # Your code here





    return result

all_brackets() Solution

def all_brackets(s):
    search = 0
    result = []
    while search < len(s):
        left = s.find('[', search)
        if left == -1:
            break
        # Find the right bracket after left
        right = s.find(']', left)  # or left+1
        if right == -1:
            break
        result.append(s[left + 1:right])
        # Key: set up var at end of loop
        search = right  # or right+1
    return result

Suppose while search < len(s): Was Broken

Edge Cases - Two Skills

search = right + 1 ?
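
One way to explore both questions is a small experiment. The sketch below is not the lecture's solution: all_brackets_variant and its skip parameter are hypothetical names introduced here. It uses while True instead of while search < len(s), relying on s.find() returning -1 when the start index is at or past the end of the string, and it lets the caller choose between search = right and search = right + 1.

```python
def all_brackets_variant(s, skip):
    """Hypothetical variant of all_brackets for experimenting.

    skip=0 ends each iteration with search = right,
    skip=1 ends each iteration with search = right + 1.
    """
    search = 0
    result = []
    while True:
        # find() returns -1 if search is at or past the end of s,
        # so the if/break below is what ends the loop.
        left = s.find('[', search)
        if left == -1:
            break
        right = s.find(']', left + 1)
        if right == -1:
            break
        result.append(s[left + 1:right])
        search = right + skip
    return result


print(all_brackets_variant('xx[abc]xx[42]xx', 0))
print(all_brackets_variant('xx[abc]xx[42]xx', 1))
```

Both settings appear to give the same list, since the char at index right is ']' and can never match the next search for '['.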


Data and Parsing

Here's some fun looking data...

$GPGGA,005328.000,3726.1389,N,12210.2515,W,2,07,1.3,22.5,M,-25.7,M,2.0,0000*70
$GPGSA,M,3,09,23,07,16,30,03,27,,,,,,2.3,1.3,1.9*38
$GPRMC,005328.000,A,3726.1389,N,12210.2515,W,0.00,256.18,221217,,,D*78
$GPGGA,005329.000,3726.1389,N,12210.2515,W,2,07,1.3,22.5,M,-25.7,M,2.0,0000*71
$GPGSA,M,3,09,23,07,16,30,03,27,,,,,,2.3,1.3,1.9*38
$GPRMC,005329.000,A,3726.1389,N,12210.2515,W,0.00,256.18,221217,,,D*79
$GPGGA,005330.000,3726.1389,N,12210.2515,W,2,07,1.3,22.5,M,-25.7,M,3.0,0000*78
$GPGSA,M,3,09,23,07,16,30,03,27,,,,,,2.3,1.3,1.9*38
...
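
As a rough illustration of what parsing this data could look like, here is a minimal sketch that pulls the comma-separated fields out of one line with the same find-and-slice loop used for all_brackets. The helper names comma_fields and to_degrees are hypothetical, and the field meanings (latitude as ddmm.mmmm in field 2, hemisphere in field 3, longitude as dddmm.mmmm in field 4, hemisphere in field 5) are an assumption based on the common NMEA GGA layout, not something stated in these notes.

```python
def comma_fields(line):
    """Return the list of substrings of line separated by commas."""
    fields = []
    start = 0
    while True:
        comma = line.find(',', start)
        if comma == -1:
            # No more commas: the rest of the line is the last field
            fields.append(line[start:])
            break
        fields.append(line[start:comma])
        start = comma + 1
    return fields


def to_degrees(ddmm, hemisphere):
    """Convert a 'ddmm.mmmm' style string to signed decimal degrees.

    Assumption: the 2 digits just before the '.' are whole minutes,
    and everything to their left is degrees.
    """
    dot = ddmm.find('.')
    degrees = int(ddmm[:dot - 2])
    minutes = float(ddmm[dot - 2:])
    value = degrees + minutes / 60
    if hemisphere == 'S' or hemisphere == 'W':
        value = -value
    return value


line = '$GPGGA,005328.000,3726.1389,N,12210.2515,W,2,07,1.3,22.5,M,-25.7,M,2.0,0000*70'
fields = comma_fields(line)
lat = to_degrees(fields[2], fields[3])
lon = to_degrees(fields[4], fields[5])
print(fields[0], lat, lon)
```

Note that empty fields, like the run of commas in the $GPGSA lines, come out as empty strings rather than being skipped.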

Advancing Through String With var += 1

at_words() Example Functions

Series of very real-world string algorithms

> parse examples

at_words(s)

at_words(s): For each '@' in s, parse out the "word" substring of 1 or more alphabetic chars which immediately follow the '@', so '@abc @ @xyz' returns ['abc', 'xyz'].

at_words() #1 Find End of Alpha Chars

    end = at + 1
    while s[end].isalpha():
        end += 1

at_words() Bottom of loop

    word = s[at + 1:end]
    result.append(word)
    search = end

at_words() Bug #1
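
A sketch of the bug, assuming it is the unguarded s[end] in the first-draft loop: when the alpha chars run all the way to the end of the string, end reaches len(s) and s[end] raises IndexError. The two helper names below are hypothetical, introduced only to put the two loops side by side.

```python
def find_end_unsafe(s, at):
    """First-draft loop: no check that end is a valid index."""
    end = at + 1
    while s[end].isalpha():
        end += 1
    return end


def find_end_guarded(s, at):
    """Same loop with the guard end < len(s) checked first."""
    end = at + 1
    while end < len(s) and s[end].isalpha():
        end += 1
    return end


# Works when a non-alpha char follows the word
print(find_end_unsafe('@abc xyz', 0))

# Word runs to the end of the string: the unguarded loop fails
try:
    find_end_unsafe('x @abc', 2)
except IndexError:
    print('IndexError: end ran off the end of the string')

print(find_end_guarded('x @abc', 2))
```

The order of the two tests matters: end < len(s) goes first, so that "and" short-circuits before s[end] is evaluated with a bad index.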

at_words() Case #2 - Zero Chars
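
A small sketch of the zero-chars case, using the lone '@' in the middle of '@abc @ @xyz'. The helper word_after_at is a hypothetical name used only for this illustration. When no alpha char follows the '@', the end loop does not advance at all, and the slice comes out as the empty string.

```python
def word_after_at(s, at):
    """Return the alpha chars immediately after index at (may be '')."""
    end = at + 1
    while end < len(s) and s[end].isalpha():
        end += 1
    return s[at + 1:end]


s = '@abc @ @xyz'
lone = s.find('@', 5)             # index of the lone '@'
print(repr(word_after_at(s, 0)))
print(repr(word_after_at(s, lone)))
```

So the slice itself is safe in this case; the code just needs to avoid appending a length-0 word to the result.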

at_words() Observations - Want vs. Don't-Want

at_words() Solution

def at_words(s):
    search = 0
    words = []
    while True:
        at = s.find('@', search)
        if at == -1:
            break
            
        # Pass over alpha chars to find end
        end = at + 1
        while end < len(s) and s[end].isalpha():
            end += 1
        
        word = s[at + 1:end]
        # Screen out len-0 word
        if len(word) > 0:
            words.append(word)
        
        # Set up next iteration
        search = end
    return words

exclaim_words()

exclaim_words(s): For each '!' in s, parse out the "word" substring of one or more alphabetic chars which are immediately to the left of the '!'. Return a list of all such words including the '!', so 'x hey!@ho!' returns ['hey!', 'ho!']. (Like at_words, but right-to-left)
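
Here is one possible sketch of exclaim_words(), mirroring the at_words() structure but walking leftwards from each '!'. It is an illustration under the problem statement above, not an official solution; the variable names exclaim and start are choices made here.

```python
def exclaim_words(s):
    search = 0
    words = []
    while True:
        exclaim = s.find('!', search)
        if exclaim == -1:
            break

        # Move start left over alpha chars; guard start >= 0
        # so s[start] is never checked with a negative index.
        start = exclaim - 1
        while start >= 0 and s[start].isalpha():
            start -= 1

        # start is now one to the left of the first alpha char
        word = s[start + 1:exclaim + 1]
        # Screen out a bare '!' with no alpha chars before it
        if len(word) > 1:
            words.append(word)

        # Set up next iteration, just past this '!'
        search = exclaim + 1
    return words


print(exclaim_words('x hey!@ho!'))
```

Here search must move to exclaim + 1 rather than exclaim, since otherwise find() would locate the same '!' again and loop forever.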

parse_words()

parse_words(s): Given a string s, parse out and return all "words", where a word is made of 1 or more adjacent alphabetic chars, so '^ abc xyz$' returns ['abc', 'xyz'].

parse_words() Solution

def parse_words(s):
    search = 0
    words = []
    while True:
        # Find the first alpha char (note the "not" in the loop test)
        begin = search
        while begin < len(s) and not s[begin].isalpha():
            begin += 1
        
        # No alphas found -> done
        if begin >= len(s):
            break
        # True here: s[begin] is first alpha
        
        # Move end past the group of alphas
        end = begin + 1
        while end < len(s) and s[end].isalpha():
            end += 1
        
        # Now we know where it is
        word = s[begin:end]
        words.append(word)
        search = end
        # or end + 1
    return words

More Practice Problems

As a first goal, you should be able to solve the above 4 problems, which are each very solid programming challenges: all_brackets(), at_words(), exclaim_words(), parse_words()

These all have a common while-True, find-begin, find-end pattern to them.

Later more difficult exercises at the above url: max_words(), parse_words99()