Section #3: Strings & File Reading

January 23rd, 2022

Written by Juliette Woodrow, Anna Mistele, John Dalloul, and Parth Sarin

String Slicing

For both parts of this problem, you can test your answers quickly in the Python interpreter. This is a good warmup problem for the rest of section.

Part 1

          
    s = 'PythonTime'
         0123456789

How would you slice the string to receive the following results?

'ython'
'Py'
'Tim'
'Time'
'T'
'PythonTime'
'PYTHONTIME'
'pythontime'
'e'
'ime'

Remember, strings in Python are 0-indexed. In addition, the slice s[1:8] is inclusive of the first index and exclusive of the second (that is, it starts at index 1 and goes up to, but does not include, index 8, giving 'ythonTi'). A neat feature (advanced, so don't worry too much about it for now) is that you can use negative indices to get letters from the back of the string. For example, the slice s[-2:] starts at the second-to-last character of the string and goes to the end of the string, so it is 'me'.
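The slicing rules described above can be checked directly in the interpreter. This is a minimal sketch that reuses the same string s and only the slices already mentioned in the paragraph, plus two single-index lookups:

```python
s = 'PythonTime'

# The start index is included, the end index is excluded.
print(s[1:8])    # ythonTi

# Omitting the end index runs to the end of the string.
print(s[-2:])    # me

# A single index gives one character; negative indices count from the back.
print(s[0])      # P
print(s[-3])     # i
```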

Part 2

          
    s = 'yyy #xyz&ttt {CS106A} ^goodTimes !'
         0123456789

How would you use .find and a slice on a string in the above format to get the following results?

The substring between the '#' and '&' chars. In the above example it would be: 'xyz'
The substring starting at the '#' char and ending at the '&' char. In the above example it would be: '#xyz&'
The substring between the curly braces. In the above example it would be: 'CS106A'
The substring between the curly braces converted to lowercase. In the above example it would be: 'cs106a'
The substring starting after the '^' char to the end of the string converted to lowercase. In the above example it would be: 'goodtimes !'
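The general pattern for these questions is to locate the delimiters with .find and then slice between the indices it returns. The sketch below shows that pattern on a different, made-up string with '<' and '>' as delimiters (both the string and the variable names are assumptions for illustration, not part of the problem):

```python
text = 'aaa <Karel> bbb'   # hypothetical example string, not the one above

left = text.find('<')      # index of '<', which is 4
right = text.find('>')     # index of '>', which is 10

inside = text[left + 1:right]        # excludes both delimiters
with_delims = text[left:right + 1]   # includes both delimiters

print(inside)           # Karel
print(with_delims)      # <Karel>
print(inside.lower())   # karel
```

Because the indices come from .find rather than being typed in by hand, the same code works for any string in the same format.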

String Searching

Implement the following functions:

bracket_money(s): If the string parameter contains one open curly brace '{' character with a closing curly brace '}' character after it, return the int version of the string between the two curly braces. If the string has only one curly brace, or the closing curly brace is not after the opening curly brace, return the empty string. You can assume that if there is an open curly brace followed by a closing curly brace, the substring inside these braces can be converted to an integer. Some examples:
- bracket_money('xxx yyy {100} zzz') would return 100
- bracket_money('xxx yyy {100 zzz') would return ''
- bracket_money('xxx yyy }100{ zzz') would return ''
defront(s) (Optional): If the string parameter has more than 1 character, return it without its first two characters. Otherwise, return the original string. What would we expect this function to return if the string is exactly 2 characters long?
x_end(s) (Optional): If the string parameter contains the 'x' character, return the string, from the first 'x' to the end of the string. Otherwise, return an empty string. For example, calling x_end('excited') would return the string 'xcited'.
is_valid_password(s) (Optional): Returns True if the string s meets the following criteria, and False otherwise. Password Criteria:
- Contains the special characters ! and &, in that order (but characters may occur before, in between, and after them)
- Contains explicitly more characters before the ! than after the &
Yes, these criteria are as arbitrary and annoying as actual password requirements
at_words(s) (Optional): If the string parameter contains 2 or more '@' characters, return the substring between the first two such characters. Otherwise, return the empty string. For example, calling at_words('xx@hello@xx') returns the string 'hello'.
You might find the 2-parameter version of the find method, s.find(target, start), useful. It returns the index in s of the first instance of the string target, searching only the range s[start:len(s)], or -1 if target does not appear in that range.
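To see how the start parameter of find changes the search, here is a small sketch on a made-up string (the string and variable names are assumptions for illustration only):

```python
s = 'a-bc-def-g'   # hypothetical example string

first = s.find('-')               # 1: the first '-' in the whole string
second = s.find('-', first + 1)   # 4: the search resumes just after the first '-'
missing = s.find('?')             # -1: the target does not occur at all

print(first, second, missing)     # 1 4 -1
print(s[first + 1:second])        # bc
```

Checking for -1 before slicing is what lets a function return the empty string when a character is missing.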

File Reading

Py-Libs

Write a function, create_pylib(filename), that reads a PyLib (like a Madlib but with help from you and your Python skills) from the given filename and prints out a completed story! For each line in the file, replace the bracketed category (like [noun]) with a word from that category. We've provided you with a get_word(category) function that, given a category, gives you back a random word of that category. There will be exactly one bracketed category per line. Print out the completed PyLib line by line.

Here is an example file to call your function on: story.txt

  When covid ends the first place I am traveling to is [location].
  I cannot wait to [verb] when I get there! 
  I will make sure to bring [noun].
  Until then, I will [adverb] await the trip.
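One possible way to approach this is sketched below. The provided get_word function is not shown in this handout, so the version here is a made-up stand-in with invented word lists, and fill_line is a hypothetical helper name. The sketch relies on the guarantee that each line has exactly one bracketed category.

```python
import random

# Hypothetical stand-in for the provided get_word(category) helper;
# these word lists are invented for illustration.
WORDS = {
    'location': ['Paris', 'the moon'],
    'verb': ['dance', 'nap'],
    'noun': ['a towel', 'snacks'],
    'adverb': ['patiently', 'eagerly'],
}


def get_word(category):
    return random.choice(WORDS[category])


def fill_line(line):
    """Replace the single [category] in line with a word of that category."""
    start = line.find('[')
    end = line.find(']')
    category = line[start + 1:end]
    return line[:start] + get_word(category) + line[end + 1:]


def create_pylib(filename):
    """Print each line of the file with its bracketed category filled in."""
    with open(filename) as f:
        for line in f:
            # strip() removes the trailing newline and surrounding spaces
            print(fill_line(line.strip()))
```

Calling create_pylib('story.txt') would then print four completed lines, with different words on each run because get_word picks at random.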

GameStop Trades (Optional)

Counting Orders

In this problem, you're going to explore how your new string parsing skills can help reveal insights into large amounts of data: in this case, stock data for GameStop trades. You'll also consider how these sorts of techniques can be used unfairly to limit equal access to financial resources.

Write a function, parse_trades(filename), that takes in the name of a file containing trades for the NYSE stock GameStop (GME) and determines the total number of shares bought and sold. Each line of the file is a trade, consisting of a trade ID, stock symbol, trade type, number of shares transacted, and the entity conducting the transaction. The trade type for a given trade comes after '||' and is followed by a single '|'. The number of shares for a given trade comes after '$$' and is followed by a single '$'. The entity conducting the transaction comes after '&&' and is followed by a single '&'. The only two trade types are "BUY" and "SELL". We have provided you with a helper function that, given a line of the file, the string that comes before a value, and the string that comes after that value, gives back the value between those two strings in the line.

Consider the file: gamestop_trades.txt

        
4965 GME ||SELL| $$8$ &&KAREL_CO&  
2725 GME ||SELL| $$13$ &&KAREL_CO& 
9543 GME ||SELL| $$4$ &&J_DOE&  
8390 GME ||BUY| $$3$ &&KAREL_CO&  
9114 GME ||SELL| $$5$ &&NEMO&

If you run parse_trades('gamestop_trades.txt'), it should print:

  
3 shares bought.  
30 shares sold.
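The tallying logic can be sketched as follows. The provided helper's real name and code are not given in this handout, so extract_between is a hypothetical stand-in, and count_shares works on a list of lines rather than a file so the example is easy to run. A full parse_trades would read the lines from the file and print the two totals.

```python
def extract_between(line, before, after):
    """Hypothetical stand-in for the provided helper: return the text
    in line that sits between the strings before and after."""
    start = line.find(before) + len(before)
    end = line.find(after, start)
    return line[start:end]


def count_shares(lines):
    """Return (shares bought, shares sold) across all trade lines."""
    bought = 0
    sold = 0
    for line in lines:
        trade_type = extract_between(line, '||', '|')
        shares = int(extract_between(line, '$$', '$'))
        if trade_type == 'BUY':
            bought += shares
        else:  # the only other trade type is SELL
            sold += shares
    return bought, sold


trades = [
    '4965 GME ||SELL| $$8$ &&KAREL_CO&',
    '2725 GME ||SELL| $$13$ &&KAREL_CO&',
    '9543 GME ||SELL| $$4$ &&J_DOE&',
    '8390 GME ||BUY| $$3$ &&KAREL_CO&',
    '9114 GME ||SELL| $$5$ &&NEMO&',
]
bought, sold = count_shares(trades)
print(str(bought) + ' shares bought.')   # 3 shares bought.
print(str(sold) + ' shares sold.')       # 30 shares sold.
```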

Market Manipulators

Often, when a single entity or group of entities decides to sell a large amount of stock at once, the price of that stock falls. With the rise in automated trading, concerns arise over large groups taking advantage of this trend and dumping stock to deliberately tank a stock price. Write a function, find_percent_seller(filename, entity_name), that takes in the name of a file containing trades for the NYSE stock GameStop (GME) and calculates the percentage of all shares sold that were sold by the passed-in entity. The entity name for a given trade is preceded by '&&' and followed by a single '&'. You may find it helpful to reuse the helper function suggested in the previous section.

Consider the file: gamestop_trades.txt

  
4965 GME ||SELL| $$8$ &&KAREL_CO&  
2725 GME ||SELL| $$13$ &&KAREL_CO& 
9543 GME ||SELL| $$4$ &&J_DOE&  
8390 GME ||BUY| $$3$ &&KAREL_CO&  
9114 GME ||SELL| $$5$ &&NEMO&

If you run find_percent_seller('gamestop_trades.txt', 'KAREL_CO'), it should print:

  
KAREL_CO sold 70 percent of GME stock sold.
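The percentage here is the entity's sold shares divided by all sold shares: KAREL_CO sold 8 + 13 = 21 of the 30 shares sold, and 21 / 30 is 70 percent. A sketch of that calculation follows, again using a hypothetical extract_between stand-in for the provided helper and a list of lines instead of a file. It assumes at least one SELL trade exists, since otherwise the division would fail.

```python
def extract_between(line, before, after):
    """Hypothetical stand-in for the provided helper."""
    start = line.find(before) + len(before)
    end = line.find(after, start)
    return line[start:end]


def percent_sold_by(lines, entity_name):
    """Return the percent of all sold shares that entity_name sold.
    Assumes the lines contain at least one SELL trade."""
    total_sold = 0
    entity_sold = 0
    for line in lines:
        if extract_between(line, '||', '|') == 'SELL':
            shares = int(extract_between(line, '$$', '$'))
            total_sold += shares
            if extract_between(line, '&&', '&') == entity_name:
                entity_sold += shares
    return 100 * entity_sold / total_sold


trades = [
    '4965 GME ||SELL| $$8$ &&KAREL_CO&',
    '2725 GME ||SELL| $$13$ &&KAREL_CO&',
    '9543 GME ||SELL| $$4$ &&J_DOE&',
    '8390 GME ||BUY| $$3$ &&KAREL_CO&',
    '9114 GME ||SELL| $$5$ &&NEMO&',
]
percent = percent_sold_by(trades, 'KAREL_CO')
print('KAREL_CO sold ' + str(round(percent)) + ' percent of GME stock sold.')
```

Note that the BUY trade by KAREL_CO is skipped entirely, so it affects neither the numerator nor the denominator.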

Market Manipulators Ethical Discussion

With the rise in automated trading, the gap between the capabilities of large corporations and individual retail investors in access to the financial markets has grown considerably. The distance to trading centers, computational power, and priority access all have significant impacts on trading ability within the world of automated trading. In what ways has automation widened this gap? In what ways can the same automation that has grown the gap help narrow it (think about what you just implemented!)? And if the gap is a problem, who is responsible for addressing it?

String Construction (Optional)

Implement the following functions:

make_gerund(s): Adds 'ing' to the end of the given string s and returns this new word. If s already ends with 'ing', add 'ly' to the end of s instead. You may assume that s is at least 3 characters long.
put_in_middle(outer, inner): Returns a string where inner has been inserted into the middle of the string outer. To find the middle of a string, take the length of the string and divide it by 2 using integer division. The first half of the string should be all characters leading up to, but not including, the character at this index. The second half should start with the character at this index and include the rest of the characters in the string.
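Both descriptions translate fairly directly into slices and concatenation. The following is one possible sketch, not the only way to write these functions; the example words are made up for illustration.

```python
def make_gerund(s):
    # The last three characters decide which suffix to add.
    if s[-3:] == 'ing':
        return s + 'ly'
    return s + 'ing'


def put_in_middle(outer, inner):
    mid = len(outer) // 2   # integer division, so odd lengths round down
    return outer[:mid] + inner + outer[mid:]


print(make_gerund('walk'))            # walking
print(make_gerund('amazing'))         # amazingly
print(put_in_middle('abcd', '--'))    # ab--cd
print(put_in_middle('abcde', '--'))   # ab--cde
```

For an odd-length outer string, the extra character ends up in the second half, because the midpoint index rounds down.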

Word Puzzle (Optional)

Staccato Words

We say that a word is a staccato word if all of the letters in even positions are vowels (i.e., the second, fourth, sixth, etc. letters are vowels). For this problem, the vowels are A, E, I, O, U, and Y. For example, AUTOMATIC, CAFETERIA, HESITATE, LEGITIMATE, and POPULATE are staccato words. Write a function is_stacatto(word) that returns True if a word is a staccato word and False otherwise. For this problem, you can assume that word will be a string containing uppercase alphabetic characters only.
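Since Python strings are 0-indexed, the second, fourth, and sixth letters sit at indices 1, 3, and 5. One way to visit exactly those indices is range with a step of 2, as in this sketch (the VOWELS constant is a name chosen here for illustration):

```python
VOWELS = 'AEIOUY'


def is_stacatto(word):
    # Even positions (second, fourth, ...) are indices 1, 3, 5, ...
    for i in range(1, len(word), 2):
        if word[i] not in VOWELS:
            return False
    return True


print(is_stacatto('AUTOMATIC'))   # True
print(is_stacatto('PYTHON'))      # False
```

PYTHON fails because its fourth letter, H, is not a vowel, even though its second letter, Y, counts as one for this problem.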