L20

Today: dict output, wordcount.py example program, list functions, List patterns, state-machine pattern

Dict Load-Up vs. Output

Thus far we have concentrated on loading data into the dict
e.g. dict-count algorithm
Typical program pattern: read data file, load up dict, then do something with dict
BabyNames has this pattern
Possible output stage: print out all the data organized in the dict
e.g. wordcount below - read file, print out list of all words with their counts

Have Dict - How To See Contents?

Say you have loaded up a dict with your data
How do you loop over the whole thing?
dict.keys()

dict.keys()

The function dict.keys() returns a list-like collection of the dict keys
Loop over dict.keys() to access every key/value in dict
Recall: don't modify a thing while looping over it (list, dict)
The keys are in a "random" order
Actually it's the order they were added
But it's random-looking to the end user
Also dict.values()
List-like collection of all the values
Used less often than .keys()

>>> # Load up dict
>>> d = {}
>>> d['a'] = 'alpha'
>>> d['g'] = 'gamma'
>>> d['b'] = 'beta'
>>>
>>> # d.keys() - list-like "iterable" of keys,
>>> # loop over keys to see all of dict
>>> d.keys()
dict_keys(['a', 'g', 'b'])
>>>
>>> # d.values() - list of values, not used so often
>>> d.values()
dict_values(['alpha', 'gamma', 'beta'])
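
A minimal sketch of the "don't modify a thing while looping over it" rule. The function names and the sample dict here are made up for illustration. Removing a key with del (covered later in these notes) while looping over d.keys() raises an error in Python; looping over a separate list copy of the keys avoids that.

```python
def remove_short_keys_unsafe(d):
    # Modifies d while looping over it - Python raises RuntimeError
    for key in d.keys():
        if len(key) < 2:
            del d[key]


def remove_short_keys_safe(d):
    # list(...) makes a separate copy of the keys, so d itself
    # can change during the loop
    for key in list(d.keys()):
        if len(key) < 2:
            del d[key]


d = {'a': 'alpha', 'gg': 'gamma', 'b': 'beta'}
try:
    remove_short_keys_unsafe(d)
except RuntimeError as e:
    print('error:', e)

d = {'a': 'alpha', 'gg': 'gamma', 'b': 'beta'}
remove_short_keys_safe(d)
print(d)   # {'gg': 'gamma'}
```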

Dict Output v1

Say we want to print the contents of a dict. Loop over d.keys(), for each key, look up the value for that key. This works fine and accesses all of the dict. The only problem is that the keys are in random order.

>>> d = {'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
>>> for key in d.keys():
...   print(key, '->', d[key])
... 
a -> alpha
g -> gamma
b -> beta

Dict Output v2 - `sorted(d.keys())`

The function sorted(xxx) takes in any linear collection
Returns a new list with those elements, sorted into increasing order
We will explore sorted() in more detail later
Works with d.keys()
Below is the standard way to print out a dict with the keys in sensible order
The wordcount example will use this pattern

>>> d.keys()          # random order - not pretty
dict_keys(['a', 'g', 'b'])
>>>
>>> sorted(d.keys())  # sorted order - nice
['a', 'b', 'g']
>>>
>>> for key in sorted(d.keys()):
...   print(key, '->', d[key])
... 
a -> alpha
b -> beta
g -> gamma

WordCount Example Code - Rosetta Stone!

The wordcount program below reads in a text, separates out all the words, builds a count dict to count how often each word appears, and finally produces a report with all the words in alphabetical order, each with its count, like this:

aardvark 1
anvil 3
ban 1
be 19
boat 4
...

The word count program is a sort of Rosetta stone of coding - it's a working demonstration of many important features of a computer program: strings, dicts, loops, parameters, return, decomposition, testing, sorting, files, and main(). The complete text of wordcount.py is at the end of this page for reference. If you need to remember some bit of Python syntax, there's a good chance there's an example in this program. Or if you are learning a new computer language, you could try to figure out how to write wordcount in the new language to get started.

wordcount.zip

Sample Run

Note how non-alphabetic chars are cleaned off the words, and the words are converted to lowercase.

$ cat poem.txt
Roses are red
Violets are blue
"RED" BLUE.
$
$ python3 wordcount.py poem.txt 
are 2
blue 2
red 2
roses 1
violets 1
$
$ python3 wordcount.py alice-book.txt   # whole book
...lots...
yourself 10
youth 6
zealand 1
zigzag 1
$

Look at source code for it.

1. clean(s) - Utility Code

The clean(s) function is used to clean punctuation from the edges of words, e.g. given '--woot!' it extracts just 'woot'. It is written as a black-box function with Doctests, of course! The counting code will use this to clean up each word as it goes.

def clean(s):
    """
    Given string s, returns a clean version of s where all non-alpha
    chars are removed from beginning and end, so '@@hi^^' yields 'hi'.
    The resulting string will be empty if there are no alpha chars.
    >>> clean('$abc^')      # basic
    'abc'
    ...
    """

2. read_counts(filename) - Central Algorithm

def read_counts(filename):
    """
    Given filename, reads its text, splits it into words.
    Returns a "counts" dict where each word
    is the key and its value is the int count
    number of times it appears in the text.
    Converts each word to a "clean", lowercase
    version of that word.
    The Doctests use little files like "test1.txt" in
    this same folder.
    >>> read_counts('test1.txt')
    {'a': 2, 'b': 2}
    >>> read_counts('test2.txt')    # Q: why is b first here?
    {'b': 1, 'a': 2}
    >>> read_counts('test3.txt')
    {'bob': 1}
    """
    with open(filename) as f:
        text = f.read()  # read file as string vs for/line/in way

    # .split() with no parameters splits on whitespace,
    # and \n counts as whitespace, so get whole file
    # ['Roses', 'are', 'red', 'Violets', 'are', ...]
    words = text.split()

    counts = {}
    for word in words:
        word = word.lower()
        cleaned = clean(word)  # style: call clean() once, store in var
        if cleaned != '':      # subtle - cleaning may leave only ''
            if cleaned not in counts:
                counts[cleaned] = 0
            counts[cleaned] += 1
    return counts
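
To experiment with the dict-count part without any files, here is a simplified sketch. The name count_words() is hypothetical, and this version skips the clean() step, so it only behaves like read_counts() on text without punctuation. It also shows that the keys of the returned dict are in the order the words were first seen.

```python
def count_words(text):
    """
    Given a string of text, return a counts dict of its
    lowercase words. Simplified: no clean() step.
    """
    counts = {}
    for word in text.split():
        word = word.lower()
        if word not in counts:
            counts[word] = 0
        counts[word] += 1
    return counts


print(count_words('Roses are red\nViolets are blue\n'))
# {'roses': 1, 'are': 2, 'red': 1, 'violets': 1, 'blue': 1}

print(count_words('b a a'))
# {'b': 1, 'a': 2}   - 'b' was added to the dict first
```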

3. print_counts(counts) - Output Code

In the print_counts() function, the counts dict is passed in and the totally standard dict output code is used to print out the words and their counts, one per line, in alphabetical order.

def print_counts(counts):
    """
    Given counts dict, print out each word and count
    one per line in alphabetical order, like this
    aardvark 1
    apple 13
    ...
    """
    for word in sorted(counts.keys()):
        print(word, counts[word])

Whole Program - How Functions Work

The wordcount program shows how the big-picture program can be built up with individual black-box functions.

First, look at the individual read_counts() and print_counts() functions, just in terms of their black-box inputs and outputs.

1. read_counts(filename)

Black box view of it - just input and output.

Given filename, read the text of that file, build and return a counts dict of all the words in that file in lowercase form. So as black box, this takes in a filename and returns a counts dict.

def read_counts(filename):
    """
    filename string in -> counts dict out
    ...
    """

2. print_counts(counts)

Given the counts dict, prints out its contents in alphabetical order. So as black box, this takes in a counts dict, and writes to standard output, returning nothing.

def print_counts(counts):
    """
    counts dict in -> prints to standard output
    returns nothing
    ...
    """

3. main() code

We want to break the program into many little black-box functions. But how do they fit together? How does the data get from one to another?

Look at the read_counts() line in main(). It calls read_counts() passing the filename in, getting back a counts dict which is stored in a "counts" variable. On the next line, main() calls print_counts(), passing in the new counts dict for printing. This shows how to knit together black box functions - feeding each the input it needs, getting its output, passing that on as the input to the next function.

def main():
    args = sys.argv[1:]
    ...

    # args[0] is filename
    if len(args) == 1:
        counts = read_counts(args[0])
        print_counts(counts)
    ...

alt: wordcount main() calling its helper functions

Timing Tests (optional)

Try more realistic files. Try the file alice-book.txt - the full text of Alice in Wonderland, 27,000 words. tale-of-two-cities.txt - full text, 133,000 words. Time the run of the program with the command line "time" command, and see if the dict/hash-table is as fast as they say. The second run will be a little faster, as the file is cached by the operating system.

$ time python3 wordcount.py alice-book.txt
...
...
youth 6
zealand 1
zigzag 1

real	0m0.103s
user	0m0.079s
sys	0m0.019s

Here "real 0.103s" means regular clock time, 0.103 of a second, aka 103 milliseconds, aka about a tenth of a second elapsed to run this command.

Note: on Windows, you need the PowerShell terminal, not the more primitive terminal it uses by default. Here are instructions for enabling PowerShell.

Windows PowerShell equivalent to "time" the run of a command:

$ Measure-Command { py wordcount.py alice-book.txt }

Let's try it with the book A Tale of Two Cities

$ time python3 wordcount-solution.py tale-of-two-cities.txt
... lots of printing ...
zealous 2

real	0m0.216s
user	0m0.180s
sys	0m0.027s
$

So that takes 0.21 seconds. There are about 133,000 words in the Tale of Two Cities. Here are the key lines for each word:

    if word not in counts:   # 1 dict "in"
        counts[word] = 0
    counts[word] += 1        # 1 dict get, 1 set

Each word hits the dict 3 times: 1x "in", then at least 1 get and 1 set for the +=. So how long does each dict access take?

>>> 0.21 / (133000 * 3)
5.263157894736842e-07

Ten to the -7 is a tenth of a millionth, so with our back-of-envelope math here, the dict is taking about 1/2 of a millionth of a second per dict access. In reality it's faster than that, as we are not separating out the time for the file reading and parsing which went into the 0.21 seconds. Nonetheless the basic claim about dicts holds up here - access per key is super fast, even if the number of keys is large.
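
The same back-of-envelope measurement can be sketched inside Python, leaving out the file reading. This is only a rough illustration: the function name is made up, the keys are ints rather than words, and the time still includes the overhead of the loop itself, so the number is an upper bound on the dict cost.

```python
import time


def time_dict_accesses(n):
    """
    Run the dict-count lines n times and return a rough
    seconds-per-dict-access figure.
    """
    counts = {}
    start = time.perf_counter()
    for i in range(n):
        key = i % 1000           # 1000 distinct keys, mostly repeats
        if key not in counts:    # 1 dict "in"
            counts[key] = 0
        counts[key] += 1         # 1 dict get, 1 set
    elapsed = time.perf_counter() - start
    return elapsed / (n * 3)


per_access = time_dict_accesses(100000)
print(per_access)   # varies by machine
```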

List 1.0 Features

List Slices

Slices work with lists
Exactly like Strings
lst[start:end]
Elements starting at start
Up to but not including end UBNI
Creates a new list
Populated with elements from original list
lst[:] copies the whole list
lst[-1] is the last element

>>> lst = ['a', 'b', 'c']
>>> lst2 = lst[1:]   # slice without first elem
>>> lst2
['b', 'c']
>>> lst
['a', 'b', 'c']
>>> lst3 = lst[:]    # copy whole list
>>> lst3
['a', 'b', 'c']
>>> # can prove lst3 is a copy, modify lst
>>> lst[0] = 'xxx'
>>> lst
['xxx', 'b', 'c']
>>> lst3
['a', 'b', 'c']

`lst.pop([optional index])`

Need something the opposite of .append()
How to pull elements out of a list, shrinking it?
lst.pop() - removes last elem from list, returning it
Mnemonic: opposite of append()
pop(index) - takes an optional index number
Pops off that element instead of the end one
pop(0) - pops the first element
The list elements are kept in a contiguous block
Shifting elements over so they are indexed 0..len(lst)-1
Error if the list is empty - reasonable!

>>> lst = ['a', 'b', 'c']
>>> lst
['a', 'b', 'c']
>>> lst.pop()   # opposite of append
'c'
>>> lst
['a', 'b']
>>> lst.pop(0)  # can specify index
'a'
>>> lst
['b']
>>> lst.pop()
'b'
>>> lst.pop()
IndexError: pop from empty list
>>>
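
Since pop() on an empty list is an error, code that pops repeatedly typically checks the length first. A small sketch, with a made-up function name:

```python
def drain(lst):
    """Pop every element off lst, returning them in pop order."""
    popped = []
    while len(lst) > 0:       # guard: never pop from an empty list
        popped.append(lst.pop())
    return popped


letters = ['a', 'b', 'c']
print(drain(letters))   # ['c', 'b', 'a']
print(letters)          # []
```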

`del` on Lists

The del operator deletes parts of structures
Its syntax is odd - the word del followed by an expression identifying the thing to delete
It works with a few types, but in particular it works with lists
del lst[1] - deletes element at index 1 in the list
del lst[:2] - deletes the whole slice
Elements in the list are shifted around after the delete, so they are still index 0..len-1
Building up a data structure is very common, deletion less so
We show how to delete here, but don't expect to use it often

>>> lst = ['a', 'b', 'c', 'd', 'e']
>>> 
>>> lst
['a', 'b', 'c', 'd', 'e']
>>> del lst[1]         # del at index 1
>>> lst
['a', 'c', 'd', 'e']   # now it's gone
>>> 
>>> del lst[:2]        # del slice of first 2 elems
>>> lst
['d', 'e']

`del` on Dict

del works with dict too
del d['a'] removes that key/value

>>> d
{'a': 'alpha', 'g': 'gamma', 'b': 'beta'}
>>> 
>>> del d['b']    # del key/value
>>> 
>>> d
{'a': 'alpha', 'g': 'gamma'}
>>> 
>>> del d['x']    # key must exist or error
KeyError: 'x'

Bonus: `del` Variable

How do you set a variable?
=
del can un-set a variable, like it was never there
A move of questionable utility

>>> x = 6
>>> 
>>> x
6
>>> 
>>> del x
>>> 
>>> x
NameError: name 'x' is not defined
>>>

List 2.0 Features

Below are more rare list features. If a CS106A problem would use one of these, the problem statement will mention it. We're just mentioning them for completeness.

Here are some "list2" practice problems on the server if you are curious, but you do not need to do these.

> list2 exercises

lst.extend(lst2)

Unlikely to use in CS106A - mentioning for completeness
lst.extend(lst2) - Make lst longer with lst2 elements
a = [1, 2]
b = [3, 4]
a.extend(b)
Now a is [1, 2, 3, 4]
append() is super common
extend() is the related, rare function
See questions below:

>>> a = [1, 2, 3]
>>> b = [4, 5]
>>> a.append(b)
>>> # Q1 What is a now?
>>>
>>>
>>>
>>> a
[1, 2, 3, [4, 5]]
>>>
>>> c = [1, 3, 5]
>>> d = [2, 4]
>>> c.extend(d)
>>> # Q2 What is c now?
>>>
>>>
>>>
>>> c
[1, 3, 5, 2, 4]

Alternative: lst1 + lst2

lst1 + lst2 - create bigger list of all their elements
Maybe easier to understand than extend()
Like string
+ leaves the original lists unchanged
Constructs a new list to hold answer
vs .extend() modifies existing list

>>> a = [1, 2, 3]
>>> b = [9, 10]
>>> a + b
[1, 2, 3, 9, 10]
>>> a   # original is still there
[1, 2, 3]
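
One way to see the difference between modifying a list and constructing a new one is to keep a second variable pointing at the original list. A short sketch, with variable names chosen just for this example:

```python
a = [1, 2]
alias = a            # alias and a refer to the same list
a.extend([3, 4])     # modifies that one list in place
print(alias)         # [1, 2, 3, 4]

b = [1, 2]
other = b
b = b + [3, 4]       # + builds a new list, b now refers to it
print(other)         # [1, 2]  - original list unchanged
print(b)             # [1, 2, 3, 4]
```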

`lst.insert(index, elem)`

Unlikely to use in CS106A - mentioning for completeness
lst.insert(index, elem) - insert at given index
Alternative to append()
Elements in list are shifted over automatically

>>> lst = ['a', 'b']
>>> lst.insert(0, 'z')
>>> lst
['z', 'a', 'b']

`lst.remove(target)`

Unlikely to use in CS106A - mentioning for completeness
lst.remove(xxx) - search for and remove first xxx elem
Error if it's not there already - use in to check
Observe: append(), extend(), pop(), insert(), remove() .. all modify the list
In contrast to strings, which are immutable, so string functions always return new strings

>>> lst = ['a', 'b', 'c', 'b']
>>> lst.remove('b')
>>> lst
['a', 'c', 'b']
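
A sketch of the "use in to check" advice. The helper name remove_if_present() is hypothetical, not a built-in list function.

```python
def remove_if_present(lst, target):
    """
    Remove the first target from lst if it is in there.
    Return True if something was removed, False otherwise.
    """
    if target in lst:
        lst.remove(target)
        return True
    return False


lst = ['a', 'b', 'c', 'b']
try:
    lst.remove('x')              # not in list -> ValueError
except ValueError as e:
    print('error:', e)

print(remove_if_present(lst, 'x'))   # False
print(remove_if_present(lst, 'b'))   # True
print(lst)                           # ['a', 'c', 'b']
```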

Now we'll look at some functions that are related to lists, and we will use all of these.

1. sorted()

sorted() takes in list, or list-like collection
e.g. range() or dict.keys()
sorted uses the operator <
5 < 6 -> True
'apple' < 'banana' -> True
Creates and returns increasing order sorted list
Original list is not changed
int elements - numeric ordering
string elements - alphabetical, starting with leftmost char
Uppercase before lowercase, deal with this later
has reverse=True optional "named" parameter
Named params like this: no space around =
Error to mix int/str elements
Remember: sorting is somewhat costly, don't do it for no reason
CS106B: implement your own sorting

>>> sorted([45, 100, 2, 12])               # numeric
[2, 12, 45, 100]
>>> 
>>> sorted([45, 100, 2, 12], reverse=True)
[100, 45, 12, 2]
>>> 
>>> sorted(['banana', 'apple', 'donut'])   # alphabetic
['apple', 'banana', 'donut']
>>>
>>> sorted(['45', '100', '2', '12'])       # fix later
['100', '12', '2', '45']
>>> 
>>> sorted(['45', '100', '2', '12', 13])
TypeError: '<' not supported between instances of 'int' and 'str'
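
A small sketch of two points from the list above, using made-up sample words: uppercase chars sort before lowercase chars, and sorted() leaves the original list unchanged.

```python
words = ['banana', 'Apple', 'cherry', 'Zebra']

result = sorted(words)
print(result)   # ['Apple', 'Zebra', 'banana', 'cherry']
print(words)    # ['banana', 'Apple', 'cherry', 'Zebra']  - unchanged

backwards = sorted(words, reverse=True)
print(backwards)   # ['cherry', 'banana', 'Zebra', 'Apple']
```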

2. min(), max()

These are related to sorted() - returning 1 elem
Use this builtin to pick out smallest/largest value
Works with several params, or with a list
Works with int
Works with str
Works with anything where "<" has meaning
Error with empty list, must have at least 1 value
Note: not the object noun.verb style - a plain function like sorted()
min()/max() much faster than sorted() - use these if just need the one value
Style reminder:
Don't use the name of a built-in function as a variable name
e.g. don't use "min" or "max" as a var name, though it's very tempting!

>>> min([1, 3, 2])
1
>>> max([1, 3, 2])
3
>>> min([1])        # len-1 works
1
>>> min([])         # len-0 is an error
ValueError: min() arg is an empty sequence
>>>
>>> min(['banana', 'apple', 'zebra'])  # strs work too
'apple'
>>> max(['banana', 'apple', 'zebra'])
'zebra'
>>>
>>> min(1, 3, 2)  # w/o list form
1
>>> max(1, 3, 2)
3
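
Here is a sketch of what goes wrong when "min" is used as a variable name. The function shadow_demo() is made up for this illustration; inside it, the variable hides the built-in function, so the later call fails.

```python
def shadow_demo(nums):
    min = nums[0]          # bad style: var named "min" hides the built-in
    try:
        return min(nums)   # "min" is now an int, not a function
    except TypeError as e:
        return 'TypeError: ' + str(e)


message = shadow_demo([1, 3, 2])
print(message)   # TypeError: 'int' object is not callable
```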

3. sum()

Compute the sum of a collection of ints or floats, like +.

>>> nums = [1, 2, 1, 5]
>>> sum(nums)
9

List Code Pattern Examples

Look at the "listpat" exercises on the experimental server

> listpat exercises

Patterns

As we've seen, when you're looking at a problem, you never want to think of it as this brand new thing you've never solved before. There are always parts of it that are idiomatic, or following some pattern you've seen before.

State-Machine Pattern

Have a "state" variable
1. Init the variable before the loop (short for "initialize")
2. Loop over the elements. For each element, look at or update the state
3. After the loop, use state variable to compute the final result
Strategy idea: push complexity into the variable, less code overall

alt: state machine aside list

Recall: += Accumulate Result

Many functions we've done before actually fit the state-machine pattern, e.g. double_char(). We have a "result" set to empty before the loop. Something gets appended to result in the loop, possibly with some logic. We've used this with both strings (+=) and lists (.append).

# 1. init state before loop
result = ''

loop:
   ....
   # 2. update state in the loop
   if xxx:
       result += yyy


# 3. Use state to compute result
return result
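
As a concrete instance of the outline above, here is a sketch assuming double_char(s) returns s with every char doubled, as in the earlier exercise of that name. The list function only_evens() is a made-up example of the same three steps with .append and some logic in the loop.

```python
def double_char(s):
    result = ''               # 1. init state before loop
    for ch in s:
        result += ch + ch     # 2. update state in the loop
    return result             # 3. use state as the result


def only_evens(nums):
    result = []               # 1. init
    for num in nums:
        if num % 2 == 0:      # 2. update, with some logic
            result.append(num)
    return result             # 3. result


print(double_char('abc'))         # aabbcc
print(only_evens([1, 2, 3, 4]))   # [2, 4]
```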

State-Machine Exercise: min()

Use the state-machine strategy to solve something a little more interesting.

> min()

min() function - exercise or example
Given list of numbers, return the min value
Don't sort the numbers - unnecessarily costly
Strategy
Keep "best" state variable - smallest element seen
What is the init value of best?
Foreach over the numbers, update best for each number
Is this number I'm looking at the new best?
aka "King Of The Mountain" game on playground
Or The Wire "You come at the king, you best not miss"

The style "len rule": we have Python built-in functions like len() min() max() list(). Avoid creating a variable with the same name as an important function, like "min" or "list". This is why our solution uses "best" as the variable to keep track of the smallest value seen so far instead of "min".

min() Solution

def min(nums):
    # best tracks smallest value seen so far.
    # Compare each element to it.
    best = nums[0]
    for num in nums:
        if num < best:
            best = num
    return best

min(): Init value of best?

What init value for best variable?
Might think - use max possible int value
Doesn't work that well, weirdly Python does not have a max int
Using nums[0] is a nice, correct strategy here
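
One consequence of the nums[0] init: the list must have at least 1 element, matching the built-in min(). A sketch under a different, made-up name, smallest(), with an explicit check added as an assumption about how one might want the empty case reported. Since nums[0] is an element of the list, this also works for strings, where a "max possible int" init would not.

```python
def smallest(nums):
    """Return the smallest element of a non-empty list."""
    if len(nums) == 0:
        # Assumption: mirror the built-in min() and treat empty as an error
        raise ValueError('smallest() needs at least 1 element')
    best = nums[0]
    for num in nums:
        if num < best:
            best = num
    return best


print(smallest([3, 1, 2]))                     # 1
print(smallest(['banana', 'apple', 'zebra']))  # apple
```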

State-Machine - digit_decode()

> digit_decode()

Say we have a code where most of the chars in s are garbage. Except each time there is a digit in s, the next char goes in the output. Maybe you could use this to keep your parents out of your text messages in high school.

'xxyy9H%vvv%2i%t6!' -> 'Hi!'

How Might You Solve This?

I can imagine writing a while loop to find the first digit, then taking the next char, .. then the while loop again ... ugh.

Strategy: take_next State-Machine

Have a boolean variable take_next which is True if the next char should be taken (i.e. the char of the next iteration of the loop) and False otherwise.

Write a nice, plain loop through all the chars. Set take_next to True when you see a digit. For each char, look at take_next to see if it should be taken. The details of the code in the loop are a little tricky.

alt: set take_next to True for each digit

Just Try It

Type in some code that is an attempt. Run it, see the output, work from there. The bugs are tricky, but for this problem, easier to work out by running and seeing the output. Put some code in there and set about debugging it.

You could solve this using index numbers and -1. However, it's worth working out this state-machine approach which does not rely on index numbers at all.

digit_decode() Solution

def digit_decode(s):
    result = ''
    take_next = False
    for ch in s:
        if take_next:
            result += ch
            take_next = False
        if ch.isdigit():
            take_next = True
        # Set take_next at the bottom of the
        # loop, taking effect on the next char
        # at the top of the loop.
    return result
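
For comparison, here is a sketch of the index-number approach mentioned earlier, looking back at index i - 1 for each char. The name digit_decode_index() is made up for this example. It gives the same output, but needs care to start the loop at index 1 so that i - 1 is never negative.

```python
def digit_decode_index(s):
    result = ''
    # Start at 1, so s[i - 1] is always a valid char to the left
    for i in range(1, len(s)):
        if s[i - 1].isdigit():
            result += s[i]
    return result


print(digit_decode_index('xxyy9H%vvv%2i%t6!'))   # Hi!
```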

Later Practice - upper_hat()

A more difficult state-machine problem for more practice

> upper_hat()

State-Machine - "previous" Technique

A classic state-machine technique (CS106B uses this one)
Challenge: how many elems are the same as the elem to their left
Have a "previous" state var
Before the loop, init previous with a harmless value, e.g. None or ''
Last line in loop: previous = elem
Then for each loop iteration:
Have current element
Have "previous", the value from the previous loop iteration

Previous pattern:

# 1. Init with not-in-list value
previous = None

for elem in lst:
    # 2. Use elem and previous in loop
    
    # 3. last line in loop:
    previous = elem

Previous Drawing

Here is a visualization of the "previous" strategy - the previous variable points to None, or some other chosen init value for the first iteration of the loop. For later loops, the previous variable lags one behind, pointing to the value from the previous iteration.

alt: previous and num walking down list

Example - count_dups()

> count_dups()

count_dups(): Given a list of numbers, count how many "duplicates" there are in the list - a number the same as the value immediately before it in the list. Use a "previous" variable.

count_dups() Solution

The init value just needs to be some harmless value such that the == test will be False. None often works for this.

def count_dups(nums):
    count = 0
    previous = None      # init
    for num in nums:
        if num == previous:
            count += 1
        previous = num   # set for next loop
    return count
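
To see why the init value should be something not in the list, here is a sketch of the same loop with the init value passed in as a parameter. The name count_dups_init() and its extra parameter are made up for this experiment.

```python
def count_dups_init(lst, init):
    """Like count_dups(), but with the init value of previous passed in."""
    count = 0
    previous = init
    for elem in lst:
        if elem == previous:
            count += 1
        previous = elem
    return count


print(count_dups_init([1, 1, 2, 2, 2, 3], None))   # 3
print(count_dups_init(['', 'a', 'a'], None))       # 1  - correct
print(count_dups_init(['', 'a', 'a'], ''))         # 2  - init matched first elem
```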

State-Machine Challenge - hat_decode()

A neat example of a state-machine approach. Optional - just if we have time.

The "hat" code is a more complex way to hide some text inside some other text. The string s is mostly made of garbage chars to ignore. However, '^' marks the beginning of actual message chars, and '.' marks their end. Grab the chars between the '^' and the '.', ignoring the others:

'xx^Ya.xx^y!.bb' -> 'Yay!'

Solve using a state-variable "copying" which is True when chars should be copied to the output and False when they should be ignored. Strategy idea: (1) write code to set the copying variable in the loop. (2) write code that looks at the copying variable to either add chars to the result or ignore them.

alt: copying==True for chars to copy within s

There is a very subtle issue about where the '^' and '.' checks go in the loop. Write the code the first way you can think of, setting copying to True and False when seeing the appropriate chars. Run the code, even if it's not going to be perfect. If it's not right (very common!), look at the output you got. Why are extra chars in there? How to rearrange the loop to fix it?
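
For checking your work after trying it yourself, here is one possible arrangement of the loop. It is a sketch, not necessarily the official solution. The '.' check comes before the copy step and the '^' check comes after it, so neither marker char ends up in the result.

```python
def hat_decode(s):
    result = ''
    copying = False
    for ch in s:
        if ch == '.':        # turn off first, so '.' is not copied
            copying = False
        if copying:
            result += ch
        if ch == '^':        # turn on last, so '^' is not copied
            copying = True
    return result


print(hat_decode('xx^Ya.xx^y!.bb'))   # Yay!
```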

For reference - source code for wordcount

WordCount Example Code

#!/usr/bin/env python3

"""
Stanford CS106A WordCount Example
Nick Parlante

Counting the words in a text file is a sort
of Rosetta Stone of programming - it uses files, dicts, functions,
loops, logic, decomposition, testing, command line in main().
Trace the flow of data starting with main().
There is a sorted/lambda exercise below.

Code is provided for alphabetical output like:
$ python3 wordcount.py somefile.txt
aardvark 1
anvil 3
boat 4
...

**Exercise**

Implement code in print_top() to print the n most common words,
using sorted/lambda/items.

Then command line -top n feature calls print_top() for output like:
$ python3 wordcount.py -top 10 alice-book.txt
the 1639
and 866
to 725
a 631
she 541
it 530
of 511
said 462
i 410
alice 386
"""

import sys


def clean(s):
    """
    Given string s, returns a clean version of s where all non-alpha
    chars are removed from beginning and end, so '@@hi^^' yields 'hi'.
    The resulting string will be empty if there are no alpha chars.
    >>> clean('$abc^')      # basic
    'abc'
    >>> clean('abc$$')
    'abc'
    >>> clean('^x^')        # short (debug)
    'x'
    >>> clean('abc')        # edge cases
    'abc'
    >>> clean('$$$')
    ''
    >>> clean('')
    ''
    """
    # Move begin rightwards, past non-alpha punctuation
    begin = 0
    while begin < len(s) and not s[begin].isalpha():
        begin += 1

    # Move end leftwards, past non-alpha
    end = len(s) - 1
    while end >= begin and not s[end].isalpha():
        end -= 1

    # begin/end cross each other -> nothing left
    if end < begin:
        return ''
    return s[begin:end + 1]


def read_counts(filename):
    """
    Given filename, reads its text, splits it into words.
    Returns a "counts" dict where each word
    is the key and its value is the int count
    number of times it appears in the text.
    Converts each word to a "clean", lowercase
    version of that word.
    The Doctests use little files like "test1.txt" in
    this same folder.
    >>> read_counts('test1.txt')
    {'a': 2, 'b': 2}
    >>> read_counts('test2.txt')    # Q: why is b first here?
    {'b': 1, 'a': 2}
    >>> read_counts('test3.txt')
    {'bob': 1}
    """
    with open(filename) as f:
        text = f.read()  # read file as string vs for/line/in way

    # .split() with no parameters splits on whitespace,
    # and \n counts as whitespace, so get whole file
    # ['Roses', 'are', 'red', 'Violets', 'are', ...]
    words = text.split()

    counts = {}
    for word in words:
        word = word.lower()
        cleaned = clean(word)  # style: call clean() once, store in var
        if cleaned != '':      # subtle - cleaning may leave only ''
            if cleaned not in counts:
                counts[cleaned] = 0
            counts[cleaned] += 1
    return counts


def print_counts(counts):
    """
    Given counts dict, print out each word and count
    one per line in alphabetical order, like this
    aardvark 1
    apple 13
    ...
    """
    for word in sorted(counts.keys()):
        print(word, counts[word])
    # Alternately can use counts.items() to access all key/value pairs
    # in one step.
    # for key, value in sorted(counts.items()):
    #    print(key, value)


def print_top(counts, n):
    """
    (Exercise)
    Given counts dict and int n, print the n most common words
    in decreasing order of count
    the 1639
    and 866
    to 725
    ...
    """
    items = counts.items()
    # To get a start writing the code, could print raw items to
    # get an idea of what we have.
    # print(items)

    # Your code here - our solution is 3 lines long, but it's dense!
    # Hint:
    # Sort the items with a lambda so the most common words are first.
    # Then print just the first n word,count pairs
    # 1. Sort largest count first
    items = sorted(items, key=lambda pair: pair[1], reverse=True)
    # 2. Loop over slice of first n
    for word, count in items[:n]:
        print(word, count)


def main():
    # (provided)
    # Command line forms
    # 1. filename
    # 2. -top n filename   # prints n most common words
    args = sys.argv[1:]

    if len(args) == 1:
        # filename
        counts = read_counts(args[0])
        print_counts(counts)

    if len(args) == 3 and args[0] == '-top':
        # -top n filename
        n = int(args[1])
        counts = read_counts(args[2])
        print_top(counts, n)


if __name__ == '__main__':
    main()