In this example, we will extract a table of monthly and annual rainfall from 8 Northern California stations. The data is made available by the state department of water resources.

On inspecting the source of the page, the data is contained near lines 475-588. Here, we'll extract it and save it in a file.

In [1]:
import urllib
data = urllib.urlopen('http://cdec.water.ca.gov/cgi-progs/precip1/8STATIONHIST').read().split('\n')[474:587]
data[:10]
Out[1]:
['  1921  5.25 12.38 11.52 13.12  3.76  5.30  0.94  3.05  0.65  0.00  0.00  0.02  55.99',
 '  1922  1.39  3.16 11.22  3.21 14.42  8.37  1.58  2.22  0.98  0.14  0.08  0.01  46.78',
 '  1923  3.59  6.01 11.79  5.95  1.93  0.49  6.86  0.93  2.09  0.20  0.40  2.75  42.99',
 '  1924  2.15  0.46  2.77  3.55  3.94  2.67  0.89  0.05  0.08  0.00  0.14  0.40  17.10',
 '  1925  6.63  4.71  6.01  3.47 15.21  4.51  5.46  2.14  1.52  0.11  0.83  2.45  53.05',
 '',
 '  1926  1.90  3.53  2.88  5.63 11.55  0.61  6.45  1.49  0.00  0.00  0.24  0.08  34.36',
 '  1927  2.98 18.82  3.51  9.08 14.70  4.01  7.05  1.63  1.00  0.00  0.03  0.37  63.18',
 '  1928  4.31 10.83  3.82  4.81  3.45 12.84  4.21  0.63  0.42  0.00  0.05  0.35  45.72',
 '  1929  1.02  5.65  5.71  2.33  4.00  3.65  4.10  0.10  2.84  0.00  0.01  0.00  29.41']

The data contains some blank lines. We will remove these and split the lines into separate elements:

In [2]:
data = [l.split() for l in data if l]

Now let's write this to a file.

In [3]:
import csv
outfile = file('rainfall.csv', 'w')
writer = csv.writer(outfile)
for row in data:
    writer.writerow(row)
outfile.close()