Visualizing data with ggplot from Python

With only a rudimentary knowledge of Python, I wanted to explore rpy2 as a way to combine spatial data analysis done in Python with some higher-level tools in R, in this case the powerful graphics library ggplot2 for visualizing the results.

My setup is Mac OS X 10.7.3, Python 2.7 and R 2.14. (R needs to be compiled with '--enable-R-shlib', which the official CRAN binary for Mac is.) Xcode and NumPy are also required.

No rpy2 binary is available for my configuration, so I downloaded the source (version 2.2.3). Extract the archive somewhere, change into the rpy2-2.2.3 directory and install with:

sudo python setup.py build install
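
A quick way to confirm that the install worked is to try importing the package from Python. The snippet below is only a minimal sketch: the helper name rpy2_status is made up for illustration, and it checks only that the top-level rpy2 package can be imported, not that R itself can be loaded.

```python
import importlib


def rpy2_status():
    """Report whether the rpy2 package can be imported (hypothetical helper)."""
    try:
        rpy2 = importlib.import_module("rpy2")
    except ImportError as exc:
        return "rpy2 is not importable: %s" % exc
    # Fall back to a placeholder if the package exposes no version string.
    version = getattr(rpy2, "__version__", "(unknown version)")
    return "rpy2 %s is installed" % version


print(rpy2_status())
```

If the import fails, the usual suspects are an R build without the shared library or a Python interpreter other than the one the package was installed into.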

The Python code below takes a CSV file (output from some prior geoprocessing done with ArcPy) and produces a graphic with a map and a scatterplot; see the comments for further details.
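
As a rough illustration of this kind of workflow, and not a reproduction of the original listing, here is a minimal sketch that reads a CSV into columns with Python's csv module, hands two of them to R as a data frame through rpy2, and draws a scatterplot with ggplot2. The function names (read_columns, save_scatterplot) and the column names (lon, lat, value) are assumptions made up for the example, and the map part of the graphic is left out.

```python
import csv


def read_columns(lines):
    """Turn CSV text (an open file or a list of lines) into {column: [strings]}."""
    reader = csv.DictReader(lines)
    names = reader.fieldnames or []
    columns = dict((name, []) for name in names)
    for row in reader:
        for name in names:
            columns[name].append(row[name])
    return columns


def save_scatterplot(columns, x_name, y_name, png_path):
    """Plot two numeric columns with ggplot2 and write the result to png_path."""
    # Imported lazily so read_columns() stays usable without R or rpy2.
    import rpy2.robjects as ro

    # Build an R data frame with fixed column names x and y.
    frame = ro.DataFrame({
        "x": ro.FloatVector([float(v) for v in columns[x_name]]),
        "y": ro.FloatVector([float(v) for v in columns[y_name]]),
    })
    # Define the plotting step in R itself and call it from Python.
    draw = ro.r("""
        function(df, path, xlab, ylab) {
            library(ggplot2)
            p <- ggplot(df, aes(x = x, y = y)) + geom_point() +
                labs(x = xlab, y = ylab)
            ggsave(path, p)
        }
    """)
    draw(frame, png_path, x_name, y_name)


# Hypothetical sample rows; the real file's columns are not known here.
sample = ["lon,lat,value\n", "1.5,2.5,10\n", "3.0,4.0,20\n"]
print(read_columns(sample)["value"])
```

With R, ggplot2 and rpy2 installed, a call such as save_scatterplot(read_columns(open("results.csv")), "lon", "value", "scatter.png") would write the plot to disk; results.csv and scatter.png are placeholder names. Keeping the plotting code as an R function makes it easy to paste in ggplot2 snippets unchanged, at the cost of mixing two languages in one file.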

Data can be downloaded here.

  1. Curious why you didn't use R's read.csv function?

  2. More to the point, why not use pandas?

  3. Is it now possible to use ggplot with pandas or simple Python objects/dataframes and skip the R stuff altogether, or do you have to be working with R objects to plot with ggplot?

    (I'd be interested in using ggplot instead of the matplotlib alternatives, but only if it were frictionless.)

