Stanford MS&E 226 – Fundamentals of Data Science

Downloading and using R

There is a computational component to this class. While you may use either Python or R, the “official” language of the course is R.

An easy interface to R that you can use on your local machine is RStudio Desktop, which is available free for non-commercial use.

R is powerful in part because of the wide range of packages that extend its capabilities. After downloading and installing R, you will find it helpful to also install the following packages:

  1. tidyverse: A collection of useful packages, including ggplot2 and dplyr.

  2. arm: A set of helper functions from Andrew Gelman and Jennifer Hill's book, Data Analysis Using Regression and Multilevel/Hierarchical Models.

To install a package, run install.packages('<package_name>') at the R command prompt. Installation is needed only once per machine.

To load a package, run library('<package_name>') at the R command prompt. Loading must be repeated in each new R session.
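
The install and load steps for the two packages named above can be sketched as follows. This is a minimal sketch, not an official course script: the helper names missing_packages and setup_course_packages are hypothetical, and the install step assumes an internet connection and a reachable CRAN mirror. The setup call is guarded by interactive() so that nothing is downloaded when the file is sourced non-interactively.

```r
# Packages named in the course notes.
course_packages <- c("tidyverse", "arm")

# Hypothetical helper (not part of R or the course materials):
# return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: install whatever is missing, then load everything.
setup_course_packages <- function(pkgs = course_packages) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    # Assumption: a CRAN mirror is configured and reachable.
    install.packages(to_install)
  }
  for (pkg in pkgs) {
    # character.only = TRUE lets library() take the name from a variable.
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Run the setup only when typing at the R prompt (e.g. in RStudio).
if (interactive()) {
  setup_course_packages()
}
```

Typing install.packages('tidyverse') followed by library('tidyverse') at the prompt does the same thing by hand; the helper only avoids reinstalling packages that are already present.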

(Note that LLMs such as ChatGPT are quite fluent in both R and Python, and can effectively translate between the two as well.)

Some links to get you started with R:

  1. R for Data Science

  2. Code School R tutorial

  3. R for beginners

  4. Stack Overflow

  5. Cross Validated

  6. ggplot2 homepage

  7. ggplot2 book (free using Stanford Libraries)

  8. Cookbook for R

  9. Quick-R

  10. R tutorial at Cyclismo