~~BIOS 205: Introduction to R (Win, Spr, 1 Unit)~~

## NO LONGER OFFERED!

Unfortunately, this course is no longer offered. You are welcome to use the course resources listed below. This website will remain up indefinitely.

## Overview

This mini-course is a three-week introduction to R, a widely used, open-source programming and data analysis environment. The course is designed for Biosciences graduate students (but open to others) who are interested in using R for data analysis and want some help getting started. The course is interactive, with each session a mixture of lecture and hands-on lab. Bring a laptop computer to each class. More instructions will be emailed prior to the first class.

## Prerequisites

This is an introduction to R. No prior experience with R is expected, and only limited prior computer experience is assumed. Those with significant statistics or software engineering experience are likely to find this course too elementary.

## How to sign up

Note that this course is frequently oversubscribed, so pay careful attention to the dates when registration opens. Priority is given to students, so there may be very limited space for postdocs. There is no room for auditors who are not registered by one of the means listed below. All at Stanford are welcome to use the course materials (videos and slides, see below) if they cannot take the class.

**Students:** Follow the normal procedure to register via Axess. If you
want to drop this class after the normal drop date, but before the
class starts, please follow the instructions on the Biosciences website.

**Postdocs:** Follow the instructions emailed to the postdocs by the
Biosciences Office each quarter, or see the Biosciences Mini-course webpage.
Please contact the instructor if this does not work. Note that I sort the
applications randomly, not first-come, first-served, so there is no rush to register.

**Staff, faculty, other:** There is unlikely to be room, but email the
instructor (address below). Sometimes slots open up right before the
class starts.

## Grading Policy

This is an interactive lecture/lab, with some between-session exercises. There is no graded homework and no final exam. To receive a passing grade, you need to attend each session and participate. If you expect to miss more than one class session, please do not enroll.

## Learning Goals

- Student will be able to use R/RStudio to enter and edit expressions and scripts.
- Student will be able to read tabular data (e.g., csv and Excel spreadsheets) from files.
- Student will be able to form subsets of, and to reshape, tabular data, and to make simple graphs from such data.
- Student will be able to find and install external R packages.
- Student will be able to reproduce part of a published paper and know about tools for reproducing computational research.

## Textbook

We will be using parts of R for Data Science by Hadley Wickham and Garrett Grolemund (O'Reilly Media, 2017); it is online and also available in hardcopy.

## Slides and Videos

### Spring 2017-18

**Slides:** Slides (html), Slides (Rpres)

**Videos:** Videos

### Winter 2017-18

**Slides:** Slides (html), Slides (Rpres)

**Videos:** Videos (unfortunately, there were some technical problems, so you might want last year's instead)

### Spring 2016-17

**Slides:** Slides (html), Slides (Rpres)

**Videos:** Videos

### Winter 2016-17

**Slides:** Slides (html), Slides (Rpres)

**Videos:** Videos

## Downloads and Files

**R:** download (download and install the binaries, not the source)

**RStudio Desktop:** download

**Files:** BIOS205 data files

**Flashcards** (optional, but perhaps of use): BIOS205 flashcards, Anki software

## Class schedule

Note that the Biosciences mini-course schedule is special and **extends
through finals period**, so please check your class schedule carefully.

### Spring Quarter, Academic Year 2017-18

**Location:** LKSC 101

**Dates:** May 29, 2018 – Jun 15, 2018

**Times:** Mon/Wed/Fri 9:30 am – 11:20 am, with one exception: there is no class Mon May 28, and the first session is **Tue** May 29

**Instructor:** Steven Bagley, MD, MS. (Stanford email: steven.bagley)

**TA:** Alejandro Schuler (Stanford email: aschuler)

## Syllabus and detailed outline with assigned readings

### Day 1 — Introduction, using RStudio, vectors, functions, data types

- Before:
  - Install R and RStudio
  - Future reference: R for Data Science (Ch. 20, Vectors)

- Introduction to the course
- Using RStudio
- Vectors
- Vector indexing
- Calling functions
- Data types and special values
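
The Day 1 topics can be sketched in a few lines of base R. This is a minimal illustration rather than the course's own material; the vectors `x` and `y` and their values are made up.

```r
# A numeric vector built with c() (hypothetical values)
x <- c(10, 20, 30, 40)

second <- x[2]        # indexing starts at 1, so this is 20
large  <- x[x > 15]   # logical indexing keeps elements where the test is TRUE
rest   <- x[-1]       # a negative index drops that position

avg <- mean(x)        # calling a function on a vector

# NA is the special value for missing data
y <- c(1, NA, 3)
mean(y)                          # NA, because one element is missing
avg_y <- mean(y, na.rm = TRUE)   # ignore missing values

# Basic data types
class(x)      # "numeric"
class("a")    # "character"
class(TRUE)   # "logical"
```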

### Day 2 — Data frames, filter, select, arrange, ggplot2

- Before:
  - R for Data Science (Ch 5 Data transformation)
  - Skim: RStudio IDE cheatsheet
  - Skim: ggplot2 documentation pages: Index. ggplot2 2.1.0
  - Future reference: For select, how to refer to columns
  - Future reference: Data visualization with ggplot2 cheatsheet
  - Future reference: How to make any plot in ggplot2?

- Using the RStudio script pane
- R Packages
  - `install.packages`
  - `library`
- Introduction to data frames
  - `filter`
  - `select`
  - `arrange`
- Introduction to graphics with ggplot2
  - scatterplots
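
As a rough sketch of the three data frame verbs listed above, assuming the dplyr package is installed. The `samples` data frame and its values are invented for illustration.

```r
# install.packages("dplyr")   # one-time installation, left commented out here
library(dplyr)

# Small made-up data frame (hypothetical values)
samples <- data.frame(
  id     = c("s1", "s2", "s3", "s4"),
  group  = c("control", "treated", "treated", "control"),
  weight = c(21.5, 19.8, 23.1, 20.4)
)

treated <- filter(samples, group == "treated")   # keep rows matching a condition
narrow  <- select(samples, id, weight)           # keep only the named columns
sorted  <- arrange(samples, desc(weight))        # sort rows, largest weight first
```

Each verb takes a data frame as its first argument and returns a new data frame, leaving `samples` unchanged.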

### Day 3 — Data frames, mutate, rename, distinct, reading from a file, groupby

- Before:
  - R for Data Science (Ch 5 Data transformation)
  - R for Data Science (Ch 11 Data import)
  - R for Data Science (Section 5.6 Grouped summaries)
  - Skim/reference: dplyr package function reference

- More data frame manipulation
  - `mutate`: Adding columns to data frame
  - Selecting a range of columns
  - `rename`
  - `distinct`
- Reading data frame from a csv file
  - `read_csv`
- Groupby
- Lots of examples
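
A short sketch of these verbs, again assuming dplyr is installed. The `measurements` data frame is made up. `read_csv` (from the readr package) is left out because it needs a file on disk.

```r
library(dplyr)

# Made-up measurements (hypothetical values)
measurements <- data.frame(
  subject   = c("a", "a", "b", "b"),
  height_cm = c(170, 170, 160, 160),
  weight_kg = c(70, 72, 55, 54)
)

# mutate adds a column computed from existing ones
with_bmi <- mutate(measurements, bmi = weight_kg / (height_cm / 100)^2)

# rename uses new_name = old_name
renamed <- rename(measurements, height = height_cm)

# distinct keeps one row per unique value of the named column
subjects <- distinct(measurements, subject)

# group_by plus summarise computes one summary row per group
means <- measurements %>%
  group_by(subject) %>%
  summarise(mean_weight = mean(weight_kg))
```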

### Day 4 — Chaining a sequence of operations, joining tabular data

- Before:
  - R for Data Science (Ch 18 Pipes)
  - R for Data Science (Ch 13 Relational data)
  - Two-table verbs (more about joins)

- Chaining
- Manipulating tabular data: joins
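
Chaining and joining can be combined in one small example. This is only a sketch, assuming dplyr is installed; the `patients` and `visits` tables are invented.

```r
library(dplyr)

# Two made-up tables that share an id column (hypothetical values)
patients <- data.frame(id = c(1, 2, 3), age = c(34, 51, 47))
visits   <- data.frame(id = c(1, 1, 3), glucose = c(5.4, 5.9, 6.2))

# The pipe passes the result on its left as the first argument on its right
result <- patients %>%
  left_join(visits, by = "id") %>%   # keep every patient, add matching visits
  filter(!is.na(glucose)) %>%        # drop patients with no visit
  arrange(desc(glucose))             # highest glucose first
```

`left_join` keeps patient 2 with a missing `glucose`; the `filter` step then removes that row.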

### Day 5 — Data frames, tidy, tall, long format

- Before:
- Rearranging/changing the shape of rectangular data
  - `gather`
  - `spread`
- Passing tidy data to ggplot
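
A minimal sketch of reshaping with `gather` and `spread`, assuming the tidyr package is installed. The `wide` table is made up. Newer tidyr releases mark these two functions as superseded by `pivot_longer` and `pivot_wider`, though they remain available.

```r
library(tidyr)

# Made-up wide table: one column per day (hypothetical values)
wide <- data.frame(
  subject = c("a", "b"),
  day1    = c(1.2, 0.8),
  day2    = c(1.5, 1.1)
)

# gather: wide to long, one row per subject/day combination
long <- gather(wide, key = "day", value = "score", day1, day2)

# spread: long back to wide, one column per value of day
back <- spread(long, key = day, value = score)
```

The long form, with one observation per row, is the shape ggplot2 generally expects.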

### Day 6 — More about ggplot2, the grammar of graphics, strings

- Before:
  - For reference: ggplot2 book (through Stanford Searchworks): ggplot2 - Springer

- ggplot2 examples
  - mapping variable to shape, size, color
  - facets
  - error bars
- The meaning of `+`
- The grammar of graphics
  - ggplot2 model: grammar of graphics (simplified version)
- String processing
  - `paste0`
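
To illustrate how `+` builds a plot from layers, here is a small sketch assuming ggplot2 is installed. It uses R's built-in `mtcars` data set, which is a choice made for this example and not necessarily the data used in class.

```r
library(ggplot2)

# Each + adds a component: data and mappings, then a geom, then facets
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +          # draw one point per car
  facet_wrap(~ gear)      # one panel per number of gears

# print(p) draws the plot in an interactive session

# paste0 joins strings with no separator and recycles over vectors
labels <- paste0("sample_", 1:3)
```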

### Day 7 — The NHANES Shape Lab, reproducible analysis

- Before:
  - Reproducible research: Shape lab

### Day 8 — Reproducible analysis, Bring your own data day

- Before:
  - R for Data Science (Ch 27 R Markdown)
  - Optional: Example of data analysis in R (part 1), Example of data analysis in R (part 2)

- Reproducible analysis: R Markdown. Example R Notebook
- Open review of topics
- Lab: Bring your own data

### Day 9 — Learning more about R, Bioconductor, Bring your own data day

- Ways to learn more about R
- Task Views
- Bioconductor
- Lab: Bring your own data

## Resources for learning more about R

### Introduction to R

- De Vries and Meys, "R for Dummies" [book]. If you can get past the annoying title, this is a very helpful introduction to R.
- Venables and Smith, An Introduction to R [online]. This is an overview of most of the features of R.

### Graphics

- Murrell, R Graphics [book]. Very detailed explanation of base graphics, grid, and lattice.
- Winston Chang, Cookbook for R [online] and R Graphics Cookbook [book]. Many, many examples of ggplot2 code and output. Good if you want to find a graph and adapt it to your purposes.

### Data representation

- Murrell, Introduction to Data Technologies [online]. How to represent information in a computer: HTML, CSS, spreadsheets, XML, SQL.

### Data science books

- Rafael A. Irizarry, Introduction to Data Science [online]
- Chester Ismay and Albert Y. Kim, ModernDive: An Introduction to Statistical and Data Sciences via R [online]

### Advanced examples

- I put together a collection of common tasks using functions from the various tidyverse packages. [online]

### Stanford resources

- Resources: Workshops, Consultations, Tutorials and more | Stanford Libraries (highly recommended)

## Students with documented disabilities

Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations. The OAE is located at 563 Salvatierra Walk (phone: 723-1066, URL: Office of Accessible Education | Student Affairs).