Syllabus

Instructor & TAs

Instructors

Jonathan Taylor

  • Office: Sequoia Hall #137
  • Phone: 723-9230
  • Email
  • Office hours: W 2:00-4:00
  • Zoom office hours will be held Wednesday 12:00-2:00 just before on-campus office hours.

Teaching Assistants & Office Hours

TA : Benjamin Seiler

  • Email
  • Office hours: T 1:00-3:00
  • Location: 380-381U

TA : Matteo Sesia

  • Email
  • Office hours: Th 8:30-10:30 AM
  • Location: Sequoia Hall 207

TA : Jun Yan

  • Email
  • Office hours: Th 1:00-3:00 PM
  • Location: Sequoia Hall 105

TA : Jingyi Kenneth Tay

  • Email
  • Office hours: W 12:00-2:00
  • Location: Sequoia Hall 207

Email list

The course has an email list that reaches all TAs as well as the professors: stats191-win1819-staff@lists.stanford.edu

As a general rule, you should send course-related email to this list.

Questions can also be posted on Gradescope.

Evaluation

For those taking 4 units:

  • 5 assignments (50%)
  • data analysis project (20%)
  • final exam (30%) (according to Stanford calendar: T 03/19 @ 8:30AM, Gates B1)

For those taking 3 units:

  • 5 assignments (70%)
  • final exam (30%) (according to Stanford calendar: T 03/19 @ 8:30AM, Gates B1)

Final exam

Following the Stanford calendar: Tuesday, March 19, 2019, 8:30-11:30 AM, Gates B1.

Schedule & Location

MWF 9:30-10:20, Gates B1

Textbook

Computing environment

We will use R for most calculations. Class notes are in the form of Jupyter notebooks.

Jupyter for R

In order to run the R notebooks below, you will need to install Jupyter (easily done through Anaconda) and enable the R kernel from within R:

# Install the IRkernel package from CRAN.
install.packages('IRkernel')
library(IRkernel)
# Register the R kernel with Jupyter.
IRkernel::installspec()

Prerequisites

An introductory statistics course, such as STATS 60.

Course description

By the end of the course, students should be able to:

  • Enter tabular data using R.
  • Plot data using R, to help in exploratory data analysis.
  • Formulate regression models for the data, while understanding some of the limitations and assumptions implicit in using these models.
  • Fit models using R and interpret the output.
  • Test for associations in a given model.
  • Use diagnostic plots and tests to assess the adequacy of a particular model.
  • Find confidence intervals for the effects of different explanatory variables in the model.
  • Use some basic model selection procedures, as found in R, to find a best model in a class of models.
  • Fit simple ANOVA models in R, treating them as special cases of multiple regression models.
  • Fit simple logistic and Poisson regression models.
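
As a rough illustration of several of these goals, here is a minimal sketch in R. It is not taken from the course notes: it assumes R's built-in cars dataset (stopping distance versus speed), and the long_stop variable is a hypothetical binary response invented only to show a logistic fit.

```r
# Built-in dataset: stopping distance (ft) versus speed (mph) for 50 cars.
data(cars)

# Exploratory plot of the response against the explanatory variable.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Fit a simple linear regression and print the coefficient table.
fit <- lm(dist ~ speed, data = cars)
print(summary(fit))

# 95% confidence intervals for the intercept and the slope.
ci <- confint(fit, level = 0.95)
print(ci)

# Test for an association: p-value of the t-test for the slope.
slope_p <- summary(fit)$coefficients["speed", "Pr(>|t|)"]
print(slope_p)

# Standard diagnostic plots (residuals vs fitted, normal Q-Q, and so on).
par(mfrow = c(2, 2))
plot(fit)

# Logistic regression as a generalized linear model.
# long_stop is a hypothetical response: 1 if stopping distance exceeds 40 ft.
cars$long_stop <- as.integer(cars$dist > 40)
logit_fit <- glm(long_stop ~ speed, family = binomial, data = cars)
print(coef(logit_fit))
```

The same lm and glm calls extend to multiple regression, ANOVA and Poisson models by changing the formula or the family argument.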

Project

The data analysis project description explains what is needed for your project.

Practice exam

You can find a practice exam, with solutions, here.

Topics

  1. Course introduction and review: HTML, Jupyter, R-Markdown
  2. Some tips on R: HTML, Jupyter, R-Markdown
  3. Simple linear regression: HTML, Jupyter, R-Markdown
  4. Diagnostics for simple linear regression: HTML, Jupyter, R-Markdown
  5. Multiple linear regression: HTML, Jupyter, R-Markdown
  6. Diagnostics for multiple linear regression: HTML, Jupyter, R-Markdown
  7. Interactions and qualitative variables: HTML, Jupyter, R-Markdown
  8. Analysis of variance: HTML, Jupyter, R-Markdown
  9. Transformations and Weighted Least Squares: HTML, Jupyter, R-Markdown
  10. Correlated errors: HTML, Jupyter, R-Markdown
  11. Bootstrapping regression: HTML, Jupyter, R-Markdown
  12. Selection: HTML, Jupyter, R-Markdown
  13. Penalized regression: HTML, Jupyter, R-Markdown
  14. Logistic: HTML, Jupyter, R-Markdown
  15. Poisson: HTML, Jupyter, R-Markdown
  16. Final Review: HTML, Jupyter, R-Markdown

Assignments