Data analysis examples using R

David Rogosa Sequoia 224, rag{AT}stat{DOT}stanford{DOT}edu

Course web page: http://web.stanford.edu/~rag/ed401/

For 2013 course materials go here

Data analysis examples using R. Ed401 Aut 2014 (1 unit)

Description: We will do basic and intermediate level statistical analysis examples (of the sort that students will have seen in their courses) in R. Examples include: descriptive statistics and plots, group comparisons, correlation and regression, categorical variables, multilevel data. See http://web.stanford.edu/~rag/ed401/

Course Schedule: Five (2hr) meetings, Th 3:15-5

Oct 2   1. Descriptive stats; analysis of means (up through anova)
Oct 9   2. Correlation and regression (up through multiple regression, variable selection etc)
Oct 16  3. Categorical vars (tables, logistic regression)
Oct 23  4. Multilevel data; descriptives, plots, and intro to mixed-effects models (e.g. Bryk HSB data)
Nov 6   5. Student analyses (students present a small analysis of their own)

1/7/09. NY Times endorses R: Data Analysts Captivated by R's Power

Current version of R is version 3.1.1 (Sock it to Me) released on 2014-07-10

For references and software: The R Project for Statistical Computing. Closest download mirror is Berkeley.

Many students employ RStudio to enhance their R-enjoyment. I won't use it, but it serves very well especially on a single screen (e.g. portable) machine. "RStudio IDE is a powerful and productive user interface for R. It's free and open source, and works great on Windows, Mac, and Linux." A short R-intro that includes RStudio

The greatest challenge here is not being overwhelmed by all the options.

0. Reference Cards and other short documents section of CRAN page

1. When I taught the introductory course Stat141, the text for computing was

An online version is available from John Verzani's page. Alternate version, single pdf. UsingR R-package.

2. In Stat209 a primary resource for R and data analysis is

3.

4. From CRAN central: An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics. Version 3.0.1 (2013-05-16). W. N. Venables, D. M. Smith and the R Core Team

manual for package

i. correlation and scatterplots platelet session platelet plots platelet data extra stat141 example Brain and Body Weights for 62 Species of Land Mammals

ii. Straight-line regression single subject Sleepstudy example R session plots and handout version
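The correlation and straight-line regression items above can be sketched in a few lines of R. This is a minimal illustration, not the course session itself: it assumes that the "Brain and Body Weights for 62 Species of Land Mammals" data are the `mammals` data frame shipped in the MASS package (columns `body` in kg and `brain` in g), and the choice to work on the log scale is mine.

```r
# Assumption: the MASS 'mammals' data frame is the 62-species
# brain/body weight data referred to above.
data(mammals, package = "MASS")

# Correlation on the raw scale and on the log scale
r_raw <- cor(mammals$body, mammals$brain)
r_log <- cor(log(mammals$body), log(mammals$brain))

# Straight-line regression of log brain weight on log body weight
fit <- lm(log(brain) ~ log(body), data = mammals)
print(coef(fit))

# Scatterplot with the fitted line (only drawn in an interactive session)
if (interactive()) {
  plot(log(brain) ~ log(body), data = mammals,
       xlab = "log body weight (kg)", ylab = "log brain weight (g)")
  abline(fit)
}
```

The raw-scale scatterplot is dominated by a few very large animals, which is why the log-log version is usually the more informative picture for these data.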

more Coleman. using

Mroz87 data; Mroz87 data description; IV data analysis session; Wooldridge stata ivreg

10/15 Background exposition for IV and returns to schooling: Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Joshua D. Angrist and Alan B. Krueger.
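To make the instrumental variables idea concrete, here is a small simulated sketch in base R. Everything in it is an assumption for illustration (the variable names, the data-generating model, the true effect of 0.5); it is not the Mroz87 analysis. It shows ordinary least squares being pulled away from the true effect by an unobserved confounder, and two-stage least squares done by hand recovering it.

```r
set.seed(401)
n <- 5000

z <- rnorm(n)                    # instrument: affects x, not y directly
u <- rnorm(n)                    # unobserved confounder (think "ability")
x <- 0.8 * z + u + rnorm(n)      # endogenous regressor (think "schooling")
y <- 1 + 0.5 * x + u + rnorm(n)  # outcome; true effect of x is 0.5

# Naive OLS: biased because x and the error share u
ols <- coef(lm(y ~ x))[["x"]]

# Two-stage least squares by hand
stage1 <- lm(x ~ z)              # first stage: predict x from the instrument
x_hat  <- fitted(stage1)
tsls   <- coef(lm(y ~ x_hat))[["x_hat"]]

# With a single instrument this equals the ratio of covariances
wald <- cov(z, y) / cov(z, x)

print(c(ols = ols, tsls = tsls, wald = wald))
```

The standard errors printed by the second-stage `lm` are not valid for the IV estimate; a dedicated IV routine should be used for inference.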

i. single variable. Stef van Buuren example

ii. traditional bivariate multivariate data methods, correlation and regression example

iii. Multiple Imputation.

nhanes data in package

Background materials, Multiple Imputation in R. van Buuren S and Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. See also multiple imputation online. Flexible Imputation of Missing Data. Stef van Buuren, Chapman and Hall/CRC 2012. Book contents online; book extras. He is the originator of
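The logic of multiple imputation (impute several times, analyze each completed dataset, pool the results) can be sketched in base R. This is a hand-rolled illustration on simulated data under assumptions of my own: one incomplete variable, values missing completely at random, and a bootstrap step standing in for a proper draw of the imputation-model parameters. In practice the mice package automates these steps with `mice()`, `with()` and `pool()`.

```r
set.seed(2014)
n <- 200
x <- rnorm(n)
y <- 2 + 1.5 * x + rnorm(n)           # true slope is 1.5

dat <- data.frame(x = x, y = y)
dat$y[runif(n) < 0.3] <- NA           # about 30% of y missing at random

miss <- is.na(dat$y)
cc   <- dat[!miss, ]                  # complete cases
m    <- 20                            # number of imputed datasets

est <- numeric(m)                     # slope estimate from each dataset
se2 <- numeric(m)                     # its squared standard error

for (k in seq_len(m)) {
  # Bootstrap the complete cases so the imputation model varies across draws
  boot  <- cc[sample(nrow(cc), replace = TRUE), ]
  fit_b <- lm(y ~ x, data = boot)

  # Impute: predicted value plus random residual noise
  filled <- dat
  filled$y[miss] <- predict(fit_b, newdata = dat[miss, ]) +
    rnorm(sum(miss), sd = summary(fit_b)$sigma)

  # Analyze the completed dataset
  fit_k  <- lm(y ~ x, data = filled)
  est[k] <- coef(fit_k)[["x"]]
  se2[k] <- vcov(fit_k)["x", "x"]
}

# Rubin's rules: pooled estimate and total variance
qbar      <- mean(est)                # pooled slope
ubar      <- mean(se2)                # within-imputation variance
b         <- var(est)                 # between-imputation variance
total_var <- ubar + (1 + 1 / m) * b

print(c(slope = qbar, se = sqrt(total_var)))
```

The between-imputation term is what single imputation leaves out: it is the extra uncertainty due to not knowing the missing values.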

So I gathered together some quick resources, especially for use within RStudio where use of

RStudio help. Using Sweave and knitr; also Using Sweave and LaTeX with R 3.0.2. RStudio support queries: 1 2

Some additional intro docs. San Diego State UW Montana Wharton, UPenn Germany Minnesota

Also the

Addendum on scripts. Introduction to the R Project for Statistical Computing for use at ITC, Appendix B; A (very) short introduction to R, scripts section; Kickstarting R - Writing R scripts
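As a minimal sketch of what working from a script means: commands are saved in a plain text file and run with `source()` rather than typed at the prompt. The file name and its contents here are made up for illustration; a temporary file is used so that the example leaves nothing behind.

```r
# Write a three-line script to a temporary file (hypothetical contents)
script_file <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(2, 4, 6, 8)",
  "x_mean <- mean(x)",
  "print(summary(x))"
), script_file)

# Run the whole script; echo = TRUE shows each command as it is executed
source(script_file, echo = TRUE)

# Objects created by the script are now in the workspace
print(x_mean)
```

From a terminal, the same file could be run non-interactively with `Rscript` followed by the file name.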

a. Aggregation and ecological correlations Robinson (1950) 2x2 table ex

b. High School and Beyond data. complete Bryk dataset Data construction from files in the MEMSS

c. First pass, Bryk data: session plots

d. Additional plots for Multilevel data. R session xyplots

e. Comparison of Public and Catholic Schools using

f. Ancova on school means (school level) HSB: analysis of covariance on group means school means dataset, HSB ancova
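The aggregation and school-means items in this list can be sketched together. Assumptions, stated plainly: this uses the `MathAchieve` and `MathAchSchool` data frames in the nlme package, which I take to be the same High School and Beyond subset (160 schools) as the Bryk data above, and the particular model (school mean math achievement on school mean SES plus sector) is an illustrative choice rather than the course analysis.

```r
library(nlme)  # ships with R; provides MathAchieve and MathAchSchool

# Student-level data as a plain data frame
hsb <- as.data.frame(MathAchieve)
hsb$School <- as.character(hsb$School)

# Aggregate to school means
school_means <- aggregate(cbind(MathAch, SES) ~ School, data = hsb, FUN = mean)

# Attach the school-level sector indicator (Public / Catholic)
sector <- data.frame(School = as.character(MathAchSchool$School),
                     Sector = MathAchSchool$Sector)
school_means <- merge(school_means, sector, by = "School")

# Individual-level versus aggregate (ecological) correlation
r_student <- cor(hsb$SES, hsb$MathAch)
r_school  <- cor(school_means$SES, school_means$MathAch)
print(c(student = r_student, school = r_school))

# Analysis of covariance on the school means:
# sector difference in mean achievement, adjusting for mean SES
ancova_fit <- lm(MathAch ~ SES + Sector, data = school_means)
print(summary(ancova_fit))
```

The gap between the two correlations is the point of the Robinson (1950) example: a relationship among group means need not describe the relationship among individuals.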

Background: Lecture slide, lme lmer for Bryk data Collection of HSB data analyses from various text sources A nice teaching document from Indiana that does HSB data with every known statistical package (including
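A first mixed-effects fit for the HSB data might look like the following. This is a sketch under assumptions: it uses `lme()` from nlme with the package's own `MathAchieve` data (taken here to be the Bryk HSB subset), and the two random-intercept models are chosen for illustration only. The lme4 analogue of the second model would be `lmer(MathAch ~ SES + (1 | School), data = ...)`.

```r
library(nlme)

# Unconditional (null) model: random intercept for each school
m0 <- lme(MathAch ~ 1, random = ~ 1 | School, data = MathAchieve)

# Variance components and the intraclass correlation
vc         <- VarCorr(m0)
var_school <- as.numeric(vc["(Intercept)", "Variance"])
var_resid  <- as.numeric(vc["Residual", "Variance"])
icc        <- var_school / (var_school + var_resid)
print(icc)

# Add student SES as a fixed effect, keeping the random school intercept
m1 <- lme(MathAch ~ SES, random = ~ 1 | School, data = MathAchieve)
print(fixef(m1))
```

The intraclass correlation is the share of variance in math achievement that lies between schools, which is the usual first number to look at before adding predictors.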