Stat 209-- Course Files, Readings, Examples


Week 1--Course Introduction; properties of regression models
In the news
1.   New Yorker. December 23, 2013. The Power of the Hoodie-Wearing C.E.O.    Publication: The Red Sneakers Effect: Inferring Status and Competence from Signals of Nonconformity Author(s): Silvia Bellezza, Francesca Gino, and Anat Keinan Source: Journal of Consumer Research
2. WSJ 12/25. Fake Knee Surgery as Good as Real Procedure, Study Finds. Publication: Arthroscopic Partial Meniscectomy versus Sham Surgery for a Degenerative Meniscal Tear. N Engl J Med 2013; 369:2515-2524 December 26, 2013

Lecture topics
Quick Tour of course logistics and course materials
Main Topic: Meaning of regression coefficients: simple and multiple regression (including logistic)
Technical facts and foibles:
a. adjusted variables and regression coefficients--values of coefficients depend crucially on what else is used in the regression fit; conditioning vs controlling
b. effects of errors in measurement on regression coefficients
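
A small illustration of point (a), offered as a sketch rather than as the course handout's own computation: the coefficient of x1 in a two-predictor fit equals the slope from regressing the x2-adjusted y on the x2-adjusted x1, and it can differ sharply from the slope when x1 is used alone. The data and all function names below are made up for illustration.

```python
# Adjusted-variable (partial regression) identity with hypothetical data.

def mean(v):
    return sum(v) / len(v)

def simple_slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after a simple regression on x."""
    b = simple_slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def two_predictor_slopes(x1, x2, y):
    """Coefficients of x1 and x2 when y is regressed on both (intercept included)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Hypothetical data: x1 and x2 are strongly correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.2, 15.0, 13.9]

b1_simple = simple_slope(x1, y)                       # x1 alone
b1_multiple, b2_multiple = two_predictor_slopes(x1, x2, y)

# Adjust both y and x1 for x2, then regress residuals on residuals.
b1_adjusted = simple_slope(residuals(x2, x1), residuals(x2, y))

print(b1_simple, b1_multiple, b1_adjusted)
```

With these made-up numbers the slope for x1 alone is about 1.72, while the coefficient of x1 with x2 also in the fit is about 0.71, the same value the adjusted-variable regression returns.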

Lecture materials
MT woes of regression coefficients slides
Handout. Coleman data: adjusted-variables multiple regression   data file, 20 schools      using pairs command    Adjusted variable plot
slide for regression recursion
Measurement error: Maindonald-Braun sec6.7 results and R-functions
Regression examples, publications:
a.  Do Breast-Fed Baby Boys Grow Into Better Students?   Publication: Breastfeeding Duration and Academic Achievement at 10 Years. Wendy H. Oddy, Jianghong Li, Andrew J. O. Whitehouse, Stephen R. Zubrick, Eva Malacova. Pediatrics; Vol 127, Numb 1, Jan 2011   
b.  Pediatrics 2006;117;1018-1027   Sexy Media Matter: Exposure to Sexual Content in Music, Movies, Television, and Magazines Predicts Black and White Adolescents' Sexual Behavior  UNC Teen Media Center (NICHD funded)


Week 1 Readings
Primary Readings
Background piece: Correlation and Causation: A Comment, Stephen Stigler Perspectives in Biology and Medicine, volume 48, number 1 supplement (winter 2005)
Freedman text Ch. 1 (esp. Yule on paupers, Snow on Cholera and Sec 1.5);(Ch 2-5 are advanced review of regression models)
   Chap 1 exs also in From Association to Causation: Some Remarks on the History of Statistics;  

Additional Resources
Mosteller-Tukey, Chap 13 (Woes of regression coefficients)
Berk, Chap. 6,7 (Using and Interpreting Multiple Regression)
MB 3rd ed Ch.6. esp 6.2.2 adjusted variables; 6.2 Interpreting regression coefficients; 6.7 errors in variables
Background info, errors in variables. Short primer on test reliability  (Wm Trochim, Cornell)  Informal exposition in Shoe Shopping and the Reliability Coefficient    extensive technical material in Chap 7 Revelle text
Errors of Measurement in Statistics, W. G. Cochran , Technometrics, Vol. 10, No. 4. (Nov., 1968), pp. 637-666. JStor URL esp sections 8,9,11
Some Effects of Errors of Measurement on Multiple Correlation, W. G. Cochran Journal of the American Statistical Association Vol. 65, No. 329 (Mar., 1970), pp. 22-34 JStor URL esp sec 8 discussion.
An overview of latent variables in Ch 1 of Generalized Latent Variable Modeling Multilevel, Longitudinal, and Structural Equation Models Anders Skrondal and Sophia Rabe-Hesketh Chapman and Hall/CRC 2004

Week 2-- Association vs Causation; Experiments vs observational studies; Neyman-Rubin-Holland formulation

Lecture topics
A. Standardized Regression Coefficients (aside "beta weights" in Kool-Aid Psychology Scientific American, Jan 2010)
    Third-variable Topics
B. Spurious Correlation: some historical notes; partial and part correlations.
C. Simpson's paradox wiki page  Kidney stone example
D. Mediating/moderating variables  David Kenny web page  UTexas version     data analysis example
    First pass: experiments vs observational studies
E. Surveys of results from experimental and observational studies (see HRT, Mosteller below)
F. Introduction to Neyman-Rubin-Holland formulation for causal effects.
       presentation of NRH formulation for comparative studies based on Appendix of Holland (1988). Class handout.
       Illustration using encouragement design representation in Holland (1988).    copies of selected overheads.
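
For topic C, a minimal sketch of the reversal in the kidney stone example. The counts are the ones usually quoted for this example (successes out of patients, by treatment and stone size); they are entered here as an assumption and should be checked against the linked page. Variable names are illustrative.

```python
# Simpson's paradox: treatment A does better within each stone-size stratum,
# yet treatment B does better when the strata are pooled.

# (successes, patients); counts as commonly tabulated for this example.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

def pooled_rate(treatment):
    """Success rate ignoring stone size."""
    s = sum(c[0] for c in counts[treatment].values())
    n = sum(c[1] for c in counts[treatment].values())
    return s / n

# Success rates for (A, B) within each stratum.
within = {
    size: (rate(*counts["A"][size]), rate(*counts["B"][size]))
    for size in ("small", "large")
}

a_wins_within_each_stratum = all(a > b for a, b in within.values())
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

for size, (a, b) in within.items():
    print(size, round(a, 3), round(b, 3))
print("pooled", round(pooled_rate("A"), 3), round(pooled_rate("B"), 3))
```

The reversal arises because treatment A was used mostly on large stones, which have lower success rates under either treatment.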

Primary Readings
1. A multi-decade example: Experiments vs Observational studies, Hormone Replacement Therapy
   D.B. Petitti and D.A. Freedman. Invited commentary: How far can epidemiologists get with statistical adjustment? American Journal of Epidemiology vol. 162 (2005) pp. 415-18.       Freedman handout page
2. Freedman text Ch. 1 (esp Snow on Cholera and Sec 1.5) value of modeling Chap 10; response schedules sec 6.4
or online from week 1   Freedman Chap 1 exs also in From Association to Causation: Some Remarks on the History of Statistics;  
or   more on response schedules (text sec 5.4) in Statistical Models for Causation: A critical review    
    and   Statistical Models and Shoe Leather, Sociological Methodology, Vol. 21. (1991), pp. 291-313. JStor link
3. Paul Holland, Causal Effects and Encouragement Designs. Causal Inference, Path Analysis, and Recursive Structural Equations Models Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484.
Holland Appendix (esp pp. 475-480) presents the potential outcomes formulation.
Abstract Rubin's model for causal inference in experiments and observational studies is enlarged to analyze the problem of "causes causing causes" and is compared to path analysis and recursive structural equations models. A special quasi-experimental design, the encouragement design, is used to give concreteness to the discussion by focusing on the simplest problem that involves both direct and indirect causation. It is shown that Rubin's model extends easily to this situation and specifies conditions under which the parameters of path analysis and recursive structural equations models have causal interpretations.
      shorter (modern) versions of ATE, ATT intros. Causal inference from Harvard (slides 1-12);    treatment effects from MIT (pages 1-4);     estimating average treatment effects from Michigan State.

Additional Resources
Spurious correlation?
R-Package ppcor October 29, 2012 Title Partial and Semi-partial (Part) correlation
Correlations Genuine and Spurious in Pearson and Yule, John Aldrich Statistical Science, Vol. 10, No. 4. (Nov., 1995), pp. 364-376.  Jstor link
Spurious Correlation: A Causal Interpretation. Herbert A. Simon Journal of the American Statistical Association, Vol. 49, No. 267. (Sep., 1954), pp. 467-479. Jstor link

Simpson's Paradox.
R-package Simpsons.   Frontiers in Psychology. 2013; 4: 513. Simpson's paradox in psychological science: a practical guide

Experiments vs Observational studies:
Mosteller-Tukey Ch. 13 (esp sec 13G)
Bringing Evidence-Driven Progress To Education:   Coalition for Evidence-Based Policy          
Overdoing a good thing? Evidence-based medicine.    Hazardous journey Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials Gordon C S Smith, professor, Jill P Pell, consultant BMJ 2003;327:1459-1461
Classic paper on Medical experimentation. Statistics and Ethics in Surgery and Anesthesia. John P. Gilbert; Bucknam McPeek; Frederick Mosteller Science, New Series, Vol. 198, No. 4318. (Nov. 18, 1977), pp. 684-689.     JSTOR link

mediating/moderating variables
R-implementations: Baron-Kenny method via Sobel function in the multilevel package.  More extensive implementation (incl BCa bootstrapping) function mediation in package MBESS Ken Kelley; power and sample size calculations in package powerMediation
NEW and improved  mediation package. Causal Mediation Analysis Using R   This package (and pubs) takes the topic to a much higher level of complexity and capability
additional technical papers. Causal Mediation Analysis Using R K. Imai, L. Keele, D. Tingley, and T. Yamamoto    American Political Science Review Vol. 105, No. 4 November 2011 Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies
Mediation Analysis David P. MacKinnon, Amanda J. Fairchild, and Matthew S. Fritz Department of Psychology, Arizona State University, Tempe, Arizona 85287-1104; Annu. Rev. Psychol. 2007. 58:593–614
Mediators and Moderators of Treatment Effects in Randomized Clinical Trials Helena Chmura Kraemer; G. Terence Wilson; Christopher G. Fairburn; W. Stewart Agras Arch Gen Psychiatry. 2002;59:877-883
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M.,West, S. G., Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.

Neyman-Rubin-Holland models for comparative experiments (causal inference)
Rosenbaum Ch 2 (esp 2.5)
Statistics and Causal Inference, Paul W. Holland pp. 945-960 JASA 1986, another JSTOR link
Commentaries Donald Rubin, David Cox
Rubin, D. B., 1974, Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies, Journal of Educational Psychology, 66, 688-701.
Direct and Indirect Causal Effects via Potential Outcomes Donald B. Rubin Scandinavian Journal of Statistics Volume 31, Issue 2, Page 161-170, Jun 2004 .
Winship's repository Counterfactual Causal Analysis in Sociology
Useful intro lecture notes from Jonathan Wand, Political Science

Counterfactuals Stanford Encyclopedia of Philosophy Counterfactual Theories of Causation   wiki page     long Nancy Cartwright


Week 3-- Path analysis and causal modeling  multiple regression with pictures

Older news item, path analysis example: Publication: Longitudinal Effects of Violent Video Games on Aggression in Japan and the United States Craig A. Anderson, Akira Sakamoto, Douglas A. Gentile, Nobuko Ihori, Akiko Shibuya, Shintaro Yukawa, Mayumi Naito, and Kumiko Kobayashi Pediatrics 2008; 122: e1067-e1072. [lots of contradictory studies also]

Lecture topics
1. Path analysis introduction and examples (incl Blau-Duncan from Freedman chap 5).   class handout
        [time permitting a little on Structural equation models: introduction and examples.   old class handout]
2. Three-strikes against causal models. Does path analysis identify causal effects? Demonstrations of failure for Holland's encouragement design, Rogosa longitudinal examples (Goldstein, simplex).        class handout      Encouragement design slides
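
As a purely arithmetic illustration of topic 1, the sketch below computes standardized path coefficients for a three-variable recursive diagram (X -> M, X -> Y, M -> Y) from correlations and checks the usual direct-plus-indirect decomposition. The correlations and names are hypothetical, and the arithmetic does not bear on whether the coefficients have a causal interpretation, which is the question in topic 2.

```python
# Standardized path coefficients for X -> M, X -> Y, M -> Y from correlations.

def path_coefficients(r_xm, r_xy, r_my):
    """Return (p_mx, p_yx, p_ym): standardized regression coefficients
    of M on X, and of Y on X and M jointly."""
    p_mx = r_xm
    denom = 1.0 - r_xm ** 2
    p_yx = (r_xy - r_xm * r_my) / denom   # direct path X -> Y
    p_ym = (r_my - r_xm * r_xy) / denom   # path M -> Y
    return p_mx, p_yx, p_ym

# Hypothetical correlations.
r_xm, r_xy, r_my = 0.5, 0.4, 0.6

p_mx, p_yx, p_ym = path_coefficients(r_xm, r_xy, r_my)
direct = p_yx
indirect = p_mx * p_ym

# The correlation of X and Y is reproduced as direct + indirect.
print(direct, indirect, direct + indirect)
```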

Week 3 Readings
Primary Readings
Freedman text Chap 5 (Chap 6 in revised ver). (Freedman Ch.4 has technical background on regression)
   response schedules, path analysis examples and potential outcomes in Statistical Models for Causation: A critical review    
Paul Holland: Encouragement design results; sections 3-5 Causal Inference, Path Analysis, and Recursive Structural Equations Models Paul W. Holland Sociological Methodology, Vol. 18. (1988), pp. 449-484.
David Rogosa. Casual Models Do Not Support Scientific Conclusions: A Comment in Support of Freedman.
Journal of Educational Statistics, Vol. 12, No. 2. (Summer, 1987), pp. 185-195. Jstor link
      Theme Song Ballad of the casual modeler    http://www.stanford.edu/class/ed260/ballad.mp3

Additional Resources
MB 13.1. Composite scores from multiple indicators (incl principal components). online version, a bit in sec6.1
Path Analysis, special issue: Journal of Educational Statistics Publication Info Vol. 12, No. 2, Summer, 1987 Issue Stable URL: http://www.jstor.org/stable/i249045  As Others See Us: A Case Study in Path Analysis(pp. 101-128) D. A. Freedman Stable URL: http://www.jstor.org/stable/1164888
Useful classnotes: G. David Garson Path Analysis: Statnotes, from North Carolina State University, Public Administration Program   Notre Dame
Technical details on Rogosa longitudinal examples:
     Rogosa, D. R. (1993). Individual unit models versus structural equations: Growth curve examples.
     In Statistical modeling and latent variables, K. Haagen, D. Bartholomew, and M. Deistler, Eds. Amsterdam: Elsevier North Holland, 259-281.
     Rogosa, D. R., & Willett, J. B. (1985). Satisfying a simplex structure is simpler than it should be.
     Journal of Educational Statistics, 10, 99-107. Jstor link
     original publication on the longitudinal path analysis:   Some Models for Analysing Longitudinal Data on Educational Attainment. Harvey Goldstein
      Journal of the Royal Statistical Society. Series A (General), Vol. 142, No. 4. (1979), pp. 407-442.  Jstor link

Path analysis intros
Path Analysis: Sociological Examples. Otis Dudley Duncan The American Journal of Sociology, Vol. 72, No. 1. (Jul., 1966), pp. 1-16. Jstor link
D.A. Freedman, Comments on Standardizing Path Diagrams: What Are the Parameters?
A recent reconsideration by a wise psychologist: The Path Analysis Controversy: A new statistical approach to strong appraisal of verisimilitude Meehl, Paul E; Waller, Niels G Psychological Methods. Vol 7(3), Sep 2002, pp. 283-300.  available from SU APA pubs

Structural equation modeling is a major industry in social and behavioral science with many texts (such as Principles and Practice of Structural Equation Modeling 2nd Edition Rex B. Kline; here's a long list), specialized courses (UC Irvine MGMT 290   NC State PA 765), dedicated journals (Structural Equation Modeling: A Multidisciplinary Journal), and specialized computer programs (e.g., LISREL, EQS, AMOS).
Maximum likelihood factor analysis: A General Method for Analysis of Covariance Structures, K. G. Joreskog, Biometrika, Vol. 57, No. 2. (Aug., 1970), pp. 239-251.
Structural equation modeling from Scientific Software International
home of Structural Equation Modeling (LISREL): student editions, documentation, examples, etc.
R resources (see also social science and psychometrics task views) sem Structural Equation Models package in R,   sem manual    OpenMx - Advanced Structural Equation Modeling
Two good structural equation model reviews:
Structural Equation Models William T. Bielby; Robert M. Hauser Annual Review of Sociology, Vol. 3. (1977), pp. 137-161. JStor link
Breckler, S. J. (1990). Applications of Covariance Structure Modeling in Psychology: Cause for Concern? Psychological Bulletin, 107, 260-273. here's a link that may be permanent



Week 4-- Multilevel data: Contextual effects, aggregation bias, Mixed-effects (lmer) models


In the news
Texting While Walking a Dangerous Combo: Pedestrians who did both had poorer balance, were less able to walk in straight line, researchers say    Texting Makes You Walk Like a Clumsy Robot     Publication: Texting and Walking: Strategies for Postural Control and Implications for Safety Published: January 22, 2014 DOI: 10.1371/journal.pone.008431

Lecture topics
1. Background: nested data, ecological fallacy, aggregation bias, levels of analysis.
2. Traditional approaches to multilevel analysis: contextual effects, school effects.
3. Advanced multilevel analyses: mixed effects models, linear and non-linear (via lme4).
        Collection of HSB data analyses from various text sources     Teaching document from Indiana, HSB from every statistical package
       HSB analyses in R, class and Lab2.         complete Bryk dataset     first pass, Bryk data:   session    plots         Lecture slide, lme lmer for Bryk data   side-by-side boxplots, SFYS analysis     
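
A toy illustration of topic 1 (aggregation bias and the ecological fallacy), using made-up numbers rather than the HSB data: the slope computed from group means can have the opposite sign from the slope within every group. School names and values are hypothetical.

```python
# Within-group, between-group (group means), and pooled slopes can disagree.

def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical (x, y) values for students nested in three schools.
schools = {
    "school_1": ([1, 2, 3], [5, 4, 3]),
    "school_2": ([4, 5, 6], [8, 7, 6]),
    "school_3": ([7, 8, 9], [11, 10, 9]),
}

# Slope inside each school.
within_slopes = {name: slope(x, y) for name, (x, y) in schools.items()}

# Slope across the three school means (the aggregate-level analysis).
mean_x = [sum(x) / len(x) for x, _ in schools.values()]
mean_y = [sum(y) / len(y) for _, y in schools.values()]
between_slope = slope(mean_x, mean_y)

# Slope ignoring school membership altogether.
all_x = [v for x, _ in schools.values() for v in x]
all_y = [v for _, y in schools.values() for v in y]
pooled_slope = slope(all_x, all_y)

print(within_slopes, between_slope, pooled_slope)
```

Here every within-school slope is -1, the slope across school means is +1, and the pooled slope (0.8) is a blend of the two, so an individual-level conclusion drawn from the aggregate relation would have the wrong sign.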
Week 4 Readings
Primary Readings
        Aggregation bias, Ecological fallacy.
D.A. Freedman. "Ecological inference and the ecological fallacy." International Encyclopedia for the Social and Behavioral Sciences. Elsevier (2001) vol. 6 pp. 4027-30. N. J. Smelser and Paul B. Baltes, eds. A one-page version: D.A. Freedman. "The ecological fallacy." In the Encyclopedia of Social Science Research Methods. Sage Publications (2004) Vol. 1 p. 293. M. Lewis-Beck, A. Bryman, and T. F. Liao, eds
        Current statistical analyses in social science: multilevel models.  Also Lab2
Maindonald-Braun Chap 10, esp 10.2, 10.5, 10.7-9   online version Ch.10
Berk 10.3
History of multilevel models from Scientific Software International, Inc
Using SAS PROC MIXED:    Judith Singer HLM/PROC Mixed papers: Multilevel Modelling Newsletter (HSB example) ; or
   JEBS1998 Using SAS PROC MIXED to Fit Multilevel Models, Jstor
Using R, lme, nlme.    John Fox lme tutorial   Fitting linear mixed models in R Using the lme4 package Douglas Bates (pp.27-30)

Additional Resources
Aggregation bias, Ecological fallacy.
D.A. Freedman. "The ecological fallacy." In the Encyclopedia of Social Science Research Methods. Sage Publications (2004) Vol. 1 p. 293. M. Lewis-Beck, A. Bryman, and T. F. Liao, eds
A Rule for Inferring Individual-Level Relationships from Aggregate Data, Glenn Firebaugh American Sociological Review Vol. 43, No. 4 (Aug., 1978), pp. 557-572   JStor URL
The original: Ecological Correlations and the Behavior of Individuals W. S. Robinson American Sociological Review Vol. 15, No. 3, Jun., 1950 . One of many followups: Some Alternatives to Ecological Correlation Leo A. Goodman American Journal of Sociology Vol. 64, No. 6, May, 1959
A good sociological/medical overview. Ecological effects in multi-level studies. Blakely TA, Woodward AJ. J Epidemiol Community Health. 2000 May;54(5):367-74.  pubmed   full text
American Journal of Epidemiology Vol. 139, No. 8: 747-760 Invited Commentary: Ecologic Studies -- Biases, Misconceptions, and Counterexamples S Greenland, J Robins
The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology J. Michael Oakes Social Science and Medicine 58 (2004) 1929–1952
R-package eiPack: R x C Ecological Inference and Higher-Dimension Data Management.  R News Oct 2007  
Educational multilevel data.
The Analysis of Multilevel Data in Educational Research and Evaluation Leigh Burstein Review of Research in Education, Vol. 8. (1980), pp. 158-233. Jstor link
Methodological Advances in Analyzing the Effects of Schools and Classrooms on Student Learning, Stephen W. Raudenbush; Anthony S. Bryk Review of Research in Education, Vol. 15. (1988 - 1989), pp. 423-475. Jstor link
Analyzing Multilevel Data in the Presence of Heterogeneous within-Class Regressions Leigh Burstein; Robert L. Linn; Frank J. Capell
Journal of Educational Statistics, Vol. 3, No. 4. (Winter, 1978), pp. 347-383. Jstor link

examples from analyses of voting data.
Bias in ecological regression   Stephen Ansolabehere and Douglas Rivers
David A. Freedman et al., "Ecological Regression and Voting Rights," Evaluation Review 1991, pp. 673-711, Berkeley Law posting
Klein, S. P. and Freedman, D. A. (1993), "Ecological regression in voting rights cases," Chance, 6, 38–43.
D.A. Freedman, S.P. Klein, M. Ostland, and M.R. Roberts. "Review of 'A Solution to the Ecological Inference Problem.' " Journal of the American Statistical Association, vol. 93 (1998) pp. 1518-22; with discussion, vol. 94 (1999) pp. 352–57.

Current statistical analyses in social science: multilevel models.
Using SAS PROC mixed:   
Fitting Nonlinear Mixed Models with the New NLMIXED Procedure, Russell D. Wolfinger, SAS Institute Inc., Cary, NC
Judith Singer HLM/PROC Mixed papers: Multilevel Modelling Newsletter ; JEBS1998 Using SAS PROC MIXED to Fit Multilevel Models, Jstor
HLM - Hierarchical Linear and Nonlinear Modeling (HLM): descriptions and student edition HLM6
Freedman, D. A. (census adjustments). Hierarchical Linear Regression
Using R: lme4 (lmer and nlmer) and mlmRev.    John Fox lme tutorial
Doug Bates draft book (Feb 2010)     Doug Bates SASmixed package   
Fitting linear mixed models in R Using the lme4 package Douglas Bates (pp.27-30)
London exam data example in Examples from Multilevel Software Comparative Reviews Douglas Bates
Regression diagnostics for lmer models. Package influence.ME   project resource page
mlmRev data examples. Also, Tennessee's Student Teacher Achievement Ratio (STAR) from Creating an R data set from STAR Douglas Bates
STATA does it also
HLM6 student edition   HLM setup for HSB example
lmer for SAS PROC MIXED Users Douglas Bates Department of Statistics University of Wisconsin Madison


Week 5.--The many uses and forms of analysis of covariance (including regression discontinuity designs)

In the news
Short man syndrome really does exist, Oxford University finds

Lecture topics
1. Review: formulation and purposes of analysis of covariance (including role in multilevel analysis)
   basic ancova exposition slides     reconsider week 4 HSB example HSB ancova handout      data for HSB ancova
2. Analyzing treatment effects as a function of covariate(s): CNRL, including Johnson-Neyman technique   cnrl data   cnrlanalysis
3. Uses of ancova with haphazard and with systematic assignment. Failures of ancova regression adjustments in observational studies.
4. Non-random assignment on the basis of the covariate, such as regression discontinuity designs.   Example from rdd manual
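
A minimal sketch connecting topics 2 and 4: fit a separate line in each group, express the group difference as a function of the covariate, and, for a sharp regression discontinuity, read off that difference at the cutoff. The data are synthetic and noise-free (assignment to treatment when the pretest is below a cutoff of 50, with a built-in jump of 6), and the names are illustrative; this is not the example from the rdd manual.

```python
# Sharp regression discontinuity: difference between two fitted lines at the cutoff.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope

cutoff = 50.0
pretest = [30, 35, 40, 45, 48, 52, 55, 60, 65, 70]
treated = [p < cutoff for p in pretest]          # assignment by covariate
# Synthetic outcome: common slope 0.8, treatment adds 6 points.
posttest = [10 + 0.8 * p + (6.0 if t else 0.0) for p, t in zip(pretest, treated)]

x_t = [p for p, t in zip(pretest, treated) if t]
y_t = [q for q, t in zip(posttest, treated) if t]
x_c = [p for p, t in zip(pretest, treated) if not t]
y_c = [q for q, t in zip(posttest, treated) if not t]

a_t, b_t = fit_line(x_t, y_t)
a_c, b_c = fit_line(x_c, y_c)

def effect_at(x):
    """Treated-minus-control difference between the fitted lines at covariate x."""
    return (a_t - a_c) + (b_t - b_c) * x

jump_at_cutoff = effect_at(cutoff)
print(jump_at_cutoff)
```

When the fitted slopes differ, effect_at(x) varies with x, which is the nonparallel-lines situation of topic 2; the RD estimate is its value at the cutoff only.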


Week 5 Readings
Primary Readings
Ancova and extensions   Freedman section 5.6   (sec 6.6 p.103 in revised edition).    MB sec 7.3 ("fitting multiple lines"); online version, sec 5.7
Rogosa, D. R. (1980). Comparing nonparallel regression lines.   Psychological Bulletin, 88, 307-321. [a better quality scan from the APA site]
Regression Discontinuity Designs  Useful primers by Wm Trochim:  The regression-discontinuity design   regression-discontinuity analysis
Rubin, D. B., (1977), "Assignment to a Treatment Group on the Basis of a Covariate", Journal of Educational Statistics, 2, 1-26.   Jstor link

Additional Resources
      analysis of covariance: Background/historical papers:
Weisberg, H. I. Statistical adjustments and uncontrolled studies. Psychological Bulletin, 1979, 86, 1149-1164.
Covariance Adjustment in Randomized Experiments and Observational Studies Paul R. Rosenbaum Statistical Science, Vol. 17, No. 3. (Aug., 2002), pp. 286-304.   Jstor
Some Aspects of Analysis of Covariance, A Biometrics Invited Paper with Discussion. D. R. Cox; P. McCullagh Biometrics, Vol. 38, No. 3, (Sep., 1982), pp. 541-561.   Jstor
Analysis of Covariance: Its Nature and Uses William G. Cochran Biometrics, Vol. 13, No. 3, Special Issue on the Analysis of Covariance. (Sep., 1957), pp. 261-281. Jstor
The Use of Covariance in Observational Studies W. G. Cochran Applied Statistics, Vol. 18, No. 3. (1969), pp. 270-275. Jstor
Estimation of the Slope and Analysis of Covariance when the Concomitant Variable is Measured with Error James S. Degracie; Wayne A. Fuller Journal of the American Statistical Association, Vol. 67, No. 340. (Dec., 1972), pp. 930-937. Jstor
Deep background Neter-Wasserman text (Applied linear statistical models. Neter, Kutner, Nachtsheim and Wasserman 1996. Fifth edition. Homewood IL: Irwin, Inc.) chapters 22 and 8.

     Johnson-Neyman technique and aptitude-treatment interaction (ATI)
There is an R-project jnt that's never really gotten started.
Regions of Significant Criterion Differences in Aptitude-Treatment-Interaction Research Leonard S. Cahen; Robert L. Linn American Educational Research Journal, Vol. 8, No. 3. (May, 1971), pp. 521-530. Jstor
Identifying Regions of Significance in Aptitude-by-Treatment-Interaction Research Ronald C. Serlin; Joel R. Levin American Educational Research Journal, Vol. 17, No. 3. (Autumn, 1980), pp. 389-399. Jstor
Defining Johnson-Neyman Regions of Significance in the Three-Covariate ANCOVA Using Mathematica Steve Hunka; Jacqueline Leighton Journal of Educational and Behavioral Statistics, Vol. 22, No. 4. (Winter, 1997), pp. 361-387.  Jstor
discussion of substantive issues: Trait-Treatment Interaction and Learning David C. Berliner; Leonard S. Cahen Review of Research in Education, Vol. 1. (1973), pp. 58-94. Jstor

       Regression Discontinuity Designs
R-package--rdd;   Regression Discontinuity Estimation Author Drew Dimmery
Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications      Regression discontinuity designs: A guide to practice, Guido W. Imbens, Thomas Lemieux
    Also from Journal of Econometrics (special issue) Volume 142, Issue 2, February 2008, The regression discontinuity design: Theory and applications  Waiting for Life to Arrive: A history of the regression-discontinuity design in Psychology, Statistics and Economics, Thomas D Cook
the original paper: Thistlethwaite, D., and D. Campbell (1960): "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment," Journal of Educational Psychology, 51, 309-317.
Trochim W.M. & Cappelleri J.C. (1992). "Cutoff assignment strategies for enhancing randomized clinical trials." Controlled Clinical Trials, 13, 190-212.  pubmed link
Capitalizing on Nonrandom Assignment to Treatments: A Regression-Discontinuity Evaluation of a Crime-Control Program Richard A. Berk; David Rauma Journal of the American Statistical Association, Vol. 78, No. 381. (Mar., 1983), pp. 21-27. Jstor
Berk, R.A. & de Leeuw, J. (1999). "An evaluation of California's inmate classification system using a generalized regression discontinuity design." Journal of the American Statistical Association, 94(448), 1045-1052.  Jstor
lecture notes  London School of Economics
Econometric treatments using Neyman-Rubin causal formulation.  
Another look at the Regression Discontinuity Design
Eligible Non-Participant And Ineligible Individuals As A Double Control Group In Regression Discontinuity Designs, Erich Battistin, Enrico Rettore, Proceedings of Statistics Canada Symposium 2002  more econometrics


Week 6.-- Instrumental variable methods, simultaneous equations
In the news
Valentine's Day Research: Sweet tooth linked to heart attacks, BBC, JAMA

note: Week 6 will start 2/13; 2/11 lecture will complete Week 5 materials.

Lecture topics
1. Intro IV (Disattenuation, omitted variables, "selection effects") and other IV applications for broken regression models    omitted vars example
2. Simultaneous equations (2SLS, IV in butter, peer aspirations, ed and fertility, Freedman), nonrecursive models   DHPanalysishandout
3. Reciprocal effects and non-recursive models in longitudinal data.   Empirical research on reciprocal effects (e.g. TV and ADHD), including cross-lagged correlation. clc slides
4. Random assignment as an Instrumental Variable (AIR paper)
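
To make topic 1 concrete, here is a small simulation sketch of the omitted-variable case, under assumptions chosen for illustration only: an unobserved u affects both x and y, and an instrument z affects y only through x. The seed, sample size, and coefficients are arbitrary. With a single instrument, the ratio cov(z, y) / cov(z, x) is the IV estimate and coincides with two-stage least squares.

```python
# Omitted-variable bias in OLS versus the single-instrument IV estimate.
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

random.seed(209)          # arbitrary seed for reproducibility
n = 20000
true_effect = 2.0         # assumed effect of x on y in this simulation

z = [random.gauss(0, 1) for _ in range(n)]   # instrument
u = [random.gauss(0, 1) for _ in range(n)]   # omitted variable
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 3.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

ols = cov(x, y) / cov(x, x)   # biased: x is correlated with the omitted u
iv = cov(z, y) / cov(z, x)    # uses only the variation in x induced by z

print(round(ols, 3), round(iv, 3))
```

In this setup the OLS slope converges to 3 rather than 2, because cov(x, u) / var(x) = 1/3 and u enters y with coefficient 3, while the IV ratio converges to the assumed effect of 2.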

Week 6 Readings
Primary Readings
Freedman, text Chap 8 (Chap 9 revised ver)
Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments. Joshua D. Angrist; Alan B. Krueger,
The Journal of Economic Perspectives Vol. 15, No. 4 (Autumn, 2001), pp. 69-85
Technical reference. Joshua D. Angrist; Guido W. Imbens; Donald B. Rubin "Identification of Causal Effects Using Instrumental Variables"
Journal of the American Statistical Association, Vol. 91, No. 434. (Jun., 1996), pp. 444-455. JStor note: compliance discussion for week 7

Additional resources
Rindfuss example (Freedman Chap 8; paper reprinted in Freedman text). Education and Fertility: Implications for the Roles Women Occupy Ronald R. Rindfuss; Larry Bumpass; Craig St. John American Sociological Review, Vol. 45, No. 3. (Jun., 1980), pp. 431-447.   from Jstor
Instrumental variables, Epidemiology exposition:   An introduction to instrumental variables for epidemiologists, Sander Greenland, International Journal of Epidemiology 2000;29:722-729 note: compliance discussion for week 7
For Lab 3. Two-stage Least Squares in R (tsls in sem package) by John Fox. Alternatives: AER (Applied Econometrics with R), systemfit
Structural Equation Models Appendix to An R and S-PLUS Companion to Applied Regression John Fox January 2002 Structural Equation Modeling With the sem Package in R John Fox STRUCTURAL EQUATION MODELING,13(3),465–486     John Fox home page
Peer Influences on Aspirations: A Reinterpretation Otis Dudley Duncan, Archibald O. Haller, Alejandro Portes American Journal of Sociology, Vol. 74, No. 2 (Sep., 1968), pp. 119-137   Jstor
Fox, J. (1979) Simultaneous equation models and two-stage least-squares. In Schuessler, K. F. (ed.) Sociological Methodology 1979, Jossey-Bass. Jstor
R-package psych has front-end for sem   Also, brand new is lavaan: an R package for structural equation modeling and more
Application of instrumental variables:
Course case study Does Television Cause Autism? and should instrumental variables (IV) provide the answer? Is Rain the magic IV?
A cautionary comment, including remarks by Nobel laureate Jim Heckman
Economists' Full paper: Does Television Cause Autism?
Now it's rainfall.    Autism Prevalence and Precipitation Rates in California, Oregon, and Washington Counties Michael Waldman; Sean Nicholson; Nodir Adilov; John Williams Arch Pediatr Adolesc Med. 2008;162(11):1026-1034.
Other applications.   The Effect of File Sharing on Record Sales An Empirical Analysis      Effect of job training programs

Reciprocal effects: Rogosa, D. R. (1980). A critique of cross-lagged correlation. Psychological Bulletin, 88, 245-258. APA site version
Granger Causality. Nobel 2003. Complete Granger
Relationships--and the Lack Thereof--Between Economic Time Series, with Special Reference to Money and Interest Rates. David A. Pierce Journal of the American Statistical Association, Vol. 72, No. 357. (Mar., 1977), pp. 11-26. Jstor


Week 7.-- Compliance and experimental protocols; encouragement designs; intent to treat

note: Week 7 will start (and finish) 2/20; 2/18 lecture will complete Week 6 materials.
In the news
New study adds to evidence that mammograms do not save lives

Lecture topics
1. Compliance background: Intent-to-treat analyses
2. Compliance and Dose-response data analysis (Efron-Feldman)
3. Rubin-Holland approach via Booil Jo presentation: Potential Outcomes Approach: A Brief Introduction
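
A short sketch relating topics 1 and 3: the intent-to-treat (ITT) effect compares groups as randomized, and under the usual instrumental-variable assumptions (randomized assignment, exclusion restriction, no defiers) the complier average causal effect (CACE) is the ITT effect divided by the difference in treatment uptake between arms. The function name and the trial summaries are hypothetical.

```python
# ITT effect and the ratio (Wald) estimate of the complier average causal effect.

def itt_and_cace(mean_y_assigned_t, mean_y_assigned_c,
                 uptake_assigned_t, uptake_assigned_c):
    """Return (ITT, CACE) from arm-level summaries.

    mean_y_*  : mean outcome in each randomized arm (as assigned)
    uptake_*  : proportion in each arm that actually received treatment
    """
    itt = mean_y_assigned_t - mean_y_assigned_c
    compliance_gap = uptake_assigned_t - uptake_assigned_c
    if compliance_gap == 0:
        raise ValueError("assignment did not change uptake; CACE is undefined")
    return itt, itt / compliance_gap

# Hypothetical trial: 60% of those assigned to treatment take it,
# and no one assigned to control has access to it.
itt, cace = itt_and_cace(12.0, 9.0, 0.6, 0.0)
print(itt, cace)
```

With these made-up summaries the ITT effect is 3 and the CACE estimate is 5: the as-randomized comparison is diluted by the 40% of the treatment arm who did not take the treatment.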

Week 7 Readings
Primary Readings
Compliance Background: Intent-to-Treat (ITT), the FDA mandate.    simple definitions: wiki    Encyclopedia of epidemiology, Volume 1  
Potential outcomes formulation (CACE): Causal Effects in Clinical and Epidemiological Studies Via Potential Outcomes: Concepts and Analytical Approaches Roderick J. Little and Donald B. Rubin Annual Review of Public Health, Vol. 21: 121-145
Intent-to-treat Analysis of Randomized Clinical Trials Michael P. LaValley Boston University ACR/ARHP Annual Scientific Meeting Orlando 10/27/2003
Epidemiology exposition:   An introduction to instrumental variables for epidemiologists, Sander Greenland, International Journal of Epidemiology 2000;29:722-729
Intention to treat--who should use ITT? J. A. Lewis and D. Machin Br J Cancer. 1993 October; 68(4): 647-650.   

Additional resources
David Freedman on Compliance Adjustments:
Statistical Models for Causation: What Inferential Leverage Do They Provide?  Evaluation Review 2006; 30: 691-713.
On regression adjustments to experimental data  Advances in Applied Mathematics vol. 40 (2008) pp. 180-93.

Intent-to-treat Analysis of Randomized Clinical Trials Michael P. LaValley Boston University ACR/ARHP Annual Scientific Meeting Orlando 10/27/2003
Compliance as an Explanatory Variable in Clinical Trials. B. Efron; D. Feldman Journal of the American Statistical Association, Vol. 86, No. 413. (Mar., 1991), pp. 9-17. Jstor
What is meant by intention to treat analysis? Survey of published randomised controlled trials Sally Hollis and Fiona Campbell British Medical Journal 1999;319;670-674
Booil Jo, Dept of Psychiatry   Estimation of Intervention Effects with Noncompliance Journal of Educational and Behavioral Statistics
   Compliance Publications based on Neyman-Rubin causal models:
Direct and Indirect Causal Effects via Potential Outcomes Donald B. Rubin Scandinavian Journal of Statistics Volume 31, Issue 2, Page 161-170, Jun 2004 .
Imbens GW and Rubin DB (1997) Bayesian Inference for Causal Effects in Randomized Experiments with Noncompliance The Annals of Statistics, 25, 305-327.
Principal Stratification in Causal Inference  Constantine E. Frangakis and Donald B. Rubin, Biometrics, 2002, 58, 21–29.
Addressing Complications of Intention-to-Treat Analysis in the Combined Presence of All-or-None Treatment-Noncompliance and Subsequent Missing Outcomes. Constantine E. Frangakis; Donald B. Rubin Biometrika, Vol. 86, No. 2. (Jun., 1999), pp. 365-379. Jstor link
    Additional Case Studies
Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York City Barnard, Frangakis, Hill, and Rubin Journal of the American Statistical Association June 2003, Vol. 98, No. 462, Applications and Case Studies
The British Journal of Psychiatry (2003) 183: 323-331 Estimating psychological treatment effects from a randomised controlled trial with both non-compliance and loss to follow-up. Graham Dunn and Mohammad Maracy
Battistin, E. and Rettore, E. (2002). "Testing for Programme Effects in a Regression Discontinuity Design with Imperfect Compliance." Journal of the Royal Statistical Society A, 165(1), 39-57.


Week 8.-- Matching and propensity score methods

In the news
1. Too Much Sitting After 60 May Lead to Disability; For each extra sedentary hour per day, researchers found a 50 percent increased risk
2. Selenium, Vitamin E Supplements May Double Prostate Cancer Risk        Prostate cancer chances rise with vitamin E, selenium supplements; No benefits to any men from either selenium or vitamin E supplements in large U.S. trial


Lecture topics
1. Traditional matching methods: pair matching, Mahalanobis distance. Matching for increased precision or bias-reduction. Case-control studies. Modern Implementations of matching methods (also Lab 4). Ben Hansen matching exs using MatchIt/optmatch
2. The advent/onslaught of propensity score matching methodology for treatment-control comparisons
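
Illustration (not a course handout): a minimal sketch of pair matching on Mahalanobis distance with two covariates. It uses greedy nearest-neighbor matching without replacement, which is a simplification; it is not the optimal matching that optmatch performs. The function names and the toy covariate values are hypothetical.

```python
def covariance_2d(points):
    """Sample variances and covariance of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_sq(a, b, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def greedy_pair_match(treated, controls):
    """Match each treated unit, in order, to its nearest unused control."""
    cov = covariance_2d(treated + controls)  # covariance pooled over all units
    available = list(range(len(controls)))
    pairs = []
    for i, t in enumerate(treated):
        j = min(available, key=lambda k: mahalanobis_sq(t, controls[k], cov))
        available.remove(j)
        pairs.append((i, j))
    return pairs


# Made-up covariates, e.g. (years of schooling, age); not course data.
treated = [(1.0, 10.0), (5.0, 20.0)]
controls = [(5.1, 19.0), (0.9, 11.0), (9.0, 40.0)]
print(greedy_pair_match(treated, controls))
```

Greedy results depend on the order of the treated units; optimal matching removes that dependence by minimizing the total distance over all pairs.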

Week 8 Readings
Primary Readings
Non-technical overview      Donald Rubin Nonrandomized Comparative Clinical Studies   another version, Annals of Internal Medicine, 1997, 15 October 1997, Vol. 127. No. 8_Part_2
Joffe, Marshall M. and Paul R. Rosenbaum. 1999. "Invited Commentary: Propensity Scores." American Journal of Epidemiology 150(4):327-33.
Rosenbaum and Rubin, Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, JASA 79[387], September 1984, 516-524. JStor  [one of the original technical papers]
MB sec 13.2 "Propensity scores in regression" uses NSW, PSID data, lab4
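
Illustration (not from the readings): a minimal sketch of subclassification on the propensity score in the spirit of the Rosenbaum and Rubin (1984) paper listed above. It assumes the propensity scores have already been estimated (for example by logistic regression); the scores, outcomes, and function name here are hypothetical.

```python
def subclassification_estimate(units, n_strata=5):
    """units: list of (propensity_score, treated_flag, outcome).

    Sorts by score, cuts into n_strata equal-size strata, and averages the
    within-stratum treated-minus-control differences, weighted by stratum size.
    """
    ordered = sorted(units, key=lambda u: u[0])
    n = len(ordered)
    weighted_sum = 0.0
    total_weight = 0
    for s in range(n_strata):
        stratum = ordered[s * n // n_strata:(s + 1) * n // n_strata]
        treated = [y for _, z, y in stratum if z == 1]
        control = [y for _, z, y in stratum if z == 0]
        if not treated or not control:
            continue  # no overlap in this stratum, so it contributes nothing
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += len(stratum) * diff
        total_weight += len(stratum)
    if total_weight == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_sum / total_weight


# Made-up data: the effect is 2 in each stratum, but treated units
# cluster at high scores where outcomes are higher anyway.
toy_units = [
    (0.10, 0, 3.0), (0.15, 0, 3.0), (0.20, 1, 5.0), (0.25, 0, 3.0),
    (0.70, 1, 12.0), (0.75, 1, 12.0), (0.80, 0, 10.0), (0.85, 1, 12.0),
]
treated_y = [y for _, z, y in toy_units if z == 1]
control_y = [y for _, z, y in toy_units if z == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
print(naive, subclassification_estimate(toy_units, n_strata=2))
```

The naive difference in means mixes the treatment effect with the imbalance in scores; the stratified estimate compares like with like within each subclass.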

Additional resources
Talks and tutorials
Strategies for Using Propensity Scores Well.  A Workshop given by Thomas E. Love, Ph. D., Case Western Reserve University      Love workshop ASA
A broad review of matching and bias-reduction methods. Opiates for the Matches: Matching Methods for Causal Inference Jasjeet S. Sekhon. Annual Review of Political Science 2009
UNC, Chapel Hill Social Work: Introduction to Propensity Score Matching: A Review and Illustration     Propensity Score Matching: A New Device for Program Evaluation  UNC, Chapel Hill Social Work 2004     flash version
cute UW cheat sheet with links
An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies Peter C. Austin Multivariate Behav Res. 2011 May; 46(3): 399-424.
Methods to assess intended effects of drug treatment in observational studies are reviewed  Journal of Clinical Epidemiology 57(2004)1223-1231 [an overview of many of past weeks topics]
Average causal effects from nonrandomized studies: A practical guide and simulated example. Schafer, Joseph L.; Kang, Joseph Psychological Methods, Vol 13(4), Dec 2008, 279-313.
A Primer for Applying Propensity-Score Matching Office of Strategic Planning and Development Effectiveness, Inter-American Development Bank
Tutorial in biostatistics: Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group   Statist. Med. 17, 2265-2281 (1998)

R packages and examples:
1. Ben Hansen (local hero)   optmatch manual (11/1/13)     R News Oct 2007      additional resources, Optimal matching   Hansen presentation: Flexible, Optimal Matching for Comparative Studies Using the optmatch package
Optmatch application paper: Full matching in an observational study of coaching for the SAT.(Scholastic Assessment Test) Journal of the American Statistical Association; 9/1/2004; Hansen, Ben B.
Additional exercises (checking balance) using the nuclearplants data (class handout ex) from Mark Fredrickson 1 2
2. MatchIt: Nonparametric Preprocessing for Parametric Causal Inference Daniel Ho, Kosuke Imai, Gary King, Elizabeth Stuart [MatchIt provides a wrapper that can call optmatch or Sekhon's genetic matching]
JSS May 2011 exposition: MatchIt: Nonparametric Preprocessing for Parametric Causal Inference   more R-fun from Gary King, WhatIf: Software for Evaluating Counterfactuals
Another application (including matchit): Attributing Effects to a Get-Out-The-Vote Campaign Using Full Matching and Randomization Inference Jake Bowers and Ben Hansen.    Data archive and computing resources for the New Haven get-out-the-vote
Also:
3. Multivariate and Propensity Score Matching Software for Causal Inference Jasjeet S. Sekhon

    Propensity etc Original Technical Publications [jstor links]
Rosenbaum, P. R. and D. B. Rubin, 1983, The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika 70[1], April 1983, 41-55. JStor
P. Rosenbaum, Chapters 2 and 3 (on exact inference for treatment effects) in Observational Studies, New York: Springer, 1995.
Dropping out of High School in the United States: An Observational Study Paul R. Rosenbaum Journal of Educational Statistics, Vol. 11, No. 3. (Autumn, 1986), pp. 207-224.  Jstor
Paul R. Rosenbaum; Donald B. Rubin. "Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score" The American Statistician, Vol. 39, No. 1. (Feb., 1985), pp. 33-38   JStor
D. Rubin, Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies, Statistical Science 5[4], November 1990, 472-480. JStor
Rubin, D. B., 1974, Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies, Journal of Educational Psychology, 66, 688-701.
Rubin, D. B., 1978, Bayesian Inference for Causal Effects: The Role of Randomization, Annals of Statistics 6[1], January 1978, 34-58. JStor


Week 9. Longitudinal (esp time-1, time-2) data analysis for experimental and non-experimental designs.
Correlates of Change, Lord's paradox, Repeated Measures Analysis of Variance (via lmer), Cross-over Designs, etc

note: Week 9 materials start 3/6 class
In the news
Does Breastfeeding Really Make Kids Smarter?     Breast-feeding Benefits Appear to be Overstated, According to Study of Siblings     Breast-feeding benefits overstated?

Lecture topics
1. Cross-over designs. Laird-Ware text Lecture Slides (pdf pages 135-150). Crossover design data from slide 137, anova for crossover design ex
2. Correlates and predictors of change: time-1,time-2 data   data example for handout   data analysis
3. Comparing groups on time-1, time-2 data: repeated measures anova etc
   a. experimental designs   urea synthesis, BK data (wide-form)     Stat141 analysis     data, long-form,    lmer for BK repeated measures analysis     archival example analyses. SAS and minitab
   b. Observational studies.
     i. Lord's paradox and revisiting regression adjustments for pre-post designs
     ii. Economists' differences in differences (or diffs in diffs with matching) for observational studies.
4. Special topics
  a. Interrupted Time-series designs Gene Glass overview
  b. Reciprocal effects
  c. Current implementations of value-added analysis
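
Illustration (not a course handout): a minimal sketch of Lord's paradox for time-1, time-2 data. Two analyses of the same pre-post data, the difference in mean gains and the covariance-adjusted (common within-group slope) group difference, can disagree when the groups differ at time 1. The data and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def gain_score_difference(group0, group1):
    """Difference between groups in mean (post - pre) gain."""
    gain0 = mean([post - pre for pre, post in group0])
    gain1 = mean([post - pre for pre, post in group1])
    return gain1 - gain0


def ancova_adjusted_difference(group0, group1):
    """Group difference in post, adjusted for pre with a pooled within-group slope."""
    sxy = sxx = 0.0
    group_means = []
    for group in (group0, group1):
        pre_bar = mean([pre for pre, _ in group])
        post_bar = mean([post for _, post in group])
        group_means.append((pre_bar, post_bar))
        sxy += sum((pre - pre_bar) * (post - post_bar) for pre, post in group)
        sxx += sum((pre - pre_bar) ** 2 for pre, _ in group)
    slope = sxy / sxx
    (pre0, post0), (pre1, post1) = group_means
    return (post1 - post0) - slope * (pre1 - pre0)


# Made-up (pre, post) pairs: neither group changes on average,
# but the groups start 20 points apart and the within-group slope is 0.5.
group0 = [(50.0, 55.0), (60.0, 60.0), (70.0, 65.0)]
group1 = [(70.0, 75.0), (80.0, 80.0), (90.0, 85.0)]
print(gain_score_difference(group0, group1))
print(ancova_adjusted_difference(group0, group1))
```

The gain-score analysis reports no group difference while the adjusted analysis reports a nonzero one; which question each analysis answers is the subject of the Lord, Wainer, and Rubin readings for this week.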

Week 9 Readings
Primary Readings
Wainer and Brown Three Statistical Paradoxes in the Interpretation of Group Differences: Illustrated with Medical School Admission and Licensing Data (esp section 5) American Statistician, 2005.
Comparative Analyses of Pretest-Posttest Research Designs, Donna R. Brogan; Michael H. Kutner, The American Statistician, Vol. 34, No. 4. (Nov., 1980), pp. 229-232.   JSTOR link
Don Rubin on value-added and Lord's paradox: A Potential Outcomes View of Value-Added Assessment in Education Donald B. Rubin, Elizabeth A. Stuart, and Elaine L. Zanutto, Journal of Educational and Behavioral Statistics
MB section 10.5, "Repeated measurements in time"; MB Chap 9 "Time series models"   online version, Ch. 10

Additional resources
1. Repeated measures analysis of variance
Models for Pretest-Posttest Data: Repeated Measures ANOVA Revisited Earl Jennings Journal of Educational Statistics, Vol. 13, No. 3. (Autumn, 1988), pp. 273-280.  Jstor
A good R-primer on repeated measures (and lots else). Notes on the use of R for psychology experiments and questionnaires Jonathan Baron, Yuelin Li.   Another version
Multilevel package   has behavioral sciences applications including estimates of within-group agreement, and routines using random group resampling (RGR) to detect group effects.

2. Lord's Paradox, pre-post group comparisons.
Lord, F. M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.
Wainer, H. (1991). Adjusting for differential base rates: Lord's Paradox again. Psychological Bulletin, 109, 147-151.
or Wainer and Brown Three Statistical Paradoxes in the Interpretation of Group Differences: Illustrated with Medical School Admission and Licensing Data
a quick low-level read: Lord's Paradox and the Assessment of Change During College    Journal of College Student Development, May/Jun 2004 by Pike, Gary R
Another time1-time2 reading covering old-fashioned ground including Lord's paradox. Maris, Eric. (1998). Covariance Adjustment Versus Gain Scores--Revisited. Psychological Methods, 3(3) 309-327. apa link  

3. Value-added analysis.
Value-added does New York City. New York schools release 'value added' teacher rankings     Formula uncovers the 'value added'    from the unions: THIS IS NO WAY TO RATE A TEACHER
Chap 9 in Uneducated Guesses: Using Evidence to Uncover Misguided Education Policies.   Howard Wainer (Author) amazon page    available in paper and Kindle
Other versions of the Chap 9 materials Value-Added Models to Evaluate Teachers: A Cry For Help H Wainer, Chance, 2011.         Journal of Consumer Research Vol. 32, No. 2, Sept 2005
More Value-added analysis. Journal of Educational and Behavioral Statistics Vol. 29, No. 1, Spring, 2004 Value-Added Assessment Special Issue
Value-Added Measures of Education Performance: Clearing Away the Smoke and Mirrors, PACE
LA Times Teacher Ratings, summer 2010        NEPC vs LATimes
J.R. Lockwood, Harold Doran, and Daniel F. McCaffrey. Using R for estimating longitudinal student achievement models. R News, 3(3):17-23, December 2003.
Fitting Value-Added Models in R  Harold C. Doran and J.R. Lockwood
Andrew Gelman on Value-added arithmetic: It's no fun being graded on a curve     more NY  Principals rebel against 'value-added' evaluation

4. Interrupted time-series
Interrupted Time Series Quasi-Experiments Gene V Glass Arizona State University
Did fertility go up after the Oklahoma City bombing? An analysis of births in metropolitan counties in Oklahoma, 1990-1999. Demography, 2005.
original publication (ozone data): Box, G. E. P. and G. C. Tiao. 1975. "Intervention Analysis with Applications to Economic and Environmental Problems." Journal of the American Statistical Association. 70:70-79. SAS example for ozone data     another ozone analysis with data
Box-Tiao time series models for impact assessment Evaluation Quarterly 1979
Interrupted time-series analysis and its application to behavioral data Donald P. Hartmann, John M. Gottman, Richard R. Jones, William Gardner, Alan E. Kazdin, and Russell S. Vaught J Appl Behav Anal. 1980 Winter; 13(4): 543-559.
Segmented regression analysis of interrupted time series studies in medication use research. By: Wagner, A. K.; Soumerai, S. B.; Zhang, F.; Ross-Degnan, D. Journal of Clinical Pharmacy & Therapeutics, Aug 2002, Vol. 27 Issue 4, pp. 299-309.
Interrupted Time Series Designs In Health Technology Assessment: Lessons From Two Systematic Reviews Of Behavior Change Strategies Craig R. Ramsay University Of Aberdeen, International Journal Of Technology Assessment In Health Care, 19:4 (2003), 613-623.
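
Illustration (not from the readings): a minimal sketch of the level-change and slope-change summaries used in segmented regression of an interrupted time series. It fits separate least-squares lines before and after the interruption, which yields the same fitted lines as a single regression with level and slope change terms. It ignores autocorrelation, which the Box-Tiao intervention models are designed to handle. The series and function names are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares line through (t, y) points: (intercept, slope)."""
    n = len(points)
    t_bar = sum(t for t, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in points)
    sxx = sum((t - t_bar) ** 2 for t, _ in points)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def segmented_fit(series, t0):
    """Level and slope change at time t0 (first post-intervention time point)."""
    pre = [(t, y) for t, y in series if t < t0]
    post = [(t, y) for t, y in series if t >= t0]
    a_pre, b_pre = fit_line(pre)
    a_post, b_post = fit_line(post)
    # Level change: post-period line minus the pre-period line projected to t0.
    level_change = (a_post + b_post * t0) - (a_pre + b_pre * t0)
    return {
        "pre_slope": b_pre,
        "level_change": level_change,
        "slope_change": b_post - b_pre,
    }


# Made-up series: trend of 1 per period, then a jump of 5 and a steeper
# trend of 1.5 per period starting at t = 6.
series = [(t, 10.0 + t) for t in range(1, 6)]
series += [(t, 21.0 + 1.5 * (t - 6)) for t in range(6, 11)]
print(segmented_fit(series, t0=6))
```

With real data the standard errors from this simple fit would be too optimistic if successive observations are correlated.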

5. Measurement of Change, Correlates of Change, Growth Curve Analysis.
Rogosa, D. R., & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228. available from John Willett's pub page
A growth curve approach to the measurement of change. Rogosa, David; Brandt, David; Zimowski, Michele Psychological Bulletin. 1982 Nov Vol 92(3) 726-748 APA record   direct link
Longitudinal Data Analysis Examples with Random Coefficient Models. David Rogosa; Hilary Saner . Journal of Educational and Behavioral Statistics, Vol. 20, No. 2, Special Issue: Hierarchical Linear Models: Problems and Prospects. (Summer, 1995), pp. 149-170. Jstor
Demonstrating the Reliability of the Difference Score in the Measurement of Change. David R. Rogosa; John B. Willett Journal of Educational Measurement, Vol. 20, No. 4. (Winter, 1983), pp. 335-343. Jstor



Dead Week

Tuesday 3/11
Complete Longitudinal data (week 9 topics): group comparisons: experiments and observational studies; special topics

Collection of scanned course handouts, weeks 1-9.
    note: many handout examples have extended online versions

Thursday 3/13.
Collect TH2 papers; distribute CD of Class Lectures
HW9 run through
Course retrospective, Q+A including Exam 3 (3/19 in Sequoia 200)
Student research topics and questions

Final Pass: Courses, Talks and Papers covering Stat209 content:
Dylan Small, Stanford Stat Ph.D. Stat921 at Wharton
Christopher Winship Sociology 203a, Harvard
ECONOMICS 452* Applied Econometrics Economics Department at Queen's University, Canada
A broad review of matching and bias-reduction methods. Opiates for the Matches: Matching Methods for Causal Inference Jasjeet S. Sekhon. Annual Review of Political Science 2009
Average causal effects from nonrandomized studies: A practical guide and simulated example. Schafer, Joseph L.; Kang, Joseph Psychological Methods, Vol 13(4), Dec 2008, 279-313.
The Foundations of Causal Inference Judea Pearl. Sociological Methodology, upcoming.
Experiments and Observational Studies: Causal Inference in Statistics Paul R. Rosenbaum
For Objective Causal Inference, Design Trumps Analysis Don Rubin, Harvard
Causal inference with observational data A brief review of quasi-experimental methods Austin Nichols July 30, 2009
Propensity Score Analysis and Strategies for Its Application to Services Training Evaluation. Shenyang Guo, Ph.D. School of Social Work University of North Carolina at Chapel Hill June 14, 2011  another version