Introduction to Data Analysis    Rev: 3/29/2019

Sociology 180B/280B


Draft Syllabus


Class: Tuesday and Thursday, 3P-4:20P


Lab/Section: once a week, at one of the following times

1) Fridays 10:30A-11:50A


2) Thursdays 6P-7:20P


Michael J. Rosenfeld


Department of Sociology

Building 120 room 124

The class website is my personal Stanford website

Office Hours TBA



Amy Johnson (

Michael Hahn (


Use Canvas to submit homework





This class will cover basic statistics including regression, how to do statistical analysis, and how to find flaws and problems with statistical analyses.

In the process of learning about data analysis you will also learn about demography and stratification in the U.S., because the dataset is the Current Population Survey of March 2000, a nationally representative survey of more than 60,000 households, with extensive information about race, gender, income, occupation, place of residence, and so on. You'll also learn how to use one of the most powerful and flexible tools for data analysis, the statistical software STATA.


Readings and Grading Policy


Books (available at Stanford Bookstore):

* Freedman, David, Robert Pisani, and Roger Purves. 2007. Statistics. Fourth Edition. W.W. Norton. $105, ISBN: 0393929728 (recommended). If you know a little about statistics already, or if you have taken one statistics class like Stats 60, you don't need to buy the Freedman, and you can ignore the Freedman reading assignments.

* Tufte, Edward. 2001. The Visual Display of Quantitative Information. Graphics Press. $28, ISBN: 0961392142 (required).


Other readings will be linked from the class website.



Software Required (order online)

* Intercooled (IC) Stata, Version 15. You may purchase either a 1-year license for $125, or a perpetual license for $225. I recommend the perpetual license so that you can use this software in the future. The software comes with a short introductory book on Stata. Don't bother buying Stata's massive printed reference book collection. I will teach you the Stata commands that you need to know, and the Stata online help is very good.

There are computer clusters at Stanford where you can run Stata for free, and you can run Stata over Unix but with reduced screen feedback. I strongly urge you to buy the Stata license and install it on your personal PC.



Computer Use Policy:

* Computer use by students in class is strictly limited to following along with the data analysis examples being presented by the professor.



1) Undergraduates, Soc 180B:


4 homeworks, 15% each

Regular section participation


Final exam (based on data analysis part of the course)




2) Graduate Students, Soc 280B:


4 homeworks, 15% each

Regular section participation


In-class presentation (data analysis of dataset of your own choosing) outline

10% (due date to be negotiated with Professor Rosenfeld)

In-class presentation (data analysis of dataset of your own choosing) actual presentation to class

20% (class presentation date to be negotiated with Professor Rosenfeld)



Project and Reading Assignment Timeline




Class lecture Goals




Apr 2

Introduction to the class




Apr 4

Basics of descriptive data analysis using STATA

Read my Intro to Stata (required)

Read Freedman Ch 4

Hand out HW#1



Work on HW 1 and on using STATA









Apr 9

Observational Studies and their limitations

Freedman Ch 2



Apr 11

Error and bias

Freedman Ch 6




Work on HW 1 and on using STATA


Friday, April 12, HW 1 due at midnight






Apr 16

Error and bias

Freedman Ch 6

Hand out HW#2


Apr 18

Probability sampling, sample size and power, and standard errors

Freedman Ch 20




Stata, and HW 2









Apr 23

More on sample size and power.

Freedman Ch 21



Apr 25

Statistics and hypothesis testing





Stata, and HW 2







Friday, Apr 26, HW#2 due by midnight


Apr 30

Introduction to regression with STATA

Freedman Chs 9, 10

Hand out HW#3


May 2

More on regression with STATA, interpreting coefficients

Freedman, Ch 11, 12




Work on HW #3









May 7

Problems with and difficulties in using regression; graphing

Tufte, read the whole book (required)



May 9

Proper and improper presentation of data




Work on HW #3







Friday, May 10, HW#3 due by midnight


May 14

Additivity, linearity, and regression fits


Hand out HW #4


May 16

Regression analysis: residuals and outliers

Readings by Jasso and Kahn and Udry, and Jasso's response posted on my website (all required)




Work on STATA, discuss the issues in HW 4









May 21

Logistic regression



May 23

Logistic regression and the likelihood ratio test




Work on STATA, discuss the issues in CPS HW #4









May 28

Polls, polling aggregation, and election prediction




May 30

Soc 280B in-class presentations





HW #4 due

Friday, May 31 by midnight



Work on STATA, discuss the issues in HW 4









June 4

Final Exam Review



June 6

No class



no section meetings















Final exam

Saturday, June 8, 3:30P-6:30P