Project 3, Final Paper:  Data Analysis.

 

After you've done the first data analysis problem set, you should be ready to tackle some simple data analysis questions of your own.

 

Project 3 homework, due Monday May 17 in class

 

Project 3 proposal, 1-2 pages, due Wednesday May 19 in class

 

Each student will submit a separate proposal containing the following:

 

1) Your name

2) Who, if anyone, you are working with

3) TA's name

4) A simple question you would like to test with the March 2000 CPS.

5) Which variables in the dataset you will use.  One very important thing to remember is that the CPS has hundreds of variables, but you only have access to the 39 variables in the dataset I have provided.  The best way to familiarize yourself with the variables you haven't used in the homework so far is to use the STATA command describe, and then to study the documentation variable by variable.

6) A couple of sentences explaining which STATA commands you will use to test your question.
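
As a rough illustration of item 5, the commands below sketch one way to get to know the variables before writing the proposal. The file name cps_march2000.dta and the variable name age are placeholders, not the actual names in the course dataset; substitute whatever describe shows you.

```stata
* Assumption: the course dataset is saved as cps_march2000.dta (placeholder file name).
use cps_march2000.dta, clear

* List every variable in the dataset with its storage type and label.
describe

* Look at one variable in detail; "age" is a placeholder variable name.
codebook age
summarize age, detail
```

The output of describe only gives short labels, so the documentation is still the place to check exactly how each variable is coded.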

 

 

Project 3 Final Paper: 6-8 pages PLUS STATA output, due Wednesday, June 2 in class.  No late papers will be accepted.

1) Your name

2) Who, if anyone, you are working with

3) TA's name

4) The initial hypothesis you wanted to test

5) What did you find?  How do you interpret your results?  What kind of limitations did you run into?

6) Your paper itself should contain some data analysis, but in the paper the tables should be reformatted and edited by you.  That is, do not simply paste the STATA output into your paper.

7) As an appendix to the paper, attach the raw STATA output that you used to generate the findings in your paper.  You don't need to supply all the log files you have, just some of the important parts.

 

Papers will be evaluated on the author's ability to correctly interpret tabular and simple statistical output, and on the author's creativity and ability to construct appropriate tests for their hypotheses.

 

Further Notes on the Final Paper

1) The most important thing to think about when doing your paper is the simple procedures we have explored in the homework for controlling for demographic variables.  That is, if you are comparing veterans to non-veterans in terms of earnings (as we did in question 4 of the homework), you need to control for gender (since most veterans are male) and you need to control for age (since veterans are, on average, older than non-veterans).  We could also have controlled for US nativity.  Whatever comparison you are thinking of making, try to imagine how race, gender, age, education, occupation, industry, place of birth, urban or rural residence, income, poverty status, or marital status might affect your question.  Then think of ways, using the commands we have discussed in class, to 'control' for the demographic variables that are most relevant for your question.

 

2) Your paper need not contain any literature review.  You should focus on the empirical question at hand, and on studying the data, and interpreting the output.

 

3) The CPS contains hundreds of variables, but for your papers you will only be able to use the 30 or so available in the dataset you have already used.  Be creative in thinking about how to use the available questions to try to answer the questions you are interested in.

 

4) If you're in doubt about what the variables mean, consult the documentation.

 

5) Use the weighted data to make assertions about the US, but check the unweighted data to make sure you have enough cases in the dataset for your comparisons.  How many is enough?  Fewer than 30 cases is problematic.  Fewer than 10 cases means you need to broaden your question.  If you want to work with occupations, pick occupations that have a lot of cases (lawyers, nurses, teachers, laborers) rather than small occupations like astronauts or federal judges.
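
Notes 1 and 5 above can be sketched together using the veteran earnings example. This is only one possible approach, not the required one. The variable names earnings, veteran, sex, age, and wgt (the person weight) are placeholders, and the 45 to 64 age band is an arbitrary choice made for illustration; use the names and cutoffs that fit your own question.

```stata
* Placeholder variable names (run describe to find the real ones):
*   earnings, veteran, sex, age, wgt (person weight).
* Assumes the course dataset is already loaded.

* Step 1: unweighted cell counts, to check that every group being
* compared has enough cases (see note 5).
tabulate veteran sex if age >= 45 & age <= 64

* Step 2: weighted mean earnings for veterans and non-veterans,
* shown separately by sex and restricted to the same age band,
* so that gender and age are roughly held constant (see note 1).
tabulate veteran sex if age >= 45 & age <= 64 [aweight = wgt], summarize(earnings) means
```

If any cell in Step 1 has very few cases, widen the age band or drop one of the controls before interpreting the weighted means in Step 2.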