. *Class begins here

. use "C:\Users\Michael\Desktop\cps_mar_2000_new_unchanged.dta", clear

. ttest yrsed if age>=25 & age<=34, by(sex)

Two-sample t test with equal variances

------------------------------------------------------------------------------

Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
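
* As a rough check, the t statistic in the table can be rebuilt by hand from the group sizes, means, and standard deviations printed above. This is only a sketch: the scalar names are my own (the chk_ prefix is there to avoid clashing with variable names in the data), and the results should agree with the table only up to rounding.

```stata
* Group sizes, means, and standard deviations copied from the ttest table
scalar chk_n1 = 9027
scalar chk_m1 = 13.31212
scalar chk_s1 = 2.967666
scalar chk_n2 = 9511
scalar chk_m2 = 13.55657
scalar chk_s2 = 2.854472

* Pooled standard deviation under the equal-variances assumption
scalar chk_sp = sqrt(((chk_n1-1)*chk_s1^2 + (chk_n2-1)*chk_s2^2) / (chk_n1 + chk_n2 - 2))

* Standard error of the difference in means
scalar chk_se = chk_sp * sqrt(1/chk_n1 + 1/chk_n2)

* Difference, standard error, t statistic, and degrees of freedom
display "diff = " chk_m1 - chk_m2
display "se   = " chk_se
display "t    = " (chk_m1 - chk_m2) / chk_se
display "df   = " chk_n1 + chk_n2 - 2
```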

* I said at the end of last class that the tail probability associated with a t statistic of -5.7 was tiny. Here is the actual tail probability:

. display ttail(18536, 5.7164)

5.524e-09

* Notice that the command above uses a positive 5.7 value. That is because Stata's ttail() function gives the right-hand, or upper-tail, probability. By the symmetry of the t distribution, this equals the left-hand tail probability below -5.7164, which is 1 minus ttail() evaluated at the negative value:

. display 1- ttail(18536, -5.7164)

5.524e-09
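
* A possible shortcut, not used in class: Stata also has a t() function that returns the lower-tail (cumulative) probability directly. By the symmetry of the t distribution, the two lines in this sketch should print the same number.

```stata
* Lower tail below -5.7164, from the cumulative t() function
display t(18536, -5.7164)

* Upper tail above +5.7164, as in the log
display ttail(18536, 5.7164)
```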

* Usually we take the tail probability and multiply by two. The result is the chance of getting a difference at least this large in either direction (between women’s and men’s education) if the true difference were zero:

. display 2*ttail(18536, 5.7164)

1.105e-08

* Since this two-tailed test yields a tiny probability of about 1 in 100,000,000, we can reject the null hypothesis that young men and young women in the US have the same educational attainment.
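
* Rather than retyping the t statistic, the two-sided p-value can be read from the results that ttest leaves behind. A sketch, assuming the CPS file from the start of class is still in memory. ttesti is the immediate form, which takes the summary numbers directly instead of the data.

```stata
* Rerun the test quietly, then pull the stored results
quietly ttest yrsed if age>=25 & age<=34, by(sex)
display "t = " r(t) "   df = " r(df_t)
display "two-sided p = " r(p)

* The same p-value built by hand from the stored t statistic
display 2*ttail(r(df_t), abs(r(t)))

* Immediate form: n, mean, and sd for each group, copied from the table
ttesti 9027 13.31212 2.967666 9511 13.55657 2.854472
```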

. summarize incwelfr

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |    103226    40.62242    478.8231          0      25000

. summarize incwelfr if age>=15 & incwelfr>0 & incwelfr~=.

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |      1289    3253.134    2813.505          1      25000

. summarize incwelfr if age>=15 & incwelfr>0 & incwelfr~=. [fweight= perwt_rounded]

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |   2551246    3072.095    2803.442          1      25000

* Use “if” to restrict the data to the subset relevant to the question. Use fweight to scale the sample up to the number of people it represents in the US. There are 1289 people aged 15 and over in the CPS who reported positive welfare income in 1999; they correspond to about 2.55 million people in the US.
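
* One way to see where the 2.55 million comes from: with frequency weights, each sampled person counts as perwt_rounded people, so the weighted number of observations is the sum of the weights over the same subset. A sketch using only variables already in the log:

```stata
* Sum the person weights over positive-welfare respondents aged 15 and over
quietly summarize perwt_rounded if age>=15 & incwelfr>0 & incwelfr~=.

* This sum should match the weighted Obs count reported by summarize above
display %12.0fc r(sum)
```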

* Now, generate a new variable and add value and variable labels.

. gen byte receives_welfare=0

. replace receives_welfare=1 if incwelfr>0 & incwelfr~=.

(1289 real changes made)

. tabulate receives_welfare

receives_we |

lfare |      Freq.     Percent        Cum.

------------+-----------------------------------

0 |    132,421       99.04       99.04

1 |      1,289        0.96      100.00

------------+-----------------------------------

Total |    133,710      100.00
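
* A caution on this coding: summarize reported 103,226 non-missing values of incwelfr, but the tabulation has 133,710 cases, so the 30,484 cases with missing incwelfr were set to 0 rather than left missing. If that is not what you want, one alternative is sketched here (the name receives_welfare2 is mine, for illustration):

```stata
* 1 if positive welfare income, 0 if zero, missing if incwelfr is missing
generate byte receives_welfare2 = (incwelfr > 0) if !missing(incwelfr)

* Compare the two codings, showing missing values as their own column
tabulate receives_welfare receives_welfare2, missing
```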

. label define receives_welfare_lbl 0 "no" 1 "yes"

. label values receives_welfare receives_welfare_lbl

. tabulate receives_welfare

receives_we |

lfare |      Freq.     Percent        Cum.

------------+-----------------------------------

no |    132,421       99.04       99.04

yes |      1,289        0.96      100.00

------------+-----------------------------------

Total |    133,710      100.00

. label var receives_welfare "does respondent receive welfare"

. tabulate receives_welfare

does |

respondent |

receive |

welfare |      Freq.     Percent        Cum.

------------+-----------------------------------

no |    132,421       99.04       99.04

yes |      1,289        0.96      100.00

------------+-----------------------------------

Total |    133,710      100.00

* One thing you never want to do is tabulate a continuous variable. The output runs on like the phone book. Hit the Break button to interrupt it.

. tabulate incwage

Wage and |

salary |

income |      Freq.     Percent        Cum.

------------+-----------------------------------

0 |     35,825       34.71       34.71

1 |          7        0.01       34.71

5 |         15        0.01       34.73

7 |          1        0.00       34.73

8 |          1        0.00       34.73

10 |          1        0.00       34.73

12 |          2        0.00       34.73

18 |          1        0.00       34.73

20 |         10        0.01       34.74

21 |          2        0.00       34.74

28 |          2        0.00       34.75

30 |          5        0.00       34.75

31 |          1        0.00       34.75

34 |          4        0.00       34.76

35 |          5        0.00       34.76

36 |          1        0.00       34.76

40 |          8        0.01       34.77

44 |          1        0.00       34.77

45 |          4        0.00       34.77

46 |          3        0.00       34.78

--Break--

r(1);
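
* For a continuous variable such as incwage, a summary or a picture is usually more useful than a frequency table. A minimal sketch of two standard alternatives:

```stata
* Percentiles, mean, and spread in one compact table
summarize incwage, detail

* Distribution of positive wage and salary income as a histogram
histogram incwage if incwage>0 & incwage<., percent
```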

. table  educrec sex if age>20  [fweight= perwt_rounded], contents(freq mean  incwelfr mean  receives_welfare) row col

---------------------------------------------------------------

Educational attainment  |                  Sex

recode                  |        Male       Female        Total

------------------------+--------------------------------------

None or preschool |     409,822      463,962      873,784

|           0  201.8166229  107.1606301

|           0       .04848      .025742

|

Grades 1, 2, 3, or 4 |     988,458      959,869      1948327

|  21.2051377  186.4335592  102.6070993

|     .011155      .039831      .025283

|

Grades 5, 6, 7, or 8 |     4792742      5028804      9821546

| 10.72959028  119.6578288  66.50276097

|     .005356      .032857      .019437

|

Grade 9 |     1926372      2028431      3954803

| 20.88420617  134.0259969  78.91498944

|     .007086      .046607      .027357

|

Grade 10 |     2498378      2892776      5391154

| 22.49344775   214.192737  125.3551177

|     .008675        .0635      .038093

|

Grade 11 |     2607008      3013104      5620112

| 23.15243145  216.6690639  126.9022747

|     .007129      .073434      .042677

|

Grade 12 |    3.01e+07     3.47e+07     6.48e+07

| 11.72673341  67.85343211   41.8001018

|     .003832      .021749      .013432

|

1 to 3 years of college |    2.35e+07     2.70e+07     5.05e+07

| 7.269855825  44.67187372  27.25585651

|     .002034       .01304      .007915

|

4+ years of college |    2.40e+07     2.28e+07     4.68e+07

| .3599692853   5.49143018  2.858781299

|     .000103      .002347      .001196

|

Total |    9.09e+07     9.89e+07     1.90e+08

| 8.382322858  61.73025686  36.18842854

|      .00282       .01907       .01129

---------------------------------------------------------------

* table is a useful command for creating tables of statistics by other variables (in this case, education and sex). Each cell above stacks three numbers: the weighted frequency, the mean welfare income, and the proportion who receive welfare.
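
* The contents() syntax is what ran in class in 2014. To my knowledge the table command was redesigned in Stata 17, so that line may need rewriting on newer versions. As a hedged alternative that avoids table altogether, tabulate with the summarize() option should give the same weighted proportions:

```stata
* Mean of receives_welfare (the proportion on welfare) by education and sex
tabulate educrec sex if age>20 [fweight=perwt_rounded], summarize(receives_welfare) nostandard nofreq
```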

. clear all

* Now on to data ingestion. After you have downloaded and unzipped the data file and downloaded the Stata command file (with the .do extension), take the path of the directory that holds the data and the do-file, and make it Stata's working directory:

. cd "C:\Users\Michael\Documents\current class files\intro soc methods\2005 data again"

C:\Users\Michael\Documents\current class files\intro soc methods\2005 data again

* Then run the do-file. This is easiest with File > Do in the Stata menu system.

. do "C:\Users\Michael\Documents\current class files\intro soc methods\2005 data again\cps_00010.do"

. * NOTE: You need to set the Stata working directory to the path

. * where the data file is located.

.

. set more off

.

. clear

. quietly infix             ///

>   int     year     1-4    ///

>   long    serial   5-9    ///

>   float   hwtsupp  10-19  ///

>   byte    month    20-21  ///

>   float   wtsupp   22-31  ///

>   float   wtfinl   32-41  ///

>   byte    age      42-43  ///

>   byte    sex      44-44  ///

>   double  inctot   45-52  ///

>   using `"cps_00010.dat"'

.

. replace hwtsupp = hwtsupp / 10000

(210648 real changes made)

. replace wtsupp  = wtsupp  / 10000

(210648 real changes made)

. replace wtfinl  = wtfinl  / 10000

(0 real changes made)

.

. format hwtsupp %10.4f

. format wtsupp  %10.4f

. format wtfinl  %10.4f

. format inctot  %8.0f

.

. label var year    `"Survey year"'

. label var serial  `"Household serial number"'

. label var hwtsupp `"Household weight, Supplement"'

. label var month   `"Month"'

. label var wtsupp  `"Supplement Weight"'

. label var wtfinl  `"Final Basic Weight"'

. label var age     `"Age"'

. label var sex     `"Sex"'

. label var inctot  `"Total personal income"'

.

. label define hwtsupp_lbl 0000000000 `"0000000000"'

. label values hwtsupp hwtsupp_lbl

.

. label define month_lbl 01 `"January"'

. label define month_lbl 02 `"February"', add

. label define month_lbl 03 `"March"', add

. label define month_lbl 04 `"April"', add

. label define month_lbl 05 `"May"', add

. label define month_lbl 06 `"June"', add

. label define month_lbl 07 `"July"', add

. label define month_lbl 08 `"August"', add

. label define month_lbl 09 `"September"', add

. label define month_lbl 10 `"October"', add

. label define month_lbl 11 `"November"', add

. label define month_lbl 12 `"December"', add

. label values month month_lbl

.

. label define wtfinl_lbl 0000000000 `"0"'

. label values wtfinl wtfinl_lbl

.

. label define age_lbl 00 `"Under 1 year"'

. label define age_lbl 01 `"1"', add

. label define age_lbl 02 `"2"', add

. label define age_lbl 03 `"3"', add

. label define age_lbl 04 `"4"', add

. label define age_lbl 05 `"5"', add

. label define age_lbl 06 `"6"', add

. label define age_lbl 07 `"7"', add

. label define age_lbl 08 `"8"', add

. label define age_lbl 09 `"9"', add

. label define age_lbl 10 `"10"', add

. label define age_lbl 11 `"11"', add

. label define age_lbl 12 `"12"', add

. label define age_lbl 13 `"13"', add

. label define age_lbl 14 `"14"', add

. label define age_lbl 15 `"15"', add

. label define age_lbl 16 `"16"', add

. label define age_lbl 17 `"17"', add

. label define age_lbl 18 `"18"', add

. label define age_lbl 19 `"19"', add

. label define age_lbl 20 `"20"', add

. label define age_lbl 21 `"21"', add

. label define age_lbl 22 `"22"', add

. label define age_lbl 23 `"23"', add

. label define age_lbl 24 `"24"', add

. label define age_lbl 25 `"25"', add

. label define age_lbl 26 `"26"', add

. label define age_lbl 27 `"27"', add

. label define age_lbl 28 `"28"', add

. label define age_lbl 29 `"29"', add

. label define age_lbl 30 `"30"', add

. label define age_lbl 31 `"31"', add

. label define age_lbl 32 `"32"', add

. label define age_lbl 33 `"33"', add

. label define age_lbl 34 `"34"', add

. label define age_lbl 35 `"35"', add

. label define age_lbl 36 `"36"', add

. label define age_lbl 37 `"37"', add

. label define age_lbl 38 `"38"', add

. label define age_lbl 39 `"39"', add

. label define age_lbl 40 `"40"', add

. label define age_lbl 41 `"41"', add

. label define age_lbl 42 `"42"', add

. label define age_lbl 43 `"43"', add

. label define age_lbl 44 `"44"', add

. label define age_lbl 45 `"45"', add

. label define age_lbl 46 `"46"', add

. label define age_lbl 47 `"47"', add

. label define age_lbl 48 `"48"', add

. label define age_lbl 49 `"49"', add

. label define age_lbl 50 `"50"', add

. label define age_lbl 51 `"51"', add

. label define age_lbl 52 `"52"', add

. label define age_lbl 53 `"53"', add

. label define age_lbl 54 `"54"', add

. label define age_lbl 55 `"55"', add

. label define age_lbl 56 `"56"', add

. label define age_lbl 57 `"57"', add

. label define age_lbl 58 `"58"', add

. label define age_lbl 59 `"59"', add

. label define age_lbl 60 `"60"', add

. label define age_lbl 61 `"61"', add

. label define age_lbl 62 `"62"', add

. label define age_lbl 63 `"63"', add

. label define age_lbl 64 `"64"', add

. label define age_lbl 65 `"65"', add

. label define age_lbl 66 `"66"', add

. label define age_lbl 67 `"67"', add

. label define age_lbl 68 `"68"', add

. label define age_lbl 69 `"69"', add

. label define age_lbl 70 `"70"', add

. label define age_lbl 71 `"71"', add

. label define age_lbl 72 `"72"', add

. label define age_lbl 73 `"73"', add

. label define age_lbl 74 `"74"', add

. label define age_lbl 75 `"75"', add

. label define age_lbl 76 `"76"', add

. label define age_lbl 77 `"77"', add

. label define age_lbl 78 `"78"', add

. label define age_lbl 79 `"79"', add

. label define age_lbl 80 `"80"', add

. label define age_lbl 81 `"81"', add

. label define age_lbl 82 `"82"', add

. label define age_lbl 83 `"83"', add

. label define age_lbl 84 `"84"', add

. label define age_lbl 85 `"85"', add

. label define age_lbl 86 `"86"', add

. label define age_lbl 87 `"87"', add

. label define age_lbl 88 `"88"', add

. label define age_lbl 89 `"89"', add

. label define age_lbl 90 `"90 (90+, 1988-2002)"', add

. label define age_lbl 91 `"91"', add

. label define age_lbl 92 `"92"', add

. label define age_lbl 93 `"93"', add

. label define age_lbl 94 `"94"', add

. label define age_lbl 95 `"95"', add

. label define age_lbl 96 `"96"', add

. label define age_lbl 97 `"97"', add

. label define age_lbl 98 `"98"', add

. label define age_lbl 99 `"99+"', add

. label values age age_lbl

.

. label define sex_lbl 1 `"Male"'

. label define sex_lbl 2 `"Female"', add

. label define sex_lbl 9 `"NIU"', add

. label values sex sex_lbl

.

. label define inctot_lbl 00999997 `"00999997"'

. label define inctot_lbl 99999997 `"99999997"', add

. label define inctot_lbl 99999999 `"99999999"', add

. label values inctot inctot_lbl

*Don’t forget to save your new stata file!

.

.

.

end of do-file

. save "C:\Users\Michael\Documents\current class files\intro soc methods\2005 data

> again\2005 cps data.dta"

file C:\Users\Michael\Documents\current class files\intro soc methods\2005 data aga

> in\2005 cps data.dta saved
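
* If you rerun the do-file later, save will refuse to overwrite the existing file unless you add the replace option. Since the working directory was already set with cd, a short relative filename is enough. A sketch, using the filename saved above:

```stata
* Overwrite the earlier copy in the current working directory
save "2005 cps data.dta", replace

* Reload it and check that the variables and labels came through
use "2005 cps data.dta", clear
describe
```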

. log close

name:  <unnamed>

log:  C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2014_lo

> gs\class2.log

log type:  text

closed on:  24 Sep 2014, 12:33:06

-----------------------------------------------------------------------------------