------------------------------------------------------------------------------------

name:  <unnamed>

log:  C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web pages\soc_meth_proj3\fall_2011_381_logs\class2.log

log type:  text

opened on:  29 Sep 2011, 11:59:43

. use "C:\Documents and Settings\Michael Rosenfeld\Desktop\cps_mar_2000_new_unch

> anged.dta", clear

. sort sex

. by sex: summarize incwelfr

--------------------------------------------------------------------------------

-> sex = Male

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |     49353    11.35025    245.3368          0      13800

--------------------------------------------------------------------------------

-> sex = Female

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |     53873    67.43862    618.6006          0      25000

* We always have to think about what the appropriate population frame is. The average welfare income across all people is too broad a frame: most people have zero welfare income, so the overall average is very low (about $11 for men). Restricting to people with positive welfare income gives a more meaningful average.
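* A minimal sketch of why the frame matters, using made-up toy values (not the CPS data): eight people with zero welfare income and two recipients.

```stata
* Toy data: hypothetical values for illustration only
clear
input welfare
0
0
0
0
0
0
0
0
3000
5000
end

* Mean over everyone: pulled toward zero by the many non-recipients
summarize welfare

* Mean over recipients only: the narrower population frame
summarize welfare if welfare > 0
```

* In this toy example the first mean is 800 and the second is 4000, even though no recipient got less than 3000.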

. by sex: summarize incwelfr if incwelfr>0

--------------------------------------------------------------------------------

-> sex = Male

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |       188    2979.622    2644.509          1      13800

--------------------------------------------------------------------------------

-> sex = Female

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |      1101    3299.837    2839.866          1      25000

. by sex: summarize incwelfr if incwelfr>0 [fweight=perwt_rounded]

---------------------------------------------------------------------------------

-> sex = Male

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |    357702     2897.24    2577.316          1      13800

---------------------------------------------------------------------------------

-> sex = Female

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwelfr |   2193544    3100.608    2837.588          1      25000

. table sex if incwelfr>0 [fweight=perwt_rounded], contents (freq mean incwelfr max incwelfr)

----------------------------------------------------------

Sex |          Freq.  mean(incwelfr)   max(incwelfr)

----------+-----------------------------------------------

Male |       3.12e+07     2897.240312           13800

Female |       3.17e+07     3100.608278           25000

----------------------------------------------------------

. table sex if incwelfr>0 & incwelfr ~=. [fweight=perwt_rounded], contents (freq mean incwelfr max incwelfr)

----------------------------------------------------------

Sex |          Freq.  mean(incwelfr)   max(incwelfr)

----------+-----------------------------------------------

Male |        357,702     2897.240312           13800

Female |        2193544     3100.608278           25000

----------------------------------------------------------

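* Table is a really useful and versatile command, but notice that the frequency counts didn’t match what we got with summarize until we excluded all cases with missing values (i.e. incwelfr~=.). Stata treats a missing value as larger than any number, so the condition incwelfr>0 is also true for cases where incwelfr is missing. Summarize drops those cases automatically, but the freq count in table includes them.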
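* A small sketch of the missing-value behavior, using a made-up variable x (hypothetical, not in the CPS file). It shows that a "greater than" condition picks up missing values unless they are excluded explicitly.

```stata
* Toy data: three real values and one missing value
clear
input x
0
5
10
.
end

* Missing (.) counts as larger than any number, so this counts 5, 10 and the missing row
count if x > 0

* Two equivalent ways to leave the missing row out
count if x > 0 & x < .
count if x > 0 & !missing(x)
```

* The first count is 3 and the other two are 2. The same logic applies to conditions like age>=25, so it is worth adding the missing-value check whenever a variable might have missing cases.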

. by sex: summarize incwage if age >=25 & age<=34

---------------------------------------------------------------------------------

-> sex = Male

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwage |      9027    29510.62    26619.54          0     362302

---------------------------------------------------------------------------------

-> sex = Female

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

incwage |      9511    17728.95    20249.23          0     333564

. table sex if age>=25 & age<=34, contents (freq mean incwage mean yrsed)

-------------------------------------------------------

Sex |         Freq.  mean(incwage)    mean(yrsed)

----------+--------------------------------------------

Male |         9,027    29510.61781       13.31212

Female |         9,511    17728.94764       13.55657

-------------------------------------------------------

. summarize age

Variable |       Obs        Mean    Std. Dev.       Min        Max

-------------+--------------------------------------------------------

age |    133710    35.17964    22.21722          0         90

* By the way, why is the age maximum 90 years? Surely in a sample of 133K people, someone would be older than 90, right? The answer is that ages of 90 and above are topcoded at 90 to maintain confidentiality. You can see this if you tabulate age (look at the label on the last category), or more easily by looking at the IPUMS documentation for the variable age.

. tabulate age

Age |      Freq.     Percent        Cum.

--------------------+-----------------------------------

Under 1 year |      1,713        1.28        1.28

1 |      1,932        1.44        2.73

2 |      1,950        1.46        4.18

3 |      1,939        1.45        5.63

4 |      1,965        1.47        7.10

5 |      1,998        1.49        8.60

6 |      2,059        1.54       10.14

7 |      2,176        1.63       11.77

8 |      2,163        1.62       13.38

9 |      2,243        1.68       15.06

10 |      2,202        1.65       16.71

11 |      2,083        1.56       18.27

12 |      2,035        1.52       19.79

13 |      2,047        1.53       21.32

14 |      1,979        1.48       22.80

15 |      2,046        1.53       24.33

16 |      1,965        1.47       25.80

17 |      1,998        1.49       27.29

18 |      1,847        1.38       28.67

19 |      1,826        1.37       30.04

20 |      1,722        1.29       31.33

21 |      1,687        1.26       32.59

22 |      1,638        1.23       33.81

23 |      1,622        1.21       35.03

24 |      1,662        1.24       36.27

25 |      1,666        1.25       37.52

26 |      1,640        1.23       38.74

27 |      1,726        1.29       40.03

28 |      1,801        1.35       41.38

29 |      1,995        1.49       42.87

30 |      1,907        1.43       44.30

31 |      1,991        1.49       45.79

32 |      1,890        1.41       47.20

33 |      1,898        1.42       48.62

34 |      2,024        1.51       50.13

35 |      2,134        1.60       51.73

36 |      2,123        1.59       53.32

37 |      2,099        1.57       54.89

38 |      2,064        1.54       56.43

39 |      2,228        1.67       58.10

40 |      2,190        1.64       59.74

41 |      2,115        1.58       61.32

42 |      2,137        1.60       62.92

43 |      2,091        1.56       64.48

44 |      2,114        1.58       66.06

45 |      2,118        1.58       67.64

46 |      1,939        1.45       69.10

47 |      1,957        1.46       70.56

48 |      1,827        1.37       71.93

49 |      1,767        1.32       73.25

50 |      1,865        1.39       74.64

51 |      1,802        1.35       75.99

52 |      1,825        1.36       77.35

53 |      1,695        1.27       78.62

54 |      1,301        0.97       79.59

55 |      1,323        0.99       80.58

56 |      1,324        0.99       81.57

57 |      1,304        0.98       82.55

58 |      1,128        0.84       83.39

59 |      1,129        0.84       84.24

60 |      1,154        0.86       85.10

61 |      1,051        0.79       85.89

62 |      1,073        0.80       86.69

63 |        938        0.70       87.39

64 |        952        0.71       88.10

65 |      1,014        0.76       88.86

66 |        869        0.65       89.51

67 |        926        0.69       90.20

68 |        908        0.68       90.88

69 |        904        0.68       91.56

70 |        913        0.68       92.24

71 |        885        0.66       92.90

72 |        770        0.58       93.48

73 |        797        0.60       94.08

74 |        814        0.61       94.68

75 |        796        0.60       95.28

76 |        704        0.53       95.81

77 |        646        0.48       96.29

78 |        687        0.51       96.80

79 |        602        0.45       97.25

80 |        514        0.38       97.64

81 |        476        0.36       97.99

82 |        425        0.32       98.31

83 |        427        0.32       98.63

84 |        325        0.24       98.87

85 |        306        0.23       99.10

86 |        248        0.19       99.29

87 |        209        0.16       99.44

88 |        172        0.13       99.57

89 |        155        0.12       99.69

90 (90+, 1988-2002) |        416        0.31      100.00

--------------------+-----------------------------------

Total |    133,710      100.00

. table sex if age>=25 & age<=34, contents (freq mean incwage mean yrsed)

-------------------------------------------------------

Sex |         Freq.  mean(incwage)    mean(yrsed)

----------+--------------------------------------------

Male |         9,027    29510.61781       13.31212

Female |         9,511    17728.94764       13.55657

-------------------------------------------------------

. display 13.55657-13.31212

.24445

* display works like a little calculator: it prints the result without altering the variables in memory at all.
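* A few more ways display can be used, as a sketch. The toy variable yrsed below holds made-up values, not the CPS data.

```stata
* Plain arithmetic and built-in functions
display 13.55657 - 13.31212
display sqrt(16)
display 2^10

* display can also use results stored by the previous command
clear
input yrsed
12
14
16
end
summarize yrsed
display r(mean)
display r(max) - r(min)
```

* None of these display lines creates or changes a variable; they only print to the results window.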

. ttest yrsed if age >=25 & age<=34, by(sex)

Two-sample t test with equal variances

------------------------------------------------------------------------------

Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

* Let’s see what it looks like to generate a new 0-1 dummy variable for gender, and put that into a simple regression.

. tabulate sex

Sex |      Freq.     Percent        Cum.

------------+-----------------------------------

Male |     64,791       48.46       48.46

Female |     68,919       51.54      100.00

------------+-----------------------------------

Total |    133,710      100.00

. tabulate sex, nolab

Sex |      Freq.     Percent        Cum.

------------+-----------------------------------

1 |     64,791       48.46       48.46

2 |     68,919       51.54      100.00

------------+-----------------------------------

Total |    133,710      100.00

. generate male=0

. replace male=1 if sex==1

. label define male_lbl 0 "female" 1 "male"

. label val male male_lbl

* When generating a new variable: first generate, then replace the values until they are what you want, then create value labels, then attach those value labels to the variable.

. tabulate sex male

|         male

Sex |    female       male |     Total

-----------+----------------------+----------

Male |         0     64,791 |    64,791

Female |    68,919          0 |    68,919

-----------+----------------------+----------

Total |    68,919     64,791 |   133,710

* Then cross tabulate the old and new variables, to make sure that the new variable does what you want, and that you haven’t miscoded or left any cases as missing.
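* The generate, label, and cross-tabulate steps can be sketched on a toy dataset (made-up values, with one case deliberately left missing). The one-line generate below is an alternative to generate-then-replace, not what was typed in class; it keeps male missing whenever sex is missing instead of coding it as 0.

```stata
* Toy data: sex coded 1 = male, 2 = female, with one missing case
clear
input sex
1
2
2
1
.
end

* (sex == 1) evaluates to 1 or 0; the if qualifier leaves male missing when sex is missing
generate byte male = (sex == 1) if !missing(sex)

* Create the value labels, then attach them to the new variable
label define male_lbl 0 "female" 1 "male"
label values male male_lbl

* Check the new variable against the old one, showing missing cases too
tabulate sex male, missing
```

* With generate male=0 followed by replace male=1 if sex==1, a case with missing sex would end up coded as female, which is exactly the kind of miscoding the cross-tabulation is meant to catch.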

. tabulate sex male, miss

|         male

Sex |    female       male |     Total

-----------+----------------------+----------

Male |         0     64,791 |    64,791

Female |    68,919          0 |    68,919

-----------+----------------------+----------

Total |    68,919     64,791 |   133,710

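* With a proper 0-1 dummy variable for gender, we can now plug it into the regression and run it. And guess what? It gives us exactly the same result as the t-test: the coefficient on male is the male-minus-female difference in means, with the same standard error and t-statistic.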

. regress yrsed male if age >=25 & age<=34

Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   32.68

Model |  276.742433     1  276.742433           Prob > F      =  0.0000

Residual |  156979.922 18536  8.46892111           R-squared     =  0.0018

Total |  157256.664 18537  8.48339343           Root MSE      =  2.9101

------------------------------------------------------------------------------

yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

male |  -.2444469   .0427623    -5.72   0.000    -.3282649   -.1606289

_cons |   13.55657   .0298401   454.31   0.000     13.49808    13.61506

------------------------------------------------------------------------------

. ttest yrsed if age >=25 & age<=34, by(sex)

Two-sample t test with equal variances

------------------------------------------------------------------------------

Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
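* The match between the regression on a 0-1 dummy and the equal-variance t-test can be checked on a toy dataset. The values below are made up for illustration; the scalar names t_ttest and t_reg are just labels chosen for this sketch.

```stata
* Toy data: sex coded 1 = male, 2 = female
clear
input sex yrsed
1 12
1 14
1 10
2 13
2 16
2 16
end
generate byte male = (sex == 1)

* Two-sample t-test with equal variances; diff = mean(sex 1) - mean(sex 2)
ttest yrsed, by(sex)
scalar t_ttest = r(t)

* Regression on the dummy; the coefficient on male is the same difference in means
regress yrsed male
scalar t_reg = _b[male] / _se[male]

* The two t-statistics should agree
display t_ttest
display t_reg
```

* The constant in the regression is the mean for the group coded 0 (females here), which is why _cons in the class output equals the female mean of 13.55657.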

* At this point, before clearing the data out of memory, if you wanted to keep the new “male” variable, you would have to save the dataset. I didn’t want to keep it, so I just cleared.
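* A sketch of what saving would look like, meant to be run from a do-file. It uses toy data and a temporary file so that nothing on disk is overwritten; in a real session you would give save a file name of your own choosing instead of the tempfile.

```stata
* Toy data with a newly created variable
clear
input sex
1
2
end
generate byte male = (sex == 1)

* Save to a temporary file (keepme is a hypothetical name for this sketch)
tempfile keepme
save `keepme'

* After clear, the new variable is gone from memory but still in the saved file
clear
use `keepme'
describe
```

* Saving under a new file name rather than over the original download keeps the raw data unchanged.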

. clear

* Then I copied the address of the folder where my downloaded data was stored and made it the default directory using Stata’s “cd” command.

. cd "C:\Documents and Settings\Michael Rosenfeld\My Documents\current class files\intro soc methods\trial cps"

C:\Documents and Settings\Michael Rosenfeld\My Documents\current class files\intro soc methods\trial cps

* Then I used the File>Do menu to run the do-file from that default directory. The do-file automatically found the .dat file in the same directory, and then Stata was off to the races.

. do "C:\Documents and Settings\Michael Rosenfeld\My Documents\current class files\intro soc methods\trial cps\cps_00008.do"

* Since the do-file is just a text file listing Stata commands, the do-file commands appear in the results window and in the log as they are performed.

. /* Important: you need to put the .dat and .do files in one folder/

>    directory and then set the working folder to that folder. */

.

. set more off

.

. clear

. infix ///

>  int     year                                 1-4 ///

>  byte    age                                  5-6 ///

>  byte    sex                                  7 ///

>  using cps_00008.dat

.

. label var year `"Survey year"'

. label var age `"Age"'

. label var sex `"Sex"'

.

. label define agelbl 00 `"Under 1 year"'

. label define agelbl 01 `"1"', add

. label define agelbl 02 `"2"', add

. label define agelbl 03 `"3"', add

. label define agelbl 04 `"4"', add

. label define agelbl 05 `"5"', add

. label define agelbl 06 `"6"', add

. label define agelbl 07 `"7"', add

. label define agelbl 08 `"8"', add

. label define agelbl 09 `"9"', add

. label define agelbl 10 `"10"', add

. label define agelbl 11 `"11"', add

. label define agelbl 12 `"12"', add

. label define agelbl 13 `"13"', add

. label define agelbl 14 `"14"', add

. label define agelbl 15 `"15"', add

. label define agelbl 16 `"16"', add

. label define agelbl 17 `"17"', add

. label define agelbl 18 `"18"', add

. label define agelbl 19 `"19"', add

. label define agelbl 20 `"20"', add

. label define agelbl 21 `"21"', add

. label define agelbl 22 `"22"', add

. label define agelbl 23 `"23"', add

. label define agelbl 24 `"24"', add

. label define agelbl 25 `"25"', add

. label define agelbl 26 `"26"', add

. label define agelbl 27 `"27"', add

. label define agelbl 28 `"28"', add

. label define agelbl 29 `"29"', add

. label define agelbl 30 `"30"', add

. label define agelbl 31 `"31"', add

. label define agelbl 32 `"32"', add

. label define agelbl 33 `"33"', add

. label define agelbl 34 `"34"', add

. label define agelbl 35 `"35"', add

. label define agelbl 36 `"36"', add

. label define agelbl 37 `"37"', add

. label define agelbl 38 `"38"', add

. label define agelbl 39 `"39"', add

. label define agelbl 40 `"40"', add

. label define agelbl 41 `"41"', add

. label define agelbl 42 `"42"', add

. label define agelbl 43 `"43"', add

. label define agelbl 44 `"44"', add

. label define agelbl 45 `"45"', add

. label define agelbl 46 `"46"', add

. label define agelbl 47 `"47"', add

. label define agelbl 48 `"48"', add

. label define agelbl 49 `"49"', add

. label define agelbl 50 `"50"', add

. label define agelbl 51 `"51"', add

. label define agelbl 52 `"52"', add

. label define agelbl 53 `"53"', add

. label define agelbl 54 `"54"', add

. label define agelbl 55 `"55"', add

. label define agelbl 56 `"56"', add

. label define agelbl 57 `"57"', add

. label define agelbl 58 `"58"', add

. label define agelbl 59 `"59"', add

. label define agelbl 60 `"60"', add

. label define agelbl 61 `"61"', add

. label define agelbl 62 `"62"', add

. label define agelbl 63 `"63"', add

. label define agelbl 64 `"64"', add

. label define agelbl 65 `"65"', add

. label define agelbl 66 `"66"', add

. label define agelbl 67 `"67"', add

. label define agelbl 68 `"68"', add

. label define agelbl 69 `"69"', add

. label define agelbl 70 `"70"', add

. label define agelbl 71 `"71"', add

. label define agelbl 72 `"72"', add

. label define agelbl 73 `"73"', add

. label define agelbl 74 `"74"', add

. label define agelbl 75 `"75"', add

. label define agelbl 76 `"76"', add

. label define agelbl 77 `"77"', add

. label define agelbl 78 `"78"', add

. label define agelbl 79 `"79"', add

. label define agelbl 80 `"80"', add

. label define agelbl 81 `"81"', add

. label define agelbl 82 `"82"', add

. label define agelbl 83 `"83"', add

. label define agelbl 84 `"84"', add

. label define agelbl 85 `"85"', add

. label define agelbl 86 `"86"', add

. label define agelbl 87 `"87"', add

. label define agelbl 88 `"88"', add

. label define agelbl 89 `"89"', add

. label define agelbl 90 `"90 (90+, 1988-2002)"', add

. label define agelbl 91 `"91"', add

. label define agelbl 92 `"92"', add

. label define agelbl 93 `"93"', add

. label define agelbl 94 `"94"', add

. label define agelbl 95 `"95"', add

. label define agelbl 96 `"96"', add

. label define agelbl 97 `"97"', add

. label define agelbl 98 `"98"', add

. label define agelbl 99 `"99+"', add

. label values age agelbl

.

. label define sexlbl 1 `"Male"'

. label define sexlbl 2 `"Female"', add

. label values sex sexlbl

.

.

end of do-file

. log close

name:  <unnamed>

log:  C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web p

> ages\soc_meth_proj3\fall_2011_381_logs\class2.log

log type:  text

closed on:  29 Sep 2011, 15:30:54

---------------------------------------------------------------------------------