--------------------------------------------------------

 

. *Class starts here

 

. ttest yrsed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |   9,027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |   9,511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |  18,538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

* The above is the t-test we ended last class with. The difference in mean educational attainment between men and women is modest, 0.244 years. How likely would it be to get a difference this large or larger, in either direction, in a sample of this size, under the null hypothesis that men and women in the US in this age group have the same average educational attainment? The task is to attach a probability to the t-statistic of -5.7.
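* A minimal sketch of where the t-statistic comes from, using only the numbers printed in the table above. The scalar names (n1, s1, sp, se, tstat) are my own for illustration. The result should agree with the ttest output up to rounding of the printed means.

```stata
* Group sizes and standard deviations copied from the ttest table
scalar n1 = 9027
scalar n2 = 9511
scalar s1 = 2.967666
scalar s2 = 2.854472

* Pooled standard deviation (the equal-variances assumption)
scalar sp = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2)/(n1+n2-2))

* Standard error of the difference in means
scalar se = sp*sqrt(1/n1 + 1/n2)

* t-statistic: difference in means divided by its standard error
scalar tstat = (13.31212 - 13.55657)/se

display "SE of diff = " se
display "t = " tstat
display "df = " n1+n2-2
```

* The degrees of freedom are n1 + n2 - 2, which is the 18536 shown in the output.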

 

. display 1- ttail(18536, -5.7164)

5.524e-09

 

* If you look up ttail in the Stata online help, it will tell you that ttail(df, t) gives the upper-tail probability, that is, the cumulative probability from t to infinity. We take 1 minus that because we want the other side, the probability from minus infinity to t.

 

. display  ttail(18536, 5.7164)

5.524e-09

 

* Which is equivalent, by the symmetry of the t distribution, to the cumulative probability from +5.7164 to infinity.
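* A short sketch of the symmetry just described. All three lines should display the same lower-tail probability. The third uses Stata's t() function, the cumulative t distribution, which is not used elsewhere in this log.

```stata
* Lower tail, computed as 1 minus the upper tail at -5.7164
display 1 - ttail(18536, -5.7164)

* Same number by symmetry: upper tail at +5.7164
display ttail(18536, 5.7164)

* Same number directly: cumulative probability from minus infinity to -5.7164
display t(18536, -5.7164)
```

* When the probability is this small, the second and third forms are preferable, because subtracting a number very close to 1 from 1 loses some numerical precision.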

 

. display  2*ttail(18536, 5.7164)

1.105e-08

 

* Which we double to get the two-tailed test, because we are interested in the probability of extreme differences in either direction (men’s education higher than women’s by the same amount, or women’s higher than men’s). Two-tailed tests are standard. A probability of about 1 in 100 million is tiny, and it leads us to reject the null hypothesis.
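* A sketch of the same two-tailed calculation without the data in memory, using ttesti, the immediate form of ttest, which takes the group sizes, means, and standard deviations as typed numbers. The inputs are copied from the table above. The stored results r(t), r(df_t), and r(p) are what ttest and ttesti leave behind.

```stata
* Arguments: obs1 mean1 sd1 obs2 mean2 sd2 (Male first, then Female)
ttesti 9027 13.31212 2.967666 9511 13.55657 2.854472

* The t-statistic and degrees of freedom that ttesti stored
display r(t)
display r(df_t)

* Two-tailed p-value by hand, and the one Stata stored
display 2*ttail(r(df_t), abs(r(t)))
display r(p)
```

* Using abs() on the t-statistic means the doubling works whichever group has the higher mean.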

 

* On to another issue: welfare income, and its average across all people versus across welfare recipients only.

 

 

. summarize incwelfr

 

    Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

    incwelfr |    103,226    40.62242    478.8231          0      25000

 

* Note that the obs (103,226) is less than the full sample of 133,710, because not everyone is in the universe of people who were asked the question. Look the variable up in IPUMS to figure out who is.

 

 

. summarize incwelfr if age>=15 & incwelfr>0 & incwelfr~=.

 

    Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

    incwelfr |      1,289    3253.134    2813.505          1      25000

 

. summarize incwelfr if age>=15 & incwelfr>0 & incwelfr~=. [fweight= perwt_rounded]

 

    Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

    incwelfr |  2,551,246    3072.095    2803.442          1      25000

 

. summarize incwelfr if incwelfr>0 & incwelfr~=. [fweight= perwt_rounded]

 

    Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

    incwelfr |  2,551,246    3072.095    2803.442          1      25000

 

* Some syntax for creating a new variable

 

. gen byte receives_welfare=0

 

. replace receives_welfare=1 if incwelfr>0 & incwelfr~=.

(1,289 real changes made)

 

. label define receives_welfare_lbl 0 "no welfare" 1 "yes welfare"

* creation of a value label.

 

 

. label val receives_welfare receives_welfare_lbl

*Associating the value label with the variable.

 

. tabulate receives_welfare

 

receives_we |

      lfare |      Freq.     Percent        Cum.

------------+-----------------------------------

 no welfare |    132,421       99.04       99.04

yes welfare |      1,289        0.96      100.00

------------+-----------------------------------

      Total |    133,710      100.00

 

. label var receives_welfare "does respondent receive welfare"

* labeling the variable itself.

 

. tabulate receives_welfare

 

       does |

 respondent |

    receive |

    welfare |      Freq.     Percent        Cum.

------------+-----------------------------------

 no welfare |    132,421       99.04       99.04

yes welfare |      1,289        0.96      100.00

------------+-----------------------------------

      Total |    133,710      100.00
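* A compact sketch of the same three steps (create the variable, attach a value label, label the variable), run on a tiny made-up dataset so it can be tried without the CPS file. The four incwelfr values are invented for illustration. The one-line generate is an alternative to the gen/replace pair above, not the method used in class.

```stata
clear
* Toy data: invented welfare incomes, including one missing value
input incwelfr
0
1200
.
3500
end

* 1 if welfare income is positive and not missing, 0 otherwise
* (missing counts as larger than any number in Stata, so the < . check matters)
generate byte receives_welfare = (incwelfr > 0 & incwelfr < .)

* Value label, then variable label
label define receives_welfare_lbl 0 "no welfare" 1 "yes welfare"
label values receives_welfare receives_welfare_lbl
label variable receives_welfare "does respondent receive welfare"

tabulate receives_welfare
```

* As in the class version, a person with missing incwelfr ends up coded 0, "no welfare".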

 

. sort receives_welfare

 

. by receives_welfare: summarize yrsed if age>=22

 

------------------------------------------------------------------------------------------

-> receives_welfare = no welfare

 

    Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

       yrsed |     89,071    13.08496    3.148834          0         17

 

------------------------------------------------------------------------------------------

-> receives_welfare = yes welfare

 

    Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

       yrsed |      1,064    11.07989    2.947759          0         17

 

 

. table receives_welfare sex [fweight= perwt_rounded] if age>=22, contents(freq mean age mean yrsed mean incwage) row col

 

------------------------------------------------------------------------

does        |

respondent  |

receive     |                            Sex                           

welfare     |               Male              Female               Total

------------+-----------------------------------------------------------

 no welfare |           8.87e+07            9.53e+07            1.84e+08

            | 46.236953735351563  48.036090850830078  47.168895721435547

            |           13.26548            13.16087            13.21129

            |        30115.31029         15561.13725         22576.34082

            |

yes welfare |            247,464             1788806             2036270

            |  42.56903076171875  35.875446319580078  36.688907623291016

            |           11.04794            11.20504            11.18595

            |        5285.664169         3739.413383         3927.326285

            |

      Total |           8.89e+07            9.71e+07            1.86e+08

            |  46.22674560546875  47.812000274658203  47.054153442382813

            |           13.25931            13.12483            13.18912

            |        30046.20359         15343.29465         22372.16191

------------------------------------------------------------------------

 

. * version 16: command

 

. * or

 

.

. version 16

 

. * For those of you with version 17 at home, which is most of you, some commands I use may not work the same way. Just put the version command before the problem command (or use it as a prefix, as in version 16: command) and the problem command should work as it does for me. Stata has good version control. I use Stata 16 (because I am a little out of date…)
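* A minimal sketch of the prefix form, on a made-up four-row dataset so it can be run anywhere. The data values are invented. The table command with contents() is the older syntax used in this log, and the prefix asks a newer Stata to interpret just that one command the version 16 way.

```stata
clear
* Toy data: invented values for illustration only
input byte receives_welfare byte sex age
0 1 40
0 2 50
1 1 30
1 2 36
end

* Run this one command under version 16 rules, without changing the session
version 16: table receives_welfare sex, contents(freq mean age) row col
```

* The prefix affects only the command on that line. Typing version 16 on its own line, as in the log, applies to everything that follows.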

 

. format age %6.3g

* I just want to repeat the table above with fewer decimal places in age.

 

. table receives_welfare sex [fweight= perwt_rounded] if age>=22, contents(freq mean age mean yrsed mean incwage) row col

 

---------------------------------------------------

does        |

respondent  |

receive     |                  Sex                

welfare     |        Male       Female        Total

------------+--------------------------------------

 no welfare |    8.87e+07     9.53e+07     1.84e+08

            |        46.2           48         47.2

            |    13.26548     13.16087     13.21129

            | 30115.31029  15561.13725  22576.34082

            |

yes welfare |     247,464      1788806      2036270

            |        42.6         35.9         36.7

            |    11.04794     11.20504     11.18595

            | 5285.664169  3739.413383  3927.326285

            |

      Total |    8.89e+07     9.71e+07     1.86e+08

            |        46.2         47.8         47.1

            |    13.25931     13.12483     13.18912

            | 30046.20359  15343.29465  22372.16191

---------------------------------------------------

 

. display 247464/2036270

.12152809

*the display command is an interactive calculator. Use it!
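* A few more calculator-style uses of display, as a sketch. The numbers come from the table above. The %4.1f display format and the text in quotes are my additions for illustration.

```stata
* Men's share of adult welfare recipients, as a proportion
display 247464/2036270

* The same share as a percentage with one decimal place
display %4.1f 100*247464/2036270

* Text and numbers can be mixed on one line
display "Female share: " %4.1f 100*1788806/2036270 " percent"
```

* So men are about 12 percent of weighted welfare recipients age 22 and over in this table.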

 

. clear all
* I cleared the 2020 March CPS data without saving because I don’t want the extra variable, receives_welfare, that I created. If I had wanted to keep it, I should have saved the data before clearing.

 

. cd "C:\Users\mexmi\Documents\current class files\intro soc methods\1995 HW1 data\2021 .dat version of the 1995 ASEC"

* In class we went over in some detail how to download the data, unzip it, download the .do file, put them both in the same directory, set that directory as the default (working) directory, which is what the above command does for me, and then run the .do file.

 

C:\Users\mexmi\Documents\current class files\intro soc methods\1995 HW1 data\2021 .dat version of the 1995 ASEC

 

. do "C:\Users\mexmi\Documents\current class files\intro soc methods\1995 HW1 data\2021 .d

> at version of the 1995 ASEC\cps_00019.do"

 

. * NOTE: You need to set the Stata working directory to the path

. * where the data file is located.

.

. set more off

 

.

. clear

 

. quietly infix              ///

>   int     year      1-4    ///

>   long    serial    5-9    ///

>   byte    month     10-11  ///

>   double  cpsid     12-25  ///

>   byte    asecflag  26-26  ///

>   double  asecwth   27-36  ///

>   byte    pernum    37-38  ///

>   double  cpsidp    39-52  ///

>   double  asecwt    53-62  ///

>   byte    age       63-64  ///

>   byte    sex       65-65  ///

>   double  inctot    66-74  ///

>   using `"cps_00019.dat"'

 

.

. replace asecwth  = asecwth  / 10000

(149,642 real changes made)

 

. replace asecwt   = asecwt   / 10000

(149,642 real changes made)

 

.

. format cpsid    %14.0f

 

. format asecwth  %10.4f

 

. format cpsidp   %14.0f

 

. format asecwt   %10.4f

 

. format inctot   %9.0f

 

.

. label var year     `"Survey year"'

 

. label var serial   `"Household serial number"'

 

. label var month    `"Month"'

 

. label var cpsid    `"CPSID, household record"'

 

. label var asecflag `"Flag for ASEC"'

 

. label var asecwth  `"Annual Social and Economic Supplement Household weight"'

 

. label var pernum   `"Person number in sample unit"'

 

. label var cpsidp   `"CPSID, person record"'

 

. label var asecwt   `"Annual Social and Economic Supplement Weight"'

 

. label var age      `"Age"'

 

. label var sex      `"Sex"'

 

. label var inctot   `"Total personal income"'

 

.

. label define month_lbl 01 `"January"'

 

. label define month_lbl 02 `"February"', add

 

. label define month_lbl 03 `"March"', add

 

. label define month_lbl 04 `"April"', add

 

. label define month_lbl 05 `"May"', add

 

. label define month_lbl 06 `"June"', add

 

. label define month_lbl 07 `"July"', add

 

. label define month_lbl 08 `"August"', add

 

. label define month_lbl 09 `"September"', add

 

. label define month_lbl 10 `"October"', add

 

. label define month_lbl 11 `"November"', add

 

. label define month_lbl 12 `"December"', add

 

. label values month month_lbl

 

.

. label define asecflag_lbl 1 `"ASEC"'

 

. label define asecflag_lbl 2 `"March Basic"', add

 

. label values asecflag asecflag_lbl

 

.

. label define age_lbl 00 `"Under 1 year"'

 

. label define age_lbl 01 `"1"', add

 

. label define age_lbl 02 `"2"', add

 

. label define age_lbl 03 `"3"', add

 

. label define age_lbl 04 `"4"', add

 

. label define age_lbl 05 `"5"', add

 

. label define age_lbl 06 `"6"', add

 

. label define age_lbl 07 `"7"', add

 

. label define age_lbl 08 `"8"', add

 

. label define age_lbl 09 `"9"', add

 

. label define age_lbl 10 `"10"', add

 

. label define age_lbl 11 `"11"', add

 

. label define age_lbl 12 `"12"', add

 

. label define age_lbl 13 `"13"', add

 

. label define age_lbl 14 `"14"', add

 

. label define age_lbl 15 `"15"', add

 

. label define age_lbl 16 `"16"', add

 

. label define age_lbl 17 `"17"', add

 

. label define age_lbl 18 `"18"', add

 

. label define age_lbl 19 `"19"', add

 

. label define age_lbl 20 `"20"', add

 

. label define age_lbl 21 `"21"', add

 

. label define age_lbl 22 `"22"', add

 

. label define age_lbl 23 `"23"', add

 

. label define age_lbl 24 `"24"', add

 

. label define age_lbl 25 `"25"', add

 

. label define age_lbl 26 `"26"', add

 

. label define age_lbl 27 `"27"', add

 

. label define age_lbl 28 `"28"', add

 

. label define age_lbl 29 `"29"', add

 

. label define age_lbl 30 `"30"', add

 

. label define age_lbl 31 `"31"', add

 

. label define age_lbl 32 `"32"', add

 

. label define age_lbl 33 `"33"', add

 

. label define age_lbl 34 `"34"', add

 

. label define age_lbl 35 `"35"', add

 

. label define age_lbl 36 `"36"', add

 

. label define age_lbl 37 `"37"', add

 

. label define age_lbl 38 `"38"', add

 

. label define age_lbl 39 `"39"', add

 

. label define age_lbl 40 `"40"', add

 

. label define age_lbl 41 `"41"', add

 

. label define age_lbl 42 `"42"', add

 

. label define age_lbl 43 `"43"', add

 

. label define age_lbl 44 `"44"', add

 

. label define age_lbl 45 `"45"', add

 

. label define age_lbl 46 `"46"', add

 

. label define age_lbl 47 `"47"', add

 

. label define age_lbl 48 `"48"', add

 

. label define age_lbl 49 `"49"', add

 

. label define age_lbl 50 `"50"', add

 

. label define age_lbl 51 `"51"', add

 

. label define age_lbl 52 `"52"', add

 

. label define age_lbl 53 `"53"', add

 

. label define age_lbl 54 `"54"', add

 

. label define age_lbl 55 `"55"', add

 

. label define age_lbl 56 `"56"', add

 

. label define age_lbl 57 `"57"', add

 

. label define age_lbl 58 `"58"', add

 

. label define age_lbl 59 `"59"', add

 

. label define age_lbl 60 `"60"', add

 

. label define age_lbl 61 `"61"', add

 

. label define age_lbl 62 `"62"', add

 

. label define age_lbl 63 `"63"', add

 

. label define age_lbl 64 `"64"', add

 

. label define age_lbl 65 `"65"', add

 

. label define age_lbl 66 `"66"', add

 

. label define age_lbl 67 `"67"', add

 

. label define age_lbl 68 `"68"', add

 

. label define age_lbl 69 `"69"', add

 

. label define age_lbl 70 `"70"', add

 

. label define age_lbl 71 `"71"', add

 

. label define age_lbl 72 `"72"', add

 

. label define age_lbl 73 `"73"', add

 

. label define age_lbl 74 `"74"', add

 

. label define age_lbl 75 `"75"', add

 

. label define age_lbl 76 `"76"', add

 

. label define age_lbl 77 `"77"', add

 

. label define age_lbl 78 `"78"', add

 

. label define age_lbl 79 `"79"', add

 

. label define age_lbl 80 `"80"', add

 

. label define age_lbl 81 `"81"', add

 

. label define age_lbl 82 `"82"', add

 

. label define age_lbl 83 `"83"', add

 

. label define age_lbl 84 `"84"', add

 

. label define age_lbl 85 `"85"', add

 

. label define age_lbl 86 `"86"', add

 

. label define age_lbl 87 `"87"', add

 

. label define age_lbl 88 `"88"', add

 

. label define age_lbl 89 `"89"', add

 

. label define age_lbl 90 `"90 (90+, 1988-2002)"', add

 

. label define age_lbl 91 `"91"', add

 

. label define age_lbl 92 `"92"', add

 

. label define age_lbl 93 `"93"', add

 

. label define age_lbl 94 `"94"', add

 

. label define age_lbl 95 `"95"', add

 

. label define age_lbl 96 `"96"', add

 

. label define age_lbl 97 `"97"', add

 

. label define age_lbl 98 `"98"', add

 

. label define age_lbl 99 `"99+"', add

 

. label values age age_lbl

 

.

. label define sex_lbl 1 `"Male"'

 

. label define sex_lbl 2 `"Female"', add

 

. label define sex_lbl 9 `"NIU"', add

 

. label values sex sex_lbl

 

.

.

.

end of do-file

 

. * Be sure to save once you have ingested the file.

 

. save "C:\Users\mexmi\Documents\current class files\intro soc methods\1995 HW1 data\2021 .dat version of the 1995 ASEC\1995 ASEC.dta"

file C:\Users\mexmi\Documents\current class files\intro soc methods\1995 HW1 data\2021 .da

> t version of the 1995 ASEC\1995 ASEC.dta saved
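* A small sketch of the save-then-reload cycle, on made-up data so it does not touch the real file. The file name demo_asec.dta is hypothetical. The replace option lets save overwrite an existing file of the same name, which plain save refuses to do.

```stata
clear
* Toy data: three invented observations
set obs 3
generate byte age = 20 + _n

* Save to the current working directory, overwriting any earlier copy
save "demo_asec.dta", replace

* Later, or in a new session: load the saved .dta file directly
clear
use "demo_asec.dta"
describe

* Clean up the demonstration file
erase "demo_asec.dta"
```

* Once the .dta file exists, there is no need to rerun the .do file. The use command loads the data with its labels and formats intact.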

 

. log close

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2021_logs\class2

> .log

  log type:  text

 closed on:  22 Sep 2021, 13:38:30

------------------------------------------------------------------------------------------