--------------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\Soc180B_spr2019_logs\class6_l

> og.log

  log type:  text

 opened on:  18 Apr 2019, 14:40:54

 

. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear

 

 

 

. *class starts here

 

. ttest yrsed if age>24 & age<35, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

* Above is our classic t-test. Below is the regression version of the same test. The "i.sex" prefix tells Stata that sex is a categorical variable: rather than treating its numeric codes (1 and 2) as quantities, Stata creates a 0-1 dummy variable for sex, the way we do by hand below.

 

. regress yrsed i.sex if age>24 & age<35

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   32.68

       Model |  276.742433     1  276.742433           Prob > F      =  0.0000

    Residual |  156979.922 18536  8.46892111           R-squared     =  0.0018

-------------+------------------------------           Adj R-squared =  0.0017

       Total |  157256.664 18537  8.48339343           Root MSE      =  2.9101

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

         sex |

     Female  |   .2444469   .0427623     5.72   0.000     .1606289    .3282649

       _cons |   13.31212   .0306297   434.62   0.000     13.25208    13.37216

------------------------------------------------------------------------------
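
* A sketch, not run in class, of one way to confirm that the t-test and the regression agree. It uses the results each command leaves behind: r(mu_1), r(mu_2), and r(t) after ttest, and _b[] and _se[] after regress. The coefficient name 2.sex assumes Female is coded 2, as the codebook output for sex shows.

    quietly ttest yrsed if age>24 & age<35, by(sex)
    * Female mean minus Male mean
    display r(mu_2) - r(mu_1)
    * ttest reports Male minus Female, so flip the sign of t
    display -r(t)
    quietly regress yrsed i.sex if age>24 & age<35
    * the same difference, reported as the Female coefficient
    display _b[2.sex]
    * the same t statistic
    display _b[2.sex]/_se[2.sex]
    * the Male mean, because Male is the omitted category
    display _b[_cons]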

 

. codebook sex

 

--------------------------------------------------------------------------------------------------

sex                                                                                            Sex

--------------------------------------------------------------------------------------------------

 

                  type:  numeric (byte)

                 label:  sexlbl

 

                 range:  [1,2]                        units:  1

         unique values:  2                        missing .:  0/133710

 

            tabulation:  Freq.   Numeric  Label

                         64791         1  Male

                         68919         2  Female

 

* Creating the dummy variable by hand: male is 1 for men and 0 for women.

 

. gen byte male=0 if sex==2

(64791 missing values generated)

 

. replace male=1 if sex==1

(64791 real changes made)

 

. tabulate sex male

 

           |         male

       Sex |         0          1 |     Total

-----------+----------------------+----------

      Male |         0     64,791 |    64,791

    Female |    68,919          0 |    68,919

-----------+----------------------+----------

     Total |    68,919     64,791 |   133,710
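
* A sketch, not run in class: the same dummy can be built in one line, because a logical expression in Stata evaluates to 1 when true and 0 when false. The name male2 is made up here so that it does not clash with male. The if qualifier keeps any missing values of sex missing instead of coding them 0; sex has no missing values in this file, so here it changes nothing.

    gen byte male2 = (sex==1) if !missing(sex)
    * stops with an error if the two versions ever disagree
    assert male2 == male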

 

 

. regress yrsed male if age>24 & age<35

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   32.68

       Model |  276.742433     1  276.742433           Prob > F      =  0.0000

    Residual |  156979.922 18536  8.46892111           R-squared     =  0.0018

-------------+------------------------------           Adj R-squared =  0.0017

       Total |  157256.664 18537  8.48339343           Root MSE      =  2.9101

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        male |  -.2444469   .0427623    -5.72   0.000    -.3282649   -.1606289

       _cons |   13.55657   .0298401   454.31   0.000     13.49808    13.61506

------------------------------------------------------------------------------

 

 

------------------------------------------------------------------------------

 

* We did not quite get to it in class, but it is easy to add weights to a regression. Using aweights gives a reasonable answer, because the number of observations stays at the actual sample size of 18538.

 

. regress yrsed i.sex if age>24 & age<35 [aweight= perwt_rounded]

(sum of wgt is   3.7786e+07)

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   25.52

       Model |  195.741395     1  195.741395           Prob > F      =  0.0000

    Residual |  142186.809 18536  7.67084641           R-squared     =  0.0014

-------------+------------------------------           Adj R-squared =  0.0013

       Total |  142382.551 18537   7.6809921           Root MSE      =  2.7696

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

         sex |

     Female  |   .2055446   .0406899     5.05   0.000     .1257887    .2853005

       _cons |    13.5574   .0290221   467.14   0.000     13.50051    13.61429

------------------------------------------------------------------------------
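
* A sketch, not run in class: with a single dummy regressor, a weighted regression simply reproduces the weighted group means, so the coefficients can be checked against summarize with the same weights. The first mean should match _cons (Male) and the second should match _cons plus the Female coefficient.

    quietly summarize yrsed if sex==1 & age>24 & age<35 [aweight=perwt_rounded]
    display r(mean)
    quietly summarize yrsed if sex==2 & age>24 & age<35 [aweight=perwt_rounded]
    display r(mean)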

 

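* Using fweights gives a very unreasonable answer (in the SEs and t statistics), because Stata treats each weight as a count of identical observations, which inflates the sample size by a factor of about 2,000 (from 18538 to 37785945). The coefficients themselves are the same as with aweights.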

 

. regress yrsed i.sex if age>24 & age<35 [fweight= perwt_rounded]

 

      Source |       SS       df       MS              Number of obs =37785945

-------------+------------------------------           F(  1,37785943) =52018.00

       Model |  398979.047     1  398979.047           Prob > F      =  0.0000

    Residual |  289818910 37785943  7.67001924           R-squared     =  0.0014

-------------+------------------------------           Adj R-squared =  0.0014

       Total |  290217889 37785944  7.68057796           Root MSE      =  2.7695

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

         sex |

     Female  |   .2055446   .0009012   228.07   0.000     .2037782    .2073109

       _cons |    13.5574   .0006428  2.1e+04   0.000     13.55614    13.55866

------------------------------------------------------------------------------
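
* A sketch, not run in class, of the arithmetic behind the fweight problem, using numbers copied from the two weighted tables above. The fweight run counts 37785945 observations instead of 18538. Standard errors shrink roughly with the square root of the sample size, so they should fall by about the square root of that ratio.

    * inflation of the sample size: about 2,038
    display 37785945/18538
    * expected shrinkage of the standard errors: about 45
    display sqrt(37785945/18538)
    * actual ratio of the Female standard errors, aweight over fweight: also about 45
    display .0406899/.0009012

* The Female coefficient is .2055446 in both tables. Only the standard errors, t statistics, and confidence intervals are affected.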

 

 

. log close

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\Soc180B_spr2019_logs\class6_l

> og.log

  log type:  text

 closed on:  18 Apr 2019, 16:24:35

--------------------------------------------------------------------------------------------------