-----------------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class5.log

  log type:  text

 opened on:   8 Oct 2018, 09:45:16

 

. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear

 

 

. *class starts here

 

. ttest yrsed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

. ttest yrsed if age>=25 & age<=34, by(sex) unequal

 

Two-sample t test with unequal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0428057                 -.32835   -.1605438

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7106

Ho: diff = 0                     Satterthwaite's degrees of freedom =  18383.6

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

·        Note the subtle difference in the standard error of the difference between the equal-variance and unequal-variance t-tests (.0427623 versus .0428057). The difference is subtle because the variance of men’s education and the variance of women’s education are almost the same, so it hardly matters whether we assume equal variances or not. The difference between the means is of course identical in both cases.
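
·        As a rough check, both standard errors can be rebuilt by hand from the group sizes and standard deviations printed in the ttest output. The scalar names below (n1, s1, n2, s2, sp2, se_pooled, se_unequal) are made up for this sketch, and it assumes no variable in memory has one of those names.

```stata
* Summary statistics copied from the ttest output above
scalar n1 = 9027
scalar s1 = 2.967666
scalar n2 = 9511
scalar s2 = 2.854472

* Equal-variance SE: pooled variance times (1/n1 + 1/n2)
scalar sp2 = ((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2)
scalar se_pooled = sqrt(sp2*(1/n1 + 1/n2))

* Unequal-variance SE: each group's variance divided by its own n
scalar se_unequal = sqrt(s1^2/n1 + s2^2/n2)

display "pooled SE    = " se_pooled
display "unequal SE   = " se_unequal
display "ratio of SDs = " s1/s2
```

·        The two formulas give nearly the same answer here because the two SDs are close and the two groups are of similar size.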

 

. display -.2444469/.0428057

-5.7106156

 

. display ttail(18536,-5.7164)

.99999999

 

. display (1-ttail(18356,-5.7164))

5.525e-09

 

. display 2*(1-ttail(18356,-5.7164))

1.105e-08
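
·        The display lines above rebuild the p-values from the t-statistic. ttail(df, t) returns the upper tail, Pr(T > t), which is why the first call gives a number near 1 for a large negative t. Note that the last two display lines use 18356 degrees of freedom, which looks like a transposition of the 18536 reported by ttest; the sketch below uses 18536. The scalar names are made up for this sketch.

```stata
* t statistic and degrees of freedom as reported by ttest above
scalar tstat = -5.7164
scalar dof   = 18536

* ttail(df, t) is the upper tail, P(T > t)
scalar p_upper = ttail(dof, tstat)
scalar p_lower = 1 - p_upper

* Two-sided p-value: twice the tail beyond |t|
scalar p_two = 2*ttail(dof, abs(tstat))

display "Pr(T > t)     = " p_upper
display "Pr(T < t)     = " p_lower
display "Pr(|T| > |t|) = " p_two
```

·        Calling ttail on abs(t) directly avoids subtracting a number very close to 1 from 1, which can lose precision when the tail is this small.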

 

. codebook sex

 

-----------------------------------------------------------------------------------------------------

sex                                                                                               Sex

-----------------------------------------------------------------------------------------------------

 

                  type:  numeric (byte)

                 label:  sexlbl

 

                 range:  [1,2]                        units:  1

         unique values:  2                        missing .:  0/133710

 

            tabulation:  Freq.   Numeric  Label

                         64791         1  Male

                         68919         2  Female

 

. gen byte female=0

 

. replace female=1 if sex==2

(68919 real changes made)

 

. tabulate sex female

 

           |        female

       Sex |         0          1 |     Total

-----------+----------------------+----------

      Male |    64,791          0 |    64,791

    Female |         0     68,919 |    68,919

-----------+----------------------+----------

     Total |    64,791     68,919 |   133,710

 

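·        To use sex in a regression, we construct a 0-1 dummy variable (female) to take the place of the 1-2 coding of sex. With 0-1 coding, the constant is the mean for men and the coefficient on female is the female-minus-male difference in means.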
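
·        A one-line alternative for building the dummy, sketched on a tiny made-up dataset (not the CPS file). Warning: clear drops whatever data are in memory, so run this in a separate session. The missing value in the toy data is there only to show how the if qualifier behaves; the CPS sex variable has no missing values.

```stata
* Toy data (hypothetical), using the same 1 = Male, 2 = Female coding
clear
input byte sex
1
2
2
1
.
end

* A logical expression evaluates to 1 or 0;
* the if qualifier leaves female missing wherever sex is missing
generate byte female = (sex == 2) if !missing(sex)
tabulate sex female, missing
```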

 

. regress yrsed female if age>=25 & age<=34

 

      Source |       SS       df       MS              Number of obs =   18538

-------------+------------------------------           F(  1, 18536) =   32.68

       Model |  276.742433     1  276.742433           Prob > F      =  0.0000

    Residual |  156979.922 18536  8.46892111           R-squared     =  0.0018

-------------+------------------------------           Adj R-squared =  0.0017

       Total |  157256.664 18537  8.48339343           Root MSE      =  2.9101

 

------------------------------------------------------------------------------

       yrsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

      female |   .2444469   .0427623     5.72   0.000     .1606289    .3282649

       _cons |   13.31212   .0306297   434.62   0.000     13.25208    13.37216

 

·        Note that the constant term has a t-test associated with it, but not a very interesting one. The null hypothesis for the constant term is that men’s average educational attainment is zero, which cannot be true.

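·        If we want the SE, t-statistic, or other results to more precision than the regression table shows, Stata stores them (for example, the variance-covariance matrix of the coefficients in e(V)) and we can call them up.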

 

. matrix var_covar_regress=e(V)

 

. matrix list var_covar_regress

 

symmetric var_covar_regress[2,2]

            female       _cons

female   .00182861

 _cons  -.00093818   .00093818

 

. display var_covar_regress[1,1]^0.5

.04276226

 

. display 0.2444469/0.04276226

5.7164168
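
·        The coefficient and its standard error can also be called up directly with _b[] and _se[], without going through the matrix. A minimal sketch using the auto dataset that ships with Stata (not the CPS file; sysuse with clear replaces the data in memory):

```stata
* Example with the auto dataset that ships with Stata
sysuse auto, clear
quietly regress price foreign

* _b[] and _se[] hold the coefficient and its standard error at full precision
display %12.0g _b[foreign]
display %12.0g _se[foreign]
display %12.0g _b[foreign]/_se[foreign]

* The same standard error, taken from the stored variance-covariance matrix
matrix V = e(V)
display %12.0g sqrt(V[1,1])
```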

 

·        Fun with box plots! Look up graph box in the Stata documentation to see what the parts of the boxes mean. You can ignore the outliers.

 

. graph box age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)

 

. graph hbox age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
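
·        For reference, the box runs from the 25th to the 75th percentile with a line at the median, and the whiskers extend to the most extreme observations that lie within 1.5 times the interquartile range of the box (the adjacent values). The sketch below computes those pieces by hand, using the auto dataset that ships with Stata rather than the CPS file; the scalar names are made up for the example.

```stata
sysuse auto, clear

* Quartiles and interquartile range of mpg
quietly summarize mpg, detail
scalar q1  = r(p25)
scalar q3  = r(p75)
scalar iqr = q3 - q1

* Fences: 1.5 IQRs beyond the edges of the box
scalar upper_fence = q3 + 1.5*iqr
scalar lower_fence = q1 - 1.5*iqr

* Whisker ends are the most extreme observations inside the fences
quietly summarize mpg if mpg >= lower_fence & mpg <= upper_fence
display "box:      " q1 " to " q3
display "whiskers: " r(min) " to " r(max)
```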

 

·        To get the percentiles of each distribution, you can use summarize with the detail option, or table.

 

. summarize age if occ1990==178, detail

 

                             Age

-------------------------------------------------------------

      Percentiles      Smallest

 1%           24             24

 5%           27             24

10%           29             24       Obs                 441

25%           35             24       Sum of Wgt.         441

 

50%           43                      Mean           44.38549

                        Largest       Std. Dev.      12.48585

75%           52             84

90%           61             86       Variance       155.8965

95%           66             87       Skewness       .7190904

99%           83             90       Kurtosis       3.549932

 

. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, contents(freq p25 age p50 age p75 age)

 

----------------------------------------------------------------------

Occupation, 1990      |

basis                 |      Freq.    p25(age)    med(age)    p75(age)

----------------------+-----------------------------------------------

    Registered nurses |        966          36          43          51

Sociology instructors |          6          50          53          54

              Lawyers |        441          35          43          52

----------------------------------------------------------------------
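
·        tabstat is another way to get the same kind of table of percentiles by group. A minimal sketch on the auto dataset that ships with Stata (not the CPS file). As far as I know, the table command was redesigned in newer Stata releases, so the contents() option used above may need version control there; tabstat sidesteps that.

```stata
sysuse auto, clear

* One row per group: count, 25th percentile, median, 75th percentile
tabstat mpg, by(foreign) statistics(n p25 p50 p75)
```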

 

. ttest yrsed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

. gen months_ed=yrsed*12

(30484 missing values generated)

 

·        Rescaling the variable changes the means, SDs and SEs, but not the t-stat.

 

. ttest months_ed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    159.7454    .3748215    35.61199    159.0107    160.4802

  Female |    9511    162.6788    .3512319    34.25366    161.9903    163.3673

---------+--------------------------------------------------------------------

combined |   18538    161.2504    .2567052    34.95152    160.7472    161.7536

---------+--------------------------------------------------------------------

    diff |           -2.933363    .5131471               -3.939178   -1.927547

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
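
·        Why the t-stat does not move: multiplying the variable by 12 multiplies both the difference in means and its standard error by 12, so t = (12*diff)/(12*SE) = diff/SE. The sketch below checks this with ttesti, the immediate form of ttest, fed the group sizes, means and SDs printed above instead of the data. The scalar names are made up for the example.

```stata
* Years: n, mean, SD for men, then for women, copied from the ttest output
quietly ttesti 9027 13.31212 2.967666 9511 13.55657 2.854472
scalar t_years = r(t)

* Months: multiply every mean and SD by 12; the n's do not change
quietly ttesti 9027 `=12*13.31212' `=12*2.967666' 9511 `=12*13.55657' `=12*2.854472'
scalar t_months = r(t)

display "t (years)  = " t_years
display "t (months) = " t_months
```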

 

. log close

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class5.log

  log type:  text

 closed on:   8 Oct 2018, 12:23:18

-----------------------------------------------------------------------------------------------------