-----------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2016_logs\class4.log

  log type:  text

 opened on:   5 Oct 2016, 10:03:59

 

. use "C:\Users\Michael\Desktop\cps_mar_2000_new_unchanged.dta", clear

 

* How to make a box plot: see the commands below, and consult the Stata documentation (help graph box) if necessary.

 

. graph box age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)

 

. graph hbox age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)

 

* Several ways to get percentiles of distributions:

 

. summarize age if occ1990==178, detail

 

                             Age
-------------------------------------------------------------
      Percentiles      Smallest
 1%           24             24
 5%           27             24
10%           29             24       Obs                 441
25%           35             24       Sum of Wgt.         441

50%           43                      Mean           44.38549
                        Largest       Std. Dev.      12.48585
75%           52             84
90%           61             86       Variance       155.8965
95%           66             87       Skewness       .7190904
99%           83             90       Kurtosis       3.549932

 

. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, contents(freq p25 age p50 age p75 age)

 

----------------------------------------------------------------------
Occupation, 1990      |
basis                 |      Freq.    p25(age)    med(age)    p75(age)
----------------------+-----------------------------------------------
    Registered nurses |        966          36          43          51
Sociology instructors |          6          50          53          54
              Lawyers |        441          35          43          52
----------------------------------------------------------------------
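* Two more ways to get the same quartiles, sketched here as untested alternatives (these commands were not run in this session). tabstat reports percentiles by group; centile handles one group at a time and adds confidence intervals. centile may interpolate differently from summarize, so its values can differ slightly. inlist() is simply shorthand for the chain of == conditions used above.

```stata
* Quartiles of age by occupation, for the same three occ1990 codes as above
tabstat age if inlist(occ1990, 95, 125, 178), by(occ1990) statistics(n p25 p50 p75)

* Quartiles (with confidence intervals) for lawyers only
centile age if occ1990==178, centile(25 50 75)
```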

 

* One of several ways to remind yourself which occupation code is which:

 

. tabulate occ1990 if occ1990==178| occ1990==95 | occ1990==125

 

                 Occupation, 1990 basis |      Freq.     Percent        Cum.
----------------------------------------+-----------------------------------
                      Registered nurses |        966       68.37       68.37
                  Sociology instructors |          6        0.42       68.79
                                Lawyers |        441       31.21      100.00
----------------------------------------+-----------------------------------
                                  Total |      1,413      100.00

 

. tabulate occ1990 if occ1990==178| occ1990==95 | occ1990==125, nolab

 

Occupation, |
 1990 basis |      Freq.     Percent        Cum.
------------+-----------------------------------
         95 |        966       68.37       68.37
        125 |          6        0.42       68.79
        178 |        441       31.21      100.00
------------+-----------------------------------
      Total |      1,413      100.00
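* A possible shortcut, sketched here and not run in this session: numlabel prefixes each value label with its numeric code, so a single tabulate shows code and label together. This assumes occ1990 has a value label attached, as the labeled output above suggests. The change affects only the data in memory unless you save.

```stata
* Show which value label is attached to occ1990
describe occ1990

* Prefix value labels with their numeric codes, then tabulate once
numlabel, add
tabulate occ1990 if inlist(occ1990, 95, 125, 178)

* Remove the numeric prefixes again if they are not wanted
numlabel, remove
```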

 

 

. ttest yrsed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335
  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394
---------+--------------------------------------------------------------------
combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946
---------+--------------------------------------------------------------------
    diff |           -.2444469    .0427623               -.3282649   -.1606289
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7164
Ho: diff = 0                                     degrees of freedom =    18536

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

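* What if we rescale years of education to months of education? Would the t-test give the same result? It should, because the t-statistic is unit-free: multiplying the variable by 12 multiplies both the difference in means and its standard error by 12, so their ratio is unchanged. Notice what changes in the output and what doesn't: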

 

. gen months_ed=yrsed*12

(30484 missing values generated)

 

. ttest months_ed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |    9027    159.7454    .3748215    35.61199    159.0107    160.4802
  Female |    9511    162.6788    .3512319    34.25366    161.9903    163.3673
---------+--------------------------------------------------------------------
combined |   18538    161.2504    .2567052    34.95152    160.7472    161.7536
---------+--------------------------------------------------------------------
    diff |           -2.933363    .5131471               -3.939178   -1.927547
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7164
Ho: diff = 0                                     degrees of freedom =    18536

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
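* A quick arithmetic check of the two t-tests above, sketched with display and not part of the original session. It uses only the numbers printed in the diff rows: t is the difference in means divided by its standard error, and both ratios round to -5.7164. The last two lines show that the months figures are the years figures times 12, up to rounding in the printed values.

```stata
* Years of education: difference in means / standard error
display -.2444469/.0427623

* Months of education: numerator and denominator are both 12 times larger
display -2.933363/.5131471

* Rescaling check: years figures times 12 should reproduce the months figures
display -.2444469*12
display .0427623*12
```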

 

. exit, clear