-----------------------------------------------------------------------------------

       log:  C:\AAA Miker Files\newer web pages\soc_meth_proj3\class8_2009.log

  log type:  text

 opened on:  19 Feb 2009, 11:27:09

 

. set mem 200m

 

Current memory allocation

 

                    current                                 memory usage

    settable          value     description                 (1M = 1024k)

    --------------------------------------------------------------------

    set maxvar         5000     max. variables allowed           1.909M

    set memory          200M    max. data space                200.000M

    set matsize         400     max. RHS vars in models          1.254M

                                                            -----------

                                                               203.163M

 

. use "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", clear

 

. gen byte nurses=0

 

. replace nurses=1 if occ1990==95

(966 real changes made)

 

. gen byte sociologists=0

 

. replace sociologists=1 if occ1990==125

(6 real changes made)
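*The two-step gen/replace pattern above can also be written in one line per dummy. This is only a sketch: the names nurses2 and sociologists2 are hypothetical, chosen so they do not clash with the variables already created, and the missing-value guard is an assumption about how missing occ1990 codes should be treated.

```stata
* A logical expression evaluates to 1 (true) or 0 (false) in Stata,
* so each dummy can be built in a single step.
* The "if !missing(occ1990)" guard leaves the dummy missing where occ1990 is missing.
gen byte nurses2       = (occ1990 == 95)  if !missing(occ1990)
gen byte sociologists2 = (occ1990 == 125) if !missing(occ1990)

* Cross-check against the original dummy; the two agree everywhere
* only if occ1990 is never missing.
tabulate nurses nurses2, missing
```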

 

. regress incwage lawyers if occ1990==178| occ1990==95

 

      Source |       SS       df       MS              Number of obs =    1407

-------------+------------------------------           F(  1,  1405) =  221.72

       Model |  4.0354e+11     1  4.0354e+11           Prob > F      =  0.0000

    Residual |  2.5571e+12  1405  1.8200e+09           R-squared     =  0.1363

-------------+------------------------------           Adj R-squared =  0.1357

       Total |  2.9607e+12  1406  2.1057e+09           Root MSE      =   42662

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36507.47   2451.758    14.89   0.000     31697.97    41316.97

       _cons |   37536.85   1372.618    27.35   0.000     34844.25    40229.45

------------------------------------------------------------------------------

 

. ttest incwage if occ1990==178 | occ1990==95, by(occ1990)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Register |     966    37536.85    702.6892    21839.96    36157.88    38915.83

 Lawyers |     441    74044.33    3287.284    69032.96     67583.6    80505.06

---------+--------------------------------------------------------------------

combined |    1407    48979.49    1223.363    45888.34    46579.68    51379.31

---------+--------------------------------------------------------------------

    diff |           -36507.47    2451.758               -41316.97   -31697.97

------------------------------------------------------------------------------

    diff = mean(Register) - mean(Lawyers)                         t = -14.8903

Ho: diff = 0                                     degrees of freedom =     1405

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

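*Note that the two-sample t test with the equal-variance assumption and the regression on the same two groups yield the same difference in means (36507.47, with the sign flipped because ttest reports nurses minus lawyers), the same standard error (2451.758), and the same T-statistic (14.89 in absolute value).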
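*The agreement can be checked by hand from numbers already printed in this log. A rough sketch, using the residual mean square (1.8200e+09) and the group sizes (966 and 441) from the output above; because those inputs are rounded, the result should only be close to 2451.758, not identical. The last command is an added contrast, not something run in the original session.

```stata
* Pooled standard error of the difference in means:
* sqrt(residual mean square * (1/n1 + 1/n2)).
display sqrt(1.8200e+09 * (1/966 + 1/441))

* T-statistic: difference in means divided by its standard error.
display 36507.47 / 2451.758

* For contrast, the unequal-variance (Welch) version of the t test.
* Its standard error will generally NOT match the plain regression.
ttest incwage if occ1990==178 | occ1990==95, by(occ1990) unequal
```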

 

 

. regress incwage lawyers sociologists if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36507.47   2447.523    14.92   0.000      31706.3    41308.65

sociologists |   3971.481    17440.4     0.23   0.820    -30240.45    38183.41

       _cons |   37536.85   1370.247    27.39   0.000     34848.91    40224.79

------------------------------------------------------------------------------

 

. regress incwage  sociologists if occ1990==178| occ1990==125

 

      Source |       SS       df       MS              Number of obs =     447

-------------+------------------------------           F(  1,   445) =    1.33

       Model |  6.2663e+09     1  6.2663e+09           Prob > F      =  0.2495

    Residual |  2.0971e+12   445  4.7125e+09           R-squared     =  0.0030

-------------+------------------------------           Adj R-squared =  0.0007

       Total |  2.1034e+12   446  4.7160e+09           Root MSE      =   68648

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

sociologists |  -32535.99   28215.44    -1.15   0.249    -87988.05    22916.07

       _cons |   74044.33   3268.953    22.65   0.000     67619.82    80468.83

------------------------------------------------------------------------------

 

. regress incwage  nurses sociologists if occ1990==178| occ1990==125| occ1990==95

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

      nurses |  -36507.47   2447.523   -14.92   0.000    -41308.65    -31706.3

sociologists |  -32535.99   17504.37    -1.86   0.063     -66873.4    1801.409

       _cons |   74044.33   2028.001    36.51   0.000      70066.1    78022.55

------------------------------------------------------------------------------

 

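*And note that the T-statistic (but not the beta) for the lawyer-sociologist comparison depends on the presence of the nurses in the model, because regression pools the residual variance across all the subsamples. With lawyers and sociologists alone the standard error is 28215.44 (t = -1.15); with the 966 nurses included, the residual mean square falls from 4.7125e+09 to 1.8137e+09 and the standard error falls to 17504.37 (t = -1.86).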
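*A sketch of why the standard error changes while the beta does not: the group sizes (441 lawyers, 6 sociologists) are the same in both models, and only the pooled residual mean square differs. The inputs are the rounded values printed above, so the results should match the logged standard errors only approximately.

```stata
* Two-group model (lawyers and sociologists only): MS residual = 4.7125e+09.
display sqrt(4.7125e+09 * (1/441 + 1/6))

* Three-group model (nurses added): MS residual = 1.8137e+09.
* Same beta, same group sizes, smaller pooled variance, smaller SE.
display sqrt(1.8137e+09 * (1/441 + 1/6))

* The resulting T-statistics for the same -32535.99 difference.
display -32535.99 / 28215.44
display -32535.99 / 17504.37
```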

 

. regress incwage lawyers sociologists if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36507.47   2447.523    14.92   0.000      31706.3    41308.65

sociologists |   3971.481    17440.4     0.23   0.820    -30240.45    38183.41

       _cons |   37536.85   1370.247    27.39   0.000     34848.91    40224.79

------------------------------------------------------------------------------

 

. *What about a change of scale?

 

. gen incwage2=incwage*2

(30484 missing values generated)

 

. regress incwage2 lawyers sociologists if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  1.6155e+12     2  8.0774e+11           Prob > F      =  0.0000

    Residual |  1.0229e+13  1410  7.2550e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  1.1845e+13  1412  8.3888e+09           Root MSE      =   85176

 

------------------------------------------------------------------------------

    incwage2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   73014.95   4895.045    14.92   0.000     63412.59     82617.3

sociologists |   7942.963    34880.8     0.23   0.820    -60480.89    76366.82

       _cons |    75073.7   2740.495    27.39   0.000     69697.82    80449.59

------------------------------------------------------------------------------

 

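. *The T-statistic is nicely unit free: doubling the outcome doubles every coefficient and every standard error, so their ratio (and the F-statistic and R-squared) is unchanged.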
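*The scale invariance can be confirmed with the printed coefficients and standard errors for lawyers. A minimal sketch using display as a calculator; all numbers are copied from the two regressions above.

```stata
* Original scale: coefficient / standard error.
display 36507.47 / 2447.523

* Doubled scale: both numerator and denominator are doubled.
display 73014.95 / 4895.045

* The scaling factor itself, recovered from the coefficients
* and from the standard errors.
display 73014.95 / 36507.47
display 4895.045 / 2447.523
```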

 

. regress incwage lawyers sociologists if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36507.47   2447.523    14.92   0.000      31706.3    41308.65

sociologists |   3971.481    17440.4     0.23   0.820    -30240.45    38183.41

       _cons |   37536.85   1370.247    27.39   0.000     34848.91    40224.79

------------------------------------------------------------------------------

 

. lincom lawyers-sociologists

 

 ( 1)  lawyers - sociologists = 0

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

         (1) |   32535.99   17504.37     1.86   0.063    -1801.409     66873.4

------------------------------------------------------------------------------
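*The standard error that lincom reports for a difference between two coefficients comes from the variance-covariance matrix of the estimates. A sketch of the same calculation done by hand; the matrix name V is arbitrary, and the row and column positions assume the regressor order lawyers, sociologists, _cons used in the regression above.

```stata
regress incwage lawyers sociologists if occ1990==178 | occ1990==95 | occ1990==125

* Display the variance-covariance matrix of the coefficient estimates.
estat vce

* Copy it into a matrix so individual elements can be used in expressions.
matrix V = e(V)

* SE of (lawyers - sociologists) = sqrt(Var(b1) + Var(b2) - 2*Cov(b1,b2)).
display sqrt(V[1,1] + V[2,2] - 2*V[1,2])

* This should match the standard error reported by lincom.
lincom lawyers - sociologists
```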

 

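*After an arbitrary change of the excluded occupational category among these 3, we can still recover the same betas and T-statistics with a simple lincom. Compare the lincom result above (32535.99, t = 1.86) with the lawyers coefficient in the next regression, where sociologists are the excluded category.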

 

. regress incwage lawyers nurses if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   32535.99   17504.37     1.86   0.063    -1801.409     66873.4

      nurses |  -3971.481    17440.4    -0.23   0.820    -38183.41    30240.45

       _cons |   41508.33   17386.49     2.39   0.017     7402.162     75614.5

------------------------------------------------------------------------------

 

. *lincom is a post-estimation command: it works from the results of the most recent regression, so you can't run lincom without running the regression first. lincom just gives you what the regression would have given you if you had made a different choice of excluded category.
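*Other post-estimation commands work from the same stored results. A sketch, not part of the original session: test gives a Wald F test of the same lawyer-sociologist hypothesis (with a single restriction, F is the square of the lincom T-statistic), and lincom can also rebuild group means from the constant.

```stata
regress incwage lawyers sociologists if occ1990==178 | occ1990==95 | occ1990==125

* Wald test that the lawyer and sociologist coefficients are equal.
* With one restriction, F equals t squared, so the p-value matches lincom.
test lawyers = sociologists

* Nurses are the excluded category, so _cons is the nurse mean and
* _cons plus a dummy coefficient is that group's mean.
lincom _cons + lawyers
lincom _cons + sociologists
```

*The two sums should reproduce the lawyer mean (74044.33) and the sociologist mean (41508.33) that appear as constants in other regressions in this log.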

 

. regress incwage lawyers sociologists if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36507.47   2447.523    14.92   0.000      31706.3    41308.65

sociologists |   3971.481    17440.4     0.23   0.820    -30240.45    38183.41

       _cons |   37536.85   1370.247    27.39   0.000     34848.91    40224.79

------------------------------------------------------------------------------

 

. regress incwage lawyers sociologists female if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  3,  1409) =   83.70

       Model |  4.4789e+11     3  1.4930e+11           Prob > F      =  0.0000

    Residual |  2.5134e+12  1409  1.7838e+09           R-squared     =  0.1512

-------------+------------------------------           Adj R-squared =  0.1494

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42235

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   25723.53   3256.452     7.90   0.000     19335.51    32111.55

sociologists |  -604.9475   17320.32    -0.03   0.972    -34581.34    33371.45

      female |  -17003.19   3422.971    -4.97   0.000    -23717.86   -10288.53

       _cons |   53448.74   3479.591    15.36   0.000     46623.01    60274.48

------------------------------------------------------------------------------

 

. *What we do by adding gender is calculate the other differences net of gender. Since a much higher percentage of lawyers than of nurses are men, accounting for gender reduces the lawyer-nurse difference in the regression from about $36.5K to about $25.7K.
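*The gender composition behind this result can be inspected directly. A sketch, assuming female is the 0/1 dummy used in the regression above; the cell counts in the display lines are taken from the occupation-by-sex table later in this log.

```stata
* Row percentages: the share of each occupation that is female.
tabulate occ1990 female if occ1990==178 | occ1990==95 | occ1990==125, row

* The same shares by hand from the logged cell counts.
* Female share of registered nurses (904 of 966):
display 904 / 966
* Female share of lawyers (133 of 441):
display 133 / 441

* Size of the drop in the lawyer-nurse gap once gender is controlled.
display 36507.47 - 25723.53
```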

 

. gen byte male=0

 

. replace male=1 if sex==1

(64791 real changes made)

 

. regress incwage lawyers sociologists male if occ1990==178| occ1990==95 | occ1990==125

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  3,  1409) =   83.70

       Model |  4.4789e+11     3  1.4930e+11           Prob > F      =  0.0000

    Residual |  2.5134e+12  1409  1.7838e+09           R-squared     =  0.1512

-------------+------------------------------           Adj R-squared =  0.1494

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42235

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   25723.53   3256.452     7.90   0.000     19335.51    32111.55

sociologists |  -604.9475   17320.32    -0.03   0.972    -34581.34    33371.45

        male |   17003.19   3422.971     4.97   0.000     10288.53    23717.86

       _cons |   36445.55   1376.531    26.48   0.000     33745.28    39145.82

------------------------------------------------------------------------------

 

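. *Here, the constant refers to the income of female nurses. More precisely, it is the mean PREDICTED value of income for female nurses, which need not equal their actual mean. Also note that it makes no difference to the lawyer-nurse comparison whether the dummy for gender is male or female: only the sign of the gender coefficient and the value of the constant change.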
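*The two constants are linked by the gender coefficient. A short sketch with numbers copied from the two regressions above; the quietly prefix is only there to suppress the repeated regression table.

```stata
* Constant with the female dummy (predicted income of MALE nurses)
* minus the gender gap gives the constant with the male dummy.
display 53448.74 - 17003.19

* Going the other way after the male-dummy regression:
* _cons + male should return the predicted income of male nurses.
quietly regress incwage lawyers sociologists male if occ1990==178 | occ1990==95 | occ1990==125
lincom _cons + male
```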

 

. predict Model_class8

(option xb assumed; fitted values)

 

. table occ1990 sex if occ1990==178|occ1990==95 | occ1990==125, contents(freq mean incwage mean  Model_class8)

 

------------------------------------------------

Occupation, 1990      |           Sex          

basis                 |        Male       Female

----------------------+-------------------------

    Registered nurses |          62          904

                      | 48602.45161   36777.9281

                      |    53448.74     36445.55

                      |

Sociology instructors |           2            4

                      |       39200      42662.5

                      |     52843.8      35840.6

                      |

              Lawyers |         308          133

                      | 80236.42208  59704.73684

                      |    79172.27     62169.08

------------------------------------------------
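*Each predicted value in the third row of every cell is just a sum of coefficients from the last regression. A sketch that rebuilds all six cells with display; the coefficients are copied from the male-dummy regression above, and the results should match the table up to rounding.

```stata
* Female nurses: the constant alone.
display 36445.55
* Male nurses: constant + male.
display 36445.55 + 17003.19

* Female sociologists: constant + sociologists.
display 36445.55 - 604.9475
* Male sociologists: constant + sociologists + male.
display 36445.55 - 604.9475 + 17003.19

* Female lawyers: constant + lawyers.
display 36445.55 + 25723.53
* Male lawyers: constant + lawyers + male.
display 36445.55 + 25723.53 + 17003.19
```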

 

. *A few things to note here: First, 36445.55 is the constant in the model; it is also the mean predicted value for the category that is excluded by all the dummy variables in the model, namely female nurses. Second, take note of the fact that the predicted and actual mean income values are not the same here. In general, the predicted and actual values will not be the same. Third, note the fact that the $17K gender income gap is the same across all 3 occupational categories in the predicted values. This is a feature (or a problem) of regression called linearity (more precisely, additivity: the model has no occupation-by-gender interaction). The real data are not additive in this way (i.e., the real gender gap varies a lot by occupation: about $11.8K among nurses and about $20.5K among lawyers in the table above).
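*One way to let the gender gap differ by occupation is to add interaction terms. This is a sketch only, not something run in the original session: the variable names lawyers_male and soc_male are hypothetical, and with just 6 sociologists in the sample their interaction will be estimated very imprecisely.

```stata
* Actual gender gaps from the table of means above (male minus female).
* Registered nurses:
display 48602.45161 - 36777.9281
* Lawyers:
display 80236.42208 - 59704.73684

* Hypothetical interaction dummies: 1 only for men in that occupation.
gen byte lawyers_male = lawyers * male
gen byte soc_male     = sociologists * male

* With one parameter per cell the model is saturated, so its predicted
* values reproduce the actual cell means.
regress incwage lawyers sociologists male lawyers_male soc_male if occ1990==178 | occ1990==95 | occ1990==125

* Gender gap among nurses is the male coefficient;
* the gap among lawyers adds the interaction term.
lincom male + lawyers_male

* In Stata 11 or later, factor-variable notation fits the same model
* without creating the dummies by hand:
* regress incwage i.occ1990##i.male if occ1990==178 | occ1990==95 | occ1990==125
```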

 

. save "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", replace

file C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta saved

 

. exit, clear