--------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2014_logs\class8

> .log

  log type:  text

 opened on:  15 Oct 2014, 10:49:13

 

. use "C:\Users\Michael\Documents\current class files\intro soc methods\cps_mar_2000_new with additional vars.dta", clear

 

 

. summarize incwage if lawyers==1, detail

 

                   Wage and salary income

-------------------------------------------------------------

      Percentiles      Smallest

 1%            0              0

 5%            0              0

10%            0              0       Obs                 441

25%        17000              0       Sum of Wgt.         441

 

50%        61000                      Mean           74044.33

                        Largest       Std. Dev.      69032.96

75%       100960         279376

90%       197387         279376       Variance       4.77e+09

95%       229339         279376       Skewness       1.132374

99%       257525         364302       Kurtosis       3.973892

 

. summarize incwage if nurses==1, detail

 

                   Wage and salary income

-------------------------------------------------------------

      Percentiles      Smallest

 1%            0              0

 5%         6500              0

10%        12000              0       Obs                 966

25%        25000              0       Sum of Wgt.         966

 

50%        37000                      Mean           37536.85

                        Largest       Std. Dev.      21839.96

75%        48000         100000

90%        61000         132000       Variance       4.77e+08

95%        70000         229339       Skewness       3.506697

99%        89468         333564       Kurtosis       43.18005

 

* Note that there is more to these distributions than the 25th and 75th percentiles. Lawyers are also more likely to have zero earnings: their 10th percentile of incwage is 0, while for nurses the 5th percentile is already 6500.
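
* A minimal sketch of how one might check the zero-earnings claim directly, assuming the same dataset is in memory and that lawyers, nurses, and incwage are coded as in the summaries above. No output is shown because these commands were not part of the original session.

```stata
* Number of lawyers reporting zero wage and salary income
count if lawyers==1 & incwage==0
* Number of nurses reporting zero wage and salary income
count if nurses==1 & incwage==0
```

* Dividing each count by the group sizes reported above (441 lawyers, 966 nurses) gives the share of zero earners in each occupation.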

 

* Back to our equal-variance t-tests from HW2

 

. ttest incwage if lawyers==1 | sociologists ==1, by(occ1990)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Sociolog |       6    41508.33    2842.722    6963.219    34200.88    48815.78

 Lawyers |     441    74044.33    3287.284    69032.96     67583.6    80505.06

---------+--------------------------------------------------------------------

combined |     447     73607.6    3248.139    68673.38    67224.04    79991.16

---------+--------------------------------------------------------------------

    diff |           -32535.99    28215.44               -87988.05    22916.07

------------------------------------------------------------------------------

    diff = mean(Sociolog) - mean(Lawyers)                         t =  -1.1531

Ho: diff = 0                                     degrees of freedom =      445

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.1247         Pr(|T| > |t|) = 0.2495          Pr(T > t) = 0.8753
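
* As an illustration of where the standard error of the difference comes from, the pooled (equal-variance) formula can be worked by hand from the group sizes, means, and standard deviations printed in the table above. The scalar names sp2 and se_diff are made up for this sketch.

```stata
* Pooled variance: each group's variance weighted by its degrees of freedom
scalar sp2 = ((6-1)*6963.219^2 + (441-1)*69032.96^2)/(6+441-2)
* Standard error of the difference in means under equal variances
scalar se_diff = sqrt(sp2)*sqrt(1/6 + 1/441)
* Pooled standard deviation
display sqrt(sp2)
* Standard error of the difference
display se_diff
* t-statistic: mean(Sociolog) - mean(Lawyers), divided by its standard error
display (41508.33 - 74044.33)/se_diff
```

* Up to rounding in the printed inputs, the last two results should reproduce the Std. Err. and t reported on the diff line.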

 

. regress incwage lawyers if lawyers==1 | sociologists==1

 

      Source |       SS       df       MS              Number of obs =     447

-------------+------------------------------           F(  1,   445) =    1.33

       Model |  6.2663e+09     1  6.2663e+09           Prob > F      =  0.2495

    Residual |  2.0971e+12   445  4.7125e+09           R-squared     =  0.0030

-------------+------------------------------           Adj R-squared =  0.0007

       Total |  2.1034e+12   446  4.7160e+09           Root MSE      =   68648

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   32535.99   28215.44     1.15   0.249    -22916.07    87988.05

       _cons |   41508.33   28025.43     1.48   0.139    -13570.31    96586.97

------------------------------------------------------------------------------

 

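* The regression appears to give us the same t-test (with the sign flipped, because ttest reports mean(Sociolog) - mean(Lawyers) while the lawyers coefficient is the lawyer mean minus the sociologist mean), but how can we be sure when the regression reports the t-statistic only to 3 digits (1.15)? One way is to compare the coefficient and the standard error, whose ratio is the t-statistic: apart from the sign of the difference, they match the t-test output to all 7 printed digits. Another way is to recover the t-statistic at full precision from the regression, which requires pulling out the coefficient vector and the variance-covariance matrix.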

 

* First, we ask Stata to store the coefficient vector and the variance-covariance matrix in two matrices that we name betas and VCM.

 

. matrix betas=e(b)

 

. matrix VCM=e(V)

 

. matrix list betas

 

betas[1,2]

      lawyers      _cons

y1  32535.993  41508.333

 

. matrix list VCM

 

symmetric VCM[2,2]

            lawyers       _cons

lawyers   7.961e+08

  _cons  -7.854e+08   7.854e+08

 

 

. display 32535.993/((7.961e+08)^.5)

1.1531353

 

* If we want more accuracy, we can rely not on the printed version of the VCM but on the stored version, by asking Stata to use the [1,1] element of the VCM matrix:

 

. display 32535.993/((VCM[1,1])^.5)

1.1531274
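
* The calculation above still uses the printed, rounded coefficient. A sketch of the same ratio taken entirely from stored values follows, assuming the matrices betas and VCM defined earlier still exist and that the lawyers-versus-sociologists regression is the most recent estimation command.

```stata
* t-statistic from the stored coefficient vector and variance-covariance matrix
display %10.7f betas[1,1]/sqrt(VCM[1,1])
* The same ratio from the coefficient and standard error that regress leaves behind
display %10.7f _b[lawyers]/_se[lawyers]
```

* The second line avoids copying matrices at all, but _b[] and _se[] always refer to the latest estimation, so it must be run before another model is fit.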

 

 

 

. regress incwage lawyers if lawyers==1 | sociologists==1

 

      Source |       SS       df       MS              Number of obs =     447

-------------+------------------------------           F(  1,   445) =    1.33

       Model |  6.2663e+09     1  6.2663e+09           Prob > F      =  0.2495

    Residual |  2.0971e+12   445  4.7125e+09           R-squared     =  0.0030

-------------+------------------------------           Adj R-squared =  0.0007

       Total |  2.1034e+12   446  4.7160e+09           Root MSE      =   68648

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   32535.99   28215.44     1.15   0.249    -22916.07    87988.05

       _cons |   41508.33   28025.43     1.48   0.139    -13570.31    96586.97

------------------------------------------------------------------------------

 

* In this next case, we change the comparison category to sociologists and nurses combined, so the coefficient and its standard error will be different.

 

. regress incwage lawyers if lawyers==1 | sociologists==1 |nurses==1

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  1,  1411) =  222.77

       Model |  4.0378e+11     1  4.0378e+11           Prob > F      =  0.0000

    Residual |  2.5575e+12  1411  1.8125e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1357

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42574

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36482.96   2444.332    14.93   0.000     31688.04    41277.88

       _cons |   37561.37   1365.553    27.51   0.000     34882.64     40240.1

------------------------------------------------------------------------------
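
* To see why the constant moves, note that with a single dummy the constant is the mean income of the whole comparison group. A rough check, using only the group sizes and means printed earlier in this log (6 sociologists, 966 nurses, 441 lawyers):

```stata
* Weighted mean income of sociologists and nurses combined
display (6*41508.33 + 966*37536.85)/(6 + 966)
* Lawyer mean minus that combined mean
display 74044.33 - (6*41508.33 + 966*37536.85)/(6 + 966)
```

* Up to rounding in the printed means, these should land close to the _cons and lawyers coefficients in the table above.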

 

* In this next case, we compare the income of lawyers to the incomes of everyone else with a nonmissing value of incwage, so again the coefficient and its standard error change.

 

. regress incwage lawyers

 

      Source |       SS       df       MS              Number of obs =  103226

-------------+------------------------------           F(  1,103224) = 1610.72

       Model |  1.3194e+12     1  1.3194e+12           Prob > F      =  0.0000

    Residual |  8.4558e+13 103224   819166021           R-squared     =  0.0154

-------------+------------------------------           Adj R-squared =  0.0154

       Total |  8.5877e+13 103225   831940347           Root MSE      =   28621

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   54815.92   1365.829    40.13   0.000     52138.91    57492.92

       _cons |   19228.41    89.2732   215.39   0.000     19053.43    19403.38

------------------------------------------------------------------------------

 

* Now back to our first example, the two-group regression that gives exactly the same results as our two-sample t-test:

 

. regress incwage lawyers if lawyers==1 | sociologists==1

 

      Source |       SS       df       MS              Number of obs =     447

-------------+------------------------------           F(  1,   445) =    1.33

       Model |  6.2663e+09     1  6.2663e+09           Prob > F      =  0.2495

    Residual |  2.0971e+12   445  4.7125e+09           R-squared     =  0.0030

-------------+------------------------------           Adj R-squared =  0.0007

       Total |  2.1034e+12   446  4.7160e+09           Root MSE      =   68648

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   32535.99   28215.44     1.15   0.249    -22916.07    87988.05

       _cons |   41508.33   28025.43     1.48   0.139    -13570.31    96586.97

------------------------------------------------------------------------------

 

* What if we add a third group to the model, yet keep sociologists as the comparison category?

 

. regress incwage lawyers nurses if lawyers==1 | sociologists==1 |nurses==1

 

      Source |       SS       df       MS              Number of obs =    1413

-------------+------------------------------           F(  2,  1410) =  111.34

       Model |  4.0387e+11     2  2.0194e+11           Prob > F      =  0.0000

    Residual |  2.5574e+12  1410  1.8137e+09           R-squared     =  0.1364

-------------+------------------------------           Adj R-squared =  0.1352

       Total |  2.9612e+12  1412  2.0972e+09           Root MSE      =   42588

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   32535.99   17504.37     1.86   0.063    -1801.409     66873.4

      nurses |  -3971.481    17440.4    -0.23   0.820    -38183.41    30240.45

       _cons |   41508.33   17386.49     2.39   0.017     7402.162     75614.5

------------------------------------------------------------------------------

 

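* We get the same coefficient as before (for the lawyer-sociologist comparison), but the standard error changes (from 28215.44 to 17504.37) because the standard error is built from the residual variance pooled over everyone in the estimation sample. Adding 966 nurses, whose incomes are less dispersed than lawyers', lowers the Root MSE from 68648 to 42588.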
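
* A small worked check of that point, offered as a sketch: in a regression on group dummies, the standard error of the lawyers coefficient is the Root MSE times sqrt(1/n_sociologists + 1/n_lawyers). The group sizes are the same in both models, so only the Root MSE differs. The numbers below are copied from the printed output above.

```stata
* Two-group model: Root MSE = 68648
display 68648*sqrt(1/6 + 1/441)
* Three-group model: Root MSE = 42588
display 42588*sqrt(1/6 + 1/441)
* The nurses coefficient is the nurse mean minus the sociologist mean
display 37536.85 - 41508.33
```

* Up to rounding in the printed Root MSE, the first two results should be close to the two standard errors reported for lawyers, and the third close to the nurses coefficient.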

 

. log close

      name:  <unnamed>

       log:  C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2014_logs\cla

> ss8.log

  log type:  text

 closed on:  15 Oct 2014, 12:42:25

-----------------------------------------------------------------------------------------