-----------------------------------------------------------------------------------

       log:  C:\AAA Miker Files\newer web pages\soc_meth_proj3\class6_2009.log

  log type:  text

 opened on:  12 Feb 2009, 11:15:35

 

. set mem 200m

 

Current memory allocation

 

                    current                                 memory usage

    settable          value     description                 (1M = 1024k)

    --------------------------------------------------------------------

    set maxvar         5000     max. variables allowed           1.909M

    set memory          200M    max. data space                200.000M

    set matsize         400     max. RHS vars in models          1.254M

                                                            -----------

                                                               203.163M

 

. use "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", clear

 

. table occ1990 if occ1990==178|occ1990==95, contents(mean incwage sd incwage freq)

 

---------------------------------------------------------------

Occupation, 1990  |

basis             | mean(incwage)    sd(incwage)          Freq.

------------------+--------------------------------------------

Registered nurses |   37536.85197       21839.96            966

          Lawyers |   74044.32653       69032.96            441

---------------------------------------------------------------

 

. ttest incwage if occ1990==178|occ1990==95, by(occ1990)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Register |     966    37536.85    702.6892    21839.96    36157.88    38915.83

 Lawyers |     441    74044.33    3287.284    69032.96     67583.6    80505.06

---------+--------------------------------------------------------------------

combined |    1407    48979.49    1223.363    45888.34    46579.68    51379.31

---------+--------------------------------------------------------------------

    diff |           -36507.47    2451.758               -41316.97   -31697.97

------------------------------------------------------------------------------

    diff = mean(Register) - mean(Lawyers)                         t = -14.8903

Ho: diff = 0                                     degrees of freedom =     1405

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

. ttest incwage if occ1990==178|occ1990==95, by(occ1990) unequal

 

Two-sample t test with unequal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

Register |     966    37536.85    702.6892    21839.96    36157.88    38915.83

 Lawyers |     441    74044.33    3287.284    69032.96     67583.6    80505.06

---------+--------------------------------------------------------------------

combined |    1407    48979.49    1223.363    45888.34    46579.68    51379.31

---------+--------------------------------------------------------------------

    diff |           -36507.47    3361.548               -43112.62   -29902.33

------------------------------------------------------------------------------

    diff = mean(Register) - mean(Lawyers)                         t = -10.8603

Ho: diff = 0                     Satterthwaite's degrees of freedom =  480.671

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

. *The unequal-variance t-statistic uses the formula we used before. The equal-variance t-statistic is what ttest reports by default, and it is also the t-statistic generated by standard regressions.
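
The two t-statistics can be rebuilt from the group sizes, means, and standard deviations reported in the table above. The following is a minimal sketch, not part of the original session: it uses Stata's immediate command ttesti and then repeats the arithmetic by hand. The scalar name sp2 (the pooled variance) is introduced here for illustration only. If the numbers are typed as shown, the two standard errors should come out close to the 3361.5 and 2451.8 reported by ttest.

```stata
* Immediate t test from summary statistics: obs, mean, sd for each group
ttesti 966 37536.85 21839.96 441 74044.33 69032.96
ttesti 966 37536.85 21839.96 441 74044.33 69032.96, unequal

* By hand, unequal variances: SE = sqrt(s1^2/n1 + s2^2/n2)
display sqrt(21839.96^2/966 + 69032.96^2/441)
display (37536.85 - 74044.33)/sqrt(21839.96^2/966 + 69032.96^2/441)

* By hand, equal variances: pool the two variances first (sp2 is a hypothetical name)
scalar sp2 = ((966-1)*21839.96^2 + (441-1)*69032.96^2)/(966 + 441 - 2)
display sqrt(sp2*(1/966 + 1/441))
display (37536.85 - 74044.33)/sqrt(sp2*(1/966 + 1/441))
```

The pooled variance is a weighted average that gives the much larger nurse sample most of the weight, which is why the equal-variance standard error is smaller here than the unequal-variance one.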

 

. gen byte lawyers=0

 

. replace lawyers=1 if occ1990==178

(441 real changes made)

 

. regress incwage lawyers if occ1990==178|occ1990==95

 

      Source |       SS       df       MS              Number of obs =    1407

-------------+------------------------------           F(  1,  1405) =  221.72

       Model |  4.0354e+11     1  4.0354e+11           Prob > F      =  0.0000

    Residual |  2.5571e+12  1405  1.8200e+09           R-squared     =  0.1363

-------------+------------------------------           Adj R-squared =  0.1357

       Total |  2.9607e+12  1406  2.1057e+09           Root MSE      =   42662

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

     lawyers |   36507.47   2451.758    14.89   0.000     31697.97    41316.97

       _cons |   37536.85   1372.618    27.35   0.000     34844.25    40229.45

------------------------------------------------------------------------------

 

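. *Note that standard OLS regression gives us the equal-variance t-statistic, 14.89. The sign is positive here because the coefficient is mean(Lawyers) minus mean(Register), the reverse of the difference ttest reported.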
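
Two quick arithmetic checks tie the regression output to the ttest output. This is a sketch using only numbers already printed in the log: with a single regressor, the squared t-statistic should match the model F-statistic (about 221.72), and the coefficient on lawyers should match the difference in group means, up to rounding.

```stata
* Squared t on the single regressor versus the model F-statistic
display 14.8903^2

* Difference in group means (lawyers minus nurses) versus the lawyers coefficient
display 74044.33 - 37536.85
```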

 

. display ttail(25,2.49)

.00989098

 

. *Just as with Freedman's table, this gives us a one-tailed probability of 1%. But we generally want two-tailed tests, so we double it to 2%.

 

. display 2*ttail(25,2.49)

.01978195

 

. display 2*ttail(1400,2.49)

.01288957

 

. *Instead of 2%, we would have a two-tailed probability of 1.3% if we had 1400 degrees of freedom instead of 25.

 

. display 2*ttail(1400,14.89)

1.167e-46
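
The same comparison can be run in the other direction: instead of asking for the tail probability of a given t, ask for the t that leaves a given tail. The sketch below assumes Stata's invttail(df, p) function, which returns the value with upper-tail probability p. The 5% two-tailed critical values should be roughly 2.06 with 25 degrees of freedom and roughly 1.96 with 1400.

```stata
* Critical t for a 5% two-tailed test (2.5% in each tail)
display invttail(25,0.025)
display invttail(1400,0.025)

* Critical t for a 1% two-tailed test with 25 degrees of freedom
display invttail(25,0.005)
```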

 

. display normal(1.96)

.9750021

 

. *The normal function gives you the cumulative standard normal distribution, that is, the area under the curve up to the value, leaving one tail. The one-tailed probability for a Z-score of 1.96 is 2.5%; the two-tailed probability is 5%.

 

. display 2*(1-normal(1.96))

.04999579
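
As a hedged aside, invnormal() is the inverse of normal(): it takes a cumulative probability and returns the Z-score. The sketch below recovers the familiar 1.96 and then computes the normal two-tailed probability for 2.49, which can be set beside the 2*ttail(1400,2.49) result earlier in this log.

```stata
* Z-score that leaves 2.5% in the upper tail (should be about 1.96)
display invnormal(0.975)

* Two-tailed normal probability for 2.49, for comparison with the t results above
display 2*(1-normal(2.49))
```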

 

. save "C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta", replace

file C:\AAA Miker Files\newer web pages\soc_meth_proj3\cps_mar_2000_new.dta saved

 

. *How does the two-tailed probability of the normal differ from the two-tailed probability of the t-statistic? It depends on the degrees of freedom, and therefore on N. Check Freedman’s table...
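
One way to see the dependence on sample size without the printed table is to loop over several degrees of freedom and print both two-tailed probabilities side by side. This is a minimal sketch; the particular values 5, 25, 100, and 1400 are chosen here for illustration. The t probability should shrink toward the normal one as the degrees of freedom grow.

```stata
* Two-tailed probability of |t| > 1.96 for several degrees of freedom,
* next to the two-tailed normal probability for the same cutoff
foreach df of numlist 5 25 100 1400 {
    display "df = " `df' "   t: " %8.5f 2*ttail(`df',1.96) "   normal: " %8.5f 2*(1-normal(1.96))
}
```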

 

. exit, clear