-----------------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class3.log

  log type:  text

 opened on:   1 Oct 2018, 10:12:32

 

. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear

 

 

. *class starts here

 

. table sex if age>=25 & age<=34, contents (freq mean yrsed sd yrsed semean yrsed)

 

--------------------------------------------------------------

      Sex |       Freq.  mean(yrsed)    sd(yrsed)   sem(yrsed)

----------+---------------------------------------------------

     Male |       9,027     13.31212     2.967666     .0312351

   Female |       9,511     13.55657     2.854472     .0292693

--------------------------------------------------------------

 

. display 2.967666/sqrt(9027)

.03123513

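·        SE = SD/sqrt(n): the standard error of the mean is the standard deviation divided by the square root of the sample size. This is a crucial relationship, one that you need to know.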
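·        As a quick check, the same relationship can be applied to both groups with the SD and n values from the table above. This is only a sketch; the scalar names se_male and se_female are made up for illustration.

```stata
* Standard error of the mean = SD / sqrt(n), values copied from the table above
scalar se_male   = 2.967666/sqrt(9027)
scalar se_female = 2.854472/sqrt(9511)

* These should agree with the sem(yrsed) column of the table
display "SE male   = " scalar(se_male)
display "SE female = " scalar(se_female)
```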

 

. ttest yrsed if age>=25 & age<=34, by(sex)

 

Two-sample t test with equal variances

------------------------------------------------------------------------------

   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]

---------+--------------------------------------------------------------------

    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335

  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394

---------+--------------------------------------------------------------------

combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946

---------+--------------------------------------------------------------------

    diff |           -.2444469    .0427623               -.3282649   -.1606289

------------------------------------------------------------------------------

    diff = mean(Male) - mean(Female)                              t =  -5.7164

Ho: diff = 0                                     degrees of freedom =    18536

 

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0

 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

 

 

. display -0.2444469/0.0427623

-5.7164114

 

·        Also, and crucially, t = difference/(SE of difference): the t statistic is the difference in means divided by the standard error of that difference.
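·        Where does the SE of the difference come from? For the equal-variance t test it is built from a pooled SD. The sketch below rebuilds it from the group summary statistics in the ttest output. The scalar names (n1, s1, n2, s2, sp, se_diff, tstat) are hypothetical; if any of them clash with a variable name in memory, wrap them in scalar().

```stata
* Summary statistics copied from the ttest output above
scalar n1 = 9027
scalar s1 = 2.967666
scalar n2 = 9511
scalar s2 = 2.854472

* Pooled SD under the equal-variance assumption
scalar sp = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2)/(n1 + n2 - 2))

* SE of the difference in means, then the t statistic
scalar se_diff = sp*sqrt(1/n1 + 1/n2)
scalar tstat   = -0.2444469/se_diff

display "pooled SD = " sp
display "SE(diff)  = " se_diff
display "t         = " tstat
```

The SE(diff) and t values should agree with the diff row and the t reported by ttest, up to rounding of the inputs.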

 

. display ttail(18536,-5.7164)

.99999999

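·        Look up the Stata function ttail(df, t). It gives the upper-tail probability, P(T > t), so in this case the probability of -5.7 and above, which is nearly 1. We want the small tail below -5.7, so we want either 1-ttail(df, -5.7) or, by symmetry, ttail(df, 5.7). Doubling that tail gives the two-sided p-value.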
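·        A small sketch of the two equivalent ways to get the tail, using the df and t reported by ttest. The scalar names dof and tval are made up for illustration.

```stata
scalar dof  = 18536
scalar tval = -5.7164

* Lower tail, two equivalent ways (the t distribution is symmetric about 0)
scalar lower_a = 1 - ttail(dof, tval)
scalar lower_b = ttail(dof, -tval)

display "1 - ttail(df, t) = " lower_a
display "ttail(df, -t)    = " lower_b

* Two-sided p-value: double the tail beyond |t|
display "two-sided p      = " 2*ttail(dof, abs(tval))
```

The ttail(df, |t|) form is usually the safer habit, because subtracting a number very close to 1 from 1 throws away numerical precision.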

 

. display (1-ttail(18356,-5.7164))

5.525e-09

 

. display 2*(1-ttail(18356,-5.7164))

1.105e-08

 

. display 2*(ttail(18356, 5.7164))

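1.105e-08

* Note: the three commands above were typed with 18356 degrees of freedom, not the 18536 reported by ttest. With df this large the slip makes no practical difference to the p-value.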

 

. display 2*(1-normal(5.7164))

1.088e-08

* normal(z) returns the cumulative probability up to z, which is (for no logical reason) the opposite of the ttail convention. So here we have 1-normal(5.7), times 2 because we have two tails. And note, the normal probability is similar to, but not exactly the same as, the t probability with 18K degrees of freedom.
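* A side-by-side sketch of the two conventions. Because the normal is symmetric, 1-normal(z) equals normal(-z), and the second form avoids subtracting from 1. The scalar name z is made up for illustration.

```stata
scalar z = 5.7164

* Two-sided normal p-value, written two equivalent ways
display "2*(1-normal(z)) = " 2*(1 - normal(z))
display "2*normal(-z)    = " 2*normal(-z)

* Two-sided t p-value with the df reported by ttest, for comparison
display "2*ttail(df, z)  = " 2*ttail(18536, z)
```

The t-based p-value comes out slightly larger than the normal one because the t distribution has heavier tails, even at this many degrees of freedom.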

 

. display invnormal(1-.025)

1.959964

     * 1.96 is the critical value of the normal, with upper and lower tails adding up to 5% probability.

 

. display invttail(2, 0.025)

4.3026527

     * For small df, the t distribution critical value is much higher, but as df grows, the t distribution becomes indistinguishable from the normal.

 

. display invttail(100, 0.025)

1.9839715

 

. display invttail(10000, 0.025)

1.9602012
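     * The same convergence can be tabulated in one loop. This is a sketch; the list of df values is arbitrary and chosen only for illustration.

```stata
* Two-sided 5% critical value of t for increasing degrees of freedom
foreach d of numlist 2 10 30 100 1000 10000 {
    display %8.0f `d' "   " %9.6f invttail(`d', 0.025)
}

* The normal critical value that the t values approach from above
display "  normal   " %9.6f invnormal(0.975)
```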

 

. log close

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class3.log

  log type:  text

 closed on:   1 Oct 2018, 12:30:48

-----------------------------------------------------------------------------------------------------