-----------------------------------------------------------------------------------------------------
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class3.log
log type: text
opened on: 1 Oct 2018, 10:12:32
. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear
. *class starts here
. table sex if age>=25 & age<=34, contents (freq mean yrsed sd yrsed semean yrsed)
--------------------------------------------------------------
Sex | Freq. mean(yrsed) sd(yrsed) sem(yrsed)
----------+---------------------------------------------------
Male | 9,027 13.31212 2.967666 .0312351
Female | 9,511 13.55657 2.854472 .0292693
--------------------------------------------------------------
. display 2.967666/sqrt(9027)
.03123513
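· SE = SD/sqrt(n): the standard error of the mean is the standard deviation divided by the square root of the sample size. Here it reproduces sem(yrsed) for men. This is a crucial relationship, one that you need to know.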
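* The same check can be run for both groups at once. This is a minimal sketch: the SDs and group sizes are copied by hand from the table output above, and the scalar names se_male and se_female are illustrative, not part of the class session.

```stata
* Values copied from the table output above
scalar se_male   = 2.967666/sqrt(9027)
scalar se_female = 2.854472/sqrt(9511)

* Each result should agree with the sem(yrsed) column
display "SE (male)   = " %9.7f scalar(se_male)
display "SE (female) = " %9.7f scalar(se_female)
```

* Because n sits under a square root, quadrupling the sample size only halves the standard error.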
. ttest yrsed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Male | 9027 13.31212 .0312351 2.967666 13.25089 13.37335
Female | 9511 13.55657 .0292693 2.854472 13.49919 13.61394
---------+--------------------------------------------------------------------
combined | 18538 13.43753 .0213921 2.912627 13.3956 13.47946
---------+--------------------------------------------------------------------
diff | -.2444469 .0427623 -.3282649 -.1606289
------------------------------------------------------------------------------
diff = mean(Male) - mean(Female) t = -5.7164
Ho: diff = 0 degrees of freedom = 18536
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
. display -0.2444469/0.0427623
-5.7164114
· Also, and crucially, t = difference/(SE of the difference). Here that is -.2444469/.0427623 = -5.7164, which matches the t reported by ttest.
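* Where does the SE of the difference come from? Under the equal-variances assumption that ttest uses by default, it is the pooled SD times sqrt(1/n1 + 1/n2). The sketch below rebuilds it from the group sizes and SDs in the ttest output; the scalar names (n1, s1, n2, s2, sp, se_diff) are illustrative and assume no variables with those names exist in the dataset.

```stata
* Group sizes and SDs copied from the ttest output above
scalar n1 = 9027
scalar s1 = 2.967666
scalar n2 = 9511
scalar s2 = 2.854472

* Pooled SD, then SE of the difference (equal-variance formula)
scalar sp      = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2)/(n1+n2-2))
scalar se_diff = sp*sqrt(1/n1 + 1/n2)

* Should agree with the Std. Err. on the diff row
display "SE of difference = " %9.7f se_diff
display "t                = " %9.4f (13.31212 - 13.55657)/se_diff
```

* The t printed here can differ from ttest's in the last digit because the group means are entered as rounded values.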
. display ttail(18536,-5.7164)
.99999999
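· Look up the Stata function ttail(df, t). It returns the upper-tail probability, that is, the probability of a value of t or higher, so ttail(18536, -5.7164) covers everything from -5.7 upward and is nearly 1. We want the lower-tail probability instead, so we want either 1-ttail(df, -5.7) or, by symmetry, ttail(df, 5.7). Doubling either one gives the two-tailed p-value. (The next three commands type the df as 18356 rather than 18536; with df this large the effect on the result is negligible.)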
. display (1-ttail(18356,-5.7164))
5.525e-09
. display 2*(1-ttail(18356,-5.7164))
1.105e-08
. display 2*(ttail(18356, 5.7164))
1.105e-08
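* A compact way to get the two-tailed p-value for a t of either sign is to double the upper-tail area beyond |t|. This sketch uses the df that ttest actually reported (18536); the scalar names tstat and dof are illustrative.

```stata
* t and df as reported by ttest above
scalar tstat = -5.7164
scalar dof   = 18536

* Two-tailed p-value: upper-tail area beyond |t|, doubled
display 2*ttail(dof, abs(tstat))
```

* Using abs() avoids the 1-ttail() subtraction, which loses precision when the tail is tiny. Immediately after ttest, the same quantities should also be available as stored results (r(t), r(df_t), and r(p)), so display 2*ttail(r(df_t), abs(r(t))) ought to agree with r(p).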
. display 2*(1-normal(5.7164))
1.088e-08
* normal(z) returns the cumulative probability up to z (the lower tail), which is (for no logical reason) the opposite of ttail, which returns the upper tail. So here we have 1-normal(5.7), times 2 because we have two tails. Note that the normal probability (1.088e-08) is similar to, but not exactly the same as, the t probability with 18K degrees of freedom (1.105e-08).
. display invnormal(1-.025)
1.959964
* 1.96 is the critical value of the standard normal: the tails below -1.96 and above +1.96 add up to 5% probability (2.5% in each tail).
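* The critical value is what turns a standard error into a 95% confidence interval: mean +/- critical value * SE. As a sketch, the interval for men in the ttest output can be rebuilt by hand. The mean and SE are copied from that output, the t critical value with n-1 df is used (nearly 1.96 at this sample size), and the scalar names are illustrative.

```stata
* Mean and SE for men, copied from the ttest output above
scalar m_male  = 13.31212
scalar se_male = .0312351
scalar df_male = 9027 - 1

* 95% CI: mean +/- critical value * SE
scalar lo = m_male - invttail(df_male, 0.025)*se_male
scalar hi = m_male + invttail(df_male, 0.025)*se_male
display "95% CI: " %9.5f lo "  " %9.5f hi
```

* The result should agree with the [95% Conf. Interval] columns on the Male row.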
. display invttail(2, 0.025)
4.3026527
* For small df, the t distribution's critical value is much higher (4.30 with 2 df, versus 1.96 for the normal), but as df grows, the t distribution becomes practically indistinguishable from the normal.
. display invttail(100, 0.025)
1.9839715
. display invttail(10000, 0.025)
1.9602012
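* To watch the convergence in one pass, the critical values can be printed for a range of df. This is a sketch; the particular df values in the list are arbitrary choices for illustration.

```stata
* Upper 2.5% critical value of t for increasing df
foreach d of numlist 2 10 30 100 1000 10000 {
    display "df = " %6.0f `d' "   t critical = " %9.6f invttail(`d', 0.025)
}

* Normal critical value for comparison
display "normal critical   = " %9.6f invnormal(0.975)
```

* By df = 10000 the t critical value differs from the normal one only in the fourth decimal place.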
. log close
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class3.log
log type: text
closed on: 1 Oct 2018, 12:30:48
-----------------------------------------------------------------------------------------------------