-----------------------------------------------------------------------------------------------------
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class5.log
log type: text
opened on: 8 Oct 2018, 09:45:16
. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear
. *class starts here
. ttest yrsed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Male | 9027 13.31212 .0312351 2.967666 13.25089 13.37335
Female | 9511 13.55657 .0292693 2.854472 13.49919 13.61394
---------+--------------------------------------------------------------------
combined | 18538 13.43753 .0213921 2.912627 13.3956 13.47946
---------+--------------------------------------------------------------------
diff | -.2444469 .0427623 -.3282649 -.1606289
------------------------------------------------------------------------------
diff = mean(Male) - mean(Female) t = -5.7164
Ho: diff = 0 degrees of freedom = 18536
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
. ttest yrsed if age>=25 & age<=34, by(sex) unequal
Two-sample t test with unequal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Male | 9027 13.31212 .0312351 2.967666 13.25089 13.37335
Female | 9511 13.55657 .0292693 2.854472 13.49919 13.61394
---------+--------------------------------------------------------------------
combined | 18538 13.43753 .0213921 2.912627 13.3956 13.47946
---------+--------------------------------------------------------------------
diff | -.2444469 .0428057 -.32835 -.1605438
------------------------------------------------------------------------------
diff = mean(Male) - mean(Female) t = -5.7106
Ho: diff = 0 Satterthwaite's degrees of freedom = 18383.6
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
· Note the small difference in the standard error of the difference between the equal-variance and unequal-variance t-tests (.0427623 versus .0428057). The difference is small because the variance of men’s education and the variance of women’s education are almost the same, so it hardly matters whether we assume equal variances or not. The difference between the means is of course identical in both cases.
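· A quick check of the two standard errors, using only the numbers printed in the tables above. This is a sketch: the scalar name s2_pooled is made up for illustration, and the sdtest line at the end (a formal test of equal variances) was not run in class.

```stata
* Unequal-variance SE of the difference: sqrt(se_male^2 + se_female^2)
display sqrt(.0312351^2 + .0292693^2)

* Equal-variance SE: pooled variance times (1/n_male + 1/n_female)
scalar s2_pooled = ((9027-1)*2.967666^2 + (9511-1)*2.854472^2) / (9027 + 9511 - 2)
display sqrt(s2_pooled * (1/9027 + 1/9511))

* Optional: formal test of equal variances on the same subsample
sdtest yrsed if age>=25 & age<=34, by(sex)
```

The first two results should match the Std. Err. on the diff row of each table, up to rounding of the inputs.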
. display -.2444469/.0428057
-5.7106156
. display ttail(18536,-5.7164)
.99999999
. display (1-ttail(18356,-5.7164))
5.525e-09
. display 2*(1-ttail(18356,-5.7164))
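1.105e-08
· Note that ttail(df,t) returns the upper-tail probability Pr(T > t), so 1-ttail() is the lower tail and twice that is the two-sided p-value. The last two commands were typed with 18356 degrees of freedom rather than 18536; with samples this large the slip makes almost no difference to the result.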
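· A sketch of the same p-values taken from the results that ttest stores in r(t), r(df_t) and r(p), so that nothing has to be retyped (not run in class). Evaluating ttail() at -t instead of computing 1-ttail() also avoids subtracting a number very close to 1.

```stata
quietly ttest yrsed if age>=25 & age<=34, by(sex)

* Lower tail, Pr(T < t): by symmetry this equals the upper tail at -t
display ttail(r(df_t), -r(t))

* Two-sided p-value, Pr(|T| > |t|)
display 2*ttail(r(df_t), abs(r(t)))

* Stata's own stored two-sided p-value, for comparison
display r(p)
```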
. codebook sex
-----------------------------------------------------------------------------------------------------
sex Sex
-----------------------------------------------------------------------------------------------------
type: numeric (byte)
label: sexlbl
range: [1,2] units: 1
unique values: 2 missing .: 0/133710
tabulation: Freq. Numeric Label
64791 1 Male
68919 2 Female
. gen byte female=0
. replace female=1 if sex==2
(68919 real changes made)
. tabulate sex female
| female
Sex | 0 1 | Total
-----------+----------------------+----------
Male | 64,791 0 | 64,791
Female | 0 68,919 | 68,919
-----------+----------------------+----------
Total | 64,791 68,919 | 133,710
· In order to use sex in a regression, we need to construct a 0-1 dummy variable to take the place of the variable’s 1-2 coding. The tabulate above confirms that female is 1 for every woman and 0 for every man.
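· The two-step gen/replace above can also be written in one line. A minimal sketch, not run in class: female2 is a hypothetical name chosen so it does not clash with the variable already created, and the i.sex factor-variable notation assumes Stata 11 or later.

```stata
* A logical expression evaluates to 1 when true and 0 when false;
* the if qualifier keeps a missing sex as missing rather than coding it 0
generate byte female2 = (sex == 2) if !missing(sex)

* Confirm the two versions agree (stops with an error if they do not)
assert female2 == female

* Alternatively, let Stata build the dummy on the fly (lowest value, Male, is the base)
regress yrsed i.sex if age>=25 & age<=34
```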
. regress yrsed female if age>=25 & age<=34
Source | SS df MS Number of obs = 18538
-------------+------------------------------ F( 1, 18536) = 32.68
Model | 276.742433 1 276.742433 Prob > F = 0.0000
Residual | 156979.922 18536 8.46892111 R-squared = 0.0018
-------------+------------------------------ Adj R-squared = 0.0017
Total | 157256.664 18537 8.48339343 Root MSE = 2.9101
------------------------------------------------------------------------------
yrsed | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | .2444469 .0427623 5.72 0.000 .1606289 .3282649
_cons | 13.31212 .0306297 434.62 0.000 13.25208 13.37216
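· Note that the constant term has a t-test associated with it, but not a very interesting one. The constant is the predicted value when female=0, that is, men’s average educational attainment (13.31212, the same as the Male mean in the t-test). The null hypothesis for the constant term is therefore that men’s average educational attainment is zero, which cannot be true.
· If we want the SE, t-stat or other quantities to more precision than the regression table shows, Stata stores them and we can call them up.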
. matrix var_covar_regress=e(V)
. matrix list var_covar_regress
symmetric var_covar_regress[2,2]
female _cons
female .00182861
_cons -.00093818 .00093818
. display var_covar_regress[1,1]^0.5
.04276226
. display 0.2444469/0.04276226
5.7164168
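· The same quantities are also available without going through the variance-covariance matrix. A sketch using the _b[] and _se[] results that regress leaves behind (not run in class):

```stata
quietly regress yrsed female if age>=25 & age<=34

* Coefficient and standard error at full precision
display _b[female]
display _se[female]

* t-statistic and its two-sided p-value on the residual degrees of freedom
display _b[female]/_se[female]
display 2*ttail(e(df_r), abs(_b[female]/_se[female]))

* Constant = men's mean; constant + female coefficient = women's mean
display _b[_cons]
display _b[_cons] + _b[female]
```

The last two lines should reproduce the Male and Female means from the t-test, 13.31212 and 13.55657.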
· Fun with boxplot! Look up the definition of a box plot in the Stata documentation to see what the parts of the boxes mean. You can ignore the outliers.
. graph box age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
. graph hbox age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
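· For reference, in Stata's default box plot the box runs from the 25th to the 75th percentile with a line at the median, and the whiskers extend to the adjacent values, the most extreme observations lying within 1.5 times the interquartile range of the box. A sketch (not run in class) that hides the outside values and computes the fence positions for one occupation code:

```stata
* Same plot without the outside values (outliers)
graph box age if occ1990==178 | occ1990==95 | occ1990==125, over(occ1990) nooutsides

* Quartiles and the 1.5*IQR fences for one group
quietly summarize age if occ1990==178, detail
display "IQR         = " r(p75) - r(p25)
display "lower fence = " r(p25) - 1.5*(r(p75) - r(p25))
display "upper fence = " r(p75) + 1.5*(r(p75) - r(p25))
```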
· In order to get the percentiles of each distribution, you can use summarize with the detail option, or table.
. summarize age if occ1990==178, detail
Age
-------------------------------------------------------------
Percentiles Smallest
1% 24 24
5% 27 24
10% 29 24 Obs 441
25% 35 24 Sum of Wgt. 441
50% 43 Mean 44.38549
Largest Std. Dev. 12.48585
75% 52 84
90% 61 86 Variance 155.8965
95% 66 87 Skewness .7190904
99% 83 90 Kurtosis 3.549932
. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, contents(freq p25 age p50 age p75 age)
----------------------------------------------------------------------
Occupation, 1990 |
basis | Freq. p25(age) med(age) p75(age)
----------------------+-----------------------------------------------
Registered nurses | 966 36 43 51
Sociology instructors | 6 50 53 54
Lawyers | 441 35 43 52
----------------------------------------------------------------------
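· A third option is tabstat, which reports the same percentiles by group. A sketch, not run in class; inlist() is just a shorter way to write the three-way "or" condition. It is also worth knowing that the table command was redesigned in Stata 17, so the contents() option used above may not be accepted as written in newer releases.

```stata
* Count, quartiles and median of age for the three occupations
tabstat age if inlist(occ1990, 95, 125, 178), by(occ1990) statistics(n p25 p50 p75)
```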
. ttest yrsed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Male | 9027 13.31212 .0312351 2.967666 13.25089 13.37335
Female | 9511 13.55657 .0292693 2.854472 13.49919 13.61394
---------+--------------------------------------------------------------------
combined | 18538 13.43753 .0213921 2.912627 13.3956 13.47946
---------+--------------------------------------------------------------------
diff | -.2444469 .0427623 -.3282649 -.1606289
------------------------------------------------------------------------------
diff = mean(Male) - mean(Female) t = -5.7164
Ho: diff = 0 degrees of freedom = 18536
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
. gen months_ed=yrsed*12
(30484 missing values generated)
· Rescaling the variable changes the means, SDs and SEs, but not the t-stat.
. ttest months_ed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
Male | 9027 159.7454 .3748215 35.61199 159.0107 160.4802
Female | 9511 162.6788 .3512319 34.25366 161.9903 163.3673
---------+--------------------------------------------------------------------
combined | 18538 161.2504 .2567052 34.95152 160.7472 161.7536
---------+--------------------------------------------------------------------
diff | -2.933363 .5131471 -3.939178 -1.927547
------------------------------------------------------------------------------
diff = mean(Male) - mean(Female) t = -5.7164
Ho: diff = 0 degrees of freedom = 18536
Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.0000 Pr(|T| > |t|) = 0.0000 Pr(T > t) = 1.0000
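· Why the t-stat is unchanged: multiplying yrsed by 12 multiplies both the difference in means and its standard error by 12, and the factor cancels in the ratio. A quick check using the numbers printed in the two tables above:

```stata
* Difference and SE in months are 12 times the values in years
display -.2444469*12
display .0427623*12

* The ratio is the same in either unit
display -.2444469/.0427623
display -2.933363/.5131471
```

Both ratios should come out at about -5.7164, up to rounding in the printed inputs.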
. log close
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class5.log
log type: text
closed on: 8 Oct 2018, 12:23:18
-----------------------------------------------------------------------------------------------------