-----------------------------------------------------------------------------------------------
name: <unnamed>
log: C:\Users\Michael\Documents\newer web pages\soc_meth_proj3\fall_2016_logs\class4.log
log type: text
opened on: 5 Oct 2016, 10:03:59
. use "C:\Users\Michael\Desktop\cps_mar_2000_new_unchanged.dta", clear
* How to make a box plot: see the commands below, and consult the Stata documentation (help graph box) if necessary.
. graph box age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
. graph hbox age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
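* The three-way occupation filter can also be written more compactly. A minimal sketch, assuming the same dataset is still in memory: inlist() is true when its first argument equals any of the later ones, so it should select the same observations as the chained "|" conditions above.

```stata
* Same vertical and horizontal box plots, with inlist() replacing the chained "|" conditions
graph box age if inlist(occ1990, 95, 125, 178), over(occ1990)
graph hbox age if inlist(occ1990, 95, 125, 178), over(occ1990)
```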
* Several ways to get percentiles of distributions:
. summarize age if occ1990==178, detail
                             Age
-------------------------------------------------------------
      Percentiles      Smallest
 1%           24             24
 5%           27             24
10%           29             24       Obs                 441
25%           35             24       Sum of Wgt.         441
50%           43                      Mean           44.38549
                        Largest       Std. Dev.      12.48585
75%           52             84
90%           61             86       Variance       155.8965
95%           66             87       Skewness       .7190904
99%           83             90       Kurtosis       3.549932
. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, contents(freq p25 age p50 age p75 age)
----------------------------------------------------------------------
Occupation, 1990      |
basis                 |      Freq.   p25(age)   med(age)   p75(age)
----------------------+-----------------------------------------------
Registered nurses     |        966         36         43         51
Sociology instructors |          6         50         53         54
Lawyers               |        441         35         43         52
----------------------------------------------------------------------
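* A few more ways to get the same quartiles, sketched under the assumption that the same dataset is in memory. Note that centile uses a slightly different interpolation rule than summarize, so its values may differ a little in small groups.

```stata
* Quartiles of age by occupation in one compact table
tabstat age if inlist(occ1990, 95, 125, 178), by(occ1990) statistics(n p25 p50 p75)

* Quartiles for lawyers only (code 178), with a confidence interval for each percentile
centile age if occ1990==178, centile(25 50 75)

* summarize, detail also leaves its percentiles in r() for later use
quietly summarize age if occ1990==178, detail
display "median age of lawyers = " r(p50)
```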
* One of several ways to remind yourself which occupation code is which:
. tabulate occ1990 if occ1990==178| occ1990==95 | occ1990==125
                 Occupation, 1990 basis |      Freq.     Percent        Cum.
----------------------------------------+-----------------------------------
                      Registered nurses |        966       68.37       68.37
                  Sociology instructors |          6        0.42       68.79
                                Lawyers |        441       31.21      100.00
----------------------------------------+-----------------------------------
                                  Total |      1,413      100.00
. tabulate occ1990 if occ1990==178| occ1990==95 | occ1990==125, nolab
Occupation, |
 1990 basis |      Freq.     Percent        Cum.
------------+-----------------------------------
         95 |        966       68.37       68.37
        125 |          6        0.42       68.79
        178 |        441       31.21      100.00
------------+-----------------------------------
      Total |      1,413      100.00
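* Another way to see codes and labels together, as a hedged sketch: look up the value label attached to occ1990 and either list it or prefix each label with its code. The local macro name lbl is made up for this example, and numlabel changes the labels in memory (it does not touch the file on disk unless you save).

```stata
* Find the name of the value label attached to occ1990
local lbl : value label occ1990

* List every code-to-label mapping in that value label (a long list for occupation codes)
label list `lbl'

* Or prefix each label with its numeric code, so a single tabulate shows both
numlabel `lbl', add
tabulate occ1990 if inlist(occ1990, 95, 125, 178)
```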
. ttest yrsed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |    9027    13.31212    .0312351    2.967666    13.25089    13.37335
  Female |    9511    13.55657    .0292693    2.854472    13.49919    13.61394
---------+--------------------------------------------------------------------
combined |   18538    13.43753    .0213921    2.912627     13.3956    13.47946
---------+--------------------------------------------------------------------
    diff |           -.2444469    .0427623               -.3282649   -.1606289
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7164
Ho: diff = 0                                     degrees of freedom =    18536
    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
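* What if we rescale years of education to months of education? Would the t-test give the same result? It should, because the t-statistic is unit-free. Notice what changes (the means, standard errors, standard deviations, confidence intervals, and the difference, all multiplied by 12) and what doesn't (t, the degrees of freedom, and the p-values):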
. gen months_ed=yrsed*12
(30484 missing values generated)
. ttest months_ed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
    Male |    9027    159.7454    .3748215    35.61199    159.0107    160.4802
  Female |    9511    162.6788    .3512319    34.25366    161.9903    163.3673
---------+--------------------------------------------------------------------
combined |   18538    161.2504    .2567052    34.95152    160.7472    161.7536
---------+--------------------------------------------------------------------
    diff |           -2.933363    .5131471               -3.939178   -1.927547
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7164
Ho: diff = 0                                     degrees of freedom =    18536
    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
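* Why the t statistic does not move: a short worked check using the numbers printed in the two outputs above. Multiplying every observation by 12 multiplies both the difference in means and its standard error by 12, so the factor cancels in the ratio. The subscripts M and F (male, female) are notation introduced here for illustration.

```latex
\[
t_{\mathrm{years}} = \frac{\bar{x}_{M} - \bar{x}_{F}}{\mathrm{SE}(\bar{x}_{M} - \bar{x}_{F})},
\qquad
t_{\mathrm{months}} = \frac{12\,(\bar{x}_{M} - \bar{x}_{F})}{12\,\mathrm{SE}(\bar{x}_{M} - \bar{x}_{F})} = t_{\mathrm{years}}
\]
\[
t_{\mathrm{years}} = \frac{-0.2444469}{0.0427623} \approx -5.7164,
\qquad
t_{\mathrm{months}} = \frac{-2.933363}{0.5131471} \approx -5.7164
\]
```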
. exit, clear