------------------------------------------------------------------------------------------
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2021_logs\class4
> .log
log type: text
opened on: 29 Sep 2021, 10:10:42
. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta"
. *class starts here
. drop female nurses lawyers
* Dropping some variables I created before class so that I can re-create them during class.
. ttest yrsed if age>=25 & age<=34, by(sex)
Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean   Std. err.   Std. dev.    [95% conf. interval]
---------+--------------------------------------------------------------------
    Male |   9,027    13.31212    .0312351    2.967666    13.25089    13.37335
  Female |   9,511    13.55657    .0292693    2.854472    13.49919    13.61394
---------+--------------------------------------------------------------------
Combined |  18,538    13.43753    .0213921    2.912627     13.3956    13.47946
---------+--------------------------------------------------------------------
    diff |           -.2444469    .0427623               -.3282649   -.1606289
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7164
H0: diff = 0                                     Degrees of freedom =    18536
    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
. ttest yrsed if age>=25 & age<=34, by(sex) unequal
Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean   Std. err.   Std. dev.    [95% conf. interval]
---------+--------------------------------------------------------------------
    Male |   9,027    13.31212    .0312351    2.967666    13.25089    13.37335
  Female |   9,511    13.55657    .0292693    2.854472    13.49919    13.61394
---------+--------------------------------------------------------------------
Combined |  18,538    13.43753    .0213921    2.912627     13.3956    13.47946
---------+--------------------------------------------------------------------
    diff |           -.2444469    .0428057                 -.32835   -.1605438
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -5.7106
H0: diff = 0                     Satterthwaite's degrees of freedom =  18383.6
    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
* We now have both an equal-variance and an unequal-variance t-test. Which one will match the regression results exactly?
. codebook sex
-----------------------------------------------------------------------------------
sex                                                                             Sex
-----------------------------------------------------------------------------------
                  Type: Numeric (byte)
                 Label: sexlbl
                 Range: [1,2]                         Units: 1
         Unique values: 2                         Missing .: 0/133,710
            Tabulation: Freq.   Numeric  Label
                       64,791         1  Male
                       68,919         2  Female
* Sex is arbitrarily coded (1 = Male, 2 = Female), so we need a dummy variable for gender rather than the existing 1-2 variable. If we mistakenly told Stata to take those numbers at face value, it would treat women as having twice as much of whatever is being coded as men do. To handle a categorical variable like sex, either we make dummy variables ourselves or we have Stata make them for us (more on that later!).
. gen byte female=0
. replace female=1 if sex==2
(68,919 real changes made)
. tabulate sex female, miss
           |        female
       Sex |         0          1 |     Total
-----------+----------------------+----------
      Male |    64,791          0 |    64,791
    Female |         0     68,919 |    68,919
-----------+----------------------+----------
     Total |    64,791     68,919 |   133,710
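* A minimal sketch, not run in class (so no output is shown): the gen/replace pair above can be collapsed into one line. The name female2 is hypothetical, chosen only so that it does not collide with female. The "if !missing(sex)" guard changes nothing here, because sex has no missing values, but without it a missing sex would be coded 0.
. generate byte female2 = (sex==2) if !missing(sex)
* assert stops with an error if the two versions ever disagree.
. assert female2 == female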
. regress yrsed female if age>=25 & age<=34
      Source |       SS           df       MS      Number of obs   =    18,538
-------------+----------------------------------   F(1, 18536)     =     32.68
       Model |  276.742433         1  276.742433   Prob > F        =    0.0000
    Residual |  156979.922    18,536  8.46892111   R-squared       =    0.0018
-------------+----------------------------------   Adj R-squared   =    0.0017
       Total |  157256.664    18,537  8.48339343   Root MSE        =    2.9101
------------------------------------------------------------------------------
       yrsed | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      female |   .2444469   .0427623     5.72   0.000     .1606289    .3282649
       _cons |   13.31212   .0306297   434.62   0.000     13.25208    13.37216
------------------------------------------------------------------------------
. display 0.2444469/0.04276226
5.7164168
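* Regression gives exactly the same results as the equal-variance t-test above. The coefficient on female (.2444469) is the t-test diff with the sign flipped (the t-test reports Male minus Female), the standard error (.0427623) and the degrees of freedom (18536) are identical, the t statistic has the same magnitude (5.7164), and _cons (13.31212) is the mean for men.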
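* A minimal sketch, not run in class (so no output is shown): the same comparison can be made from stored results instead of retyping numbers. ttest leaves its t statistic and standard error in r(t) and r(se); after regress, _b[] and _se[] hold the coefficient and its standard error, and e(F) holds the F statistic.
. quietly ttest yrsed if age>=25 & age<=34, by(sex)
. display r(t)
. display r(se)
. quietly regress yrsed female if age>=25 & age<=34
. display _b[female]/_se[female]
* With a single predictor, the squared t statistic should equal the F statistic.
. display (_b[female]/_se[female])^2
. display e(F)
* Factor-variable notation lets Stata build the dummy itself; i.sex uses the lowest category (1, Male) as the base, so the coefficient reported for Female should match the one on female.
. regress yrsed i.sex if age>=25 & age<=34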
. graph box age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
. graph hbox age if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
* A box plot draws a box at the 25th, 50th, and 75th percentiles. For now you can ignore the whiskers and the dots (the dots mark outlying values). If you want the exact 25th, 50th, and 75th percentiles, the table command can tell you.
. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, stat(freq) stat(p25 age) stat(p50 age) stat(p75 age)
------------------------------------------------------------------------------------------
                        |  Frequency   25th percentile   50th percentile   75th percentile
------------------------+-----------------------------------------------------------------
Occupation, 1990 basis  |
  Registered nurses     |        966                36                43                51
  Sociology instructors |          6                50                53                54
  Lawyers               |        441                35                43                52
  Total                 |      1,413                35                43                51
------------------------------------------------------------------------------------------
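* A minimal sketch, not run in class (so no output is shown): tabstat is another way to get the same percentiles by group, and it also works in versions of Stata that predate the stat() syntax of table used above. inlist() is just a shorter way of writing the three "occ1990==" conditions.
. tabstat age if inlist(occ1990, 95, 125, 178), by(occ1990) statistics(n p25 p50 p75)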
. graph box inctot if occ1990==178| occ1990==95 | occ1990==125, over (occ1990)
. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, stat(freq) stat(p25 inctot) stat(p50 inctot) stat(p75 inctot)
------------------------------------------------------------------------------------------
                        |  Frequency   25th percentile   50th percentile   75th percentile
------------------------+-----------------------------------------------------------------
Occupation, 1990 basis  |
  Registered nurses     |        966             27194             39144             50100
  Sociology instructors |          6             39360             45326             49162
  Lawyers               |        441             47133             82515            125725
  Total                 |      1,413             30081             44840             67639
------------------------------------------------------------------------------------------
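* These 3 occupations overlap much less in income than in age. For age, the 25th-75th percentile ranges of nurses and lawyers are nearly identical (36-51 vs. 35-52). For income, the lawyers' median (82515) is well above the nurses' 75th percentile (50100).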
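* A minimal sketch, not run in class: if the dots for very high incomes make the boxes hard to compare, the nooutsides option of graph box leaves the outside values off the plot. The boxes themselves are unchanged.
. graph box inctot if inlist(occ1990, 95, 125, 178), over(occ1990) nooutsides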
. gen byte nurses=0
. replace nurses=1 if occ1990==95
(966 real changes made)
. gen byte lawyers=0
. replace lawyers=1 if occ1990==178
(441 real changes made)
. regress inctot nurses lawyers
      Source |       SS           df       MS      Number of obs   =   103,226
-------------+----------------------------------   F(2, 103223)    =   1294.98
       Model |  2.5972e+12         2  1.2986e+12   Prob > F        =    0.0000
    Residual |  1.0351e+14   103,223  1.0028e+09   R-squared       =    0.0245
-------------+----------------------------------   Adj R-squared   =    0.0245
       Total |  1.0611e+14   103,225  1.0279e+09   Root MSE        =     31667
------------------------------------------------------------------------------
      inctot | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      nurses |   15233.13    1023.69    14.88   0.000     13226.71    17239.55
     lawyers |   73688.55   1511.213    48.76   0.000     70726.59    76650.51
       _cons |   25554.04   99.24125   257.49   0.000     25359.52    25748.55
------------------------------------------------------------------------------
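* In the regression above, the reference group is everyone else in the CPS with nonmissing total income for 1999, so nurses and lawyers are each compared to all other persons. _cons (25554.04) is the mean income of that reference group, and each coefficient is that occupation's mean income minus the reference-group mean.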
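* A minimal sketch, not run in class (so no output is shown): to see the reference group directly, summarize income for everyone who is neither a nurse nor a lawyer. summarize skips missing values of inctot, so the mean it reports should match _cons in the regression above.
. summarize inctot if nurses==0 & lawyers==0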
. regress inctot nurses lawyers if occ1990==178| occ1990==95| occ1990==125
      Source |       SS           df       MS      Number of obs   =     1,413
-------------+----------------------------------   F(2, 1410)      =    262.68
       Model |  1.0359e+12         2  5.1795e+11   Prob > F        =    0.0000
    Residual |  2.7802e+12     1,410  1.9718e+09   R-squared       =    0.2715
-------------+----------------------------------   Adj R-squared   =    0.2704
       Total |  3.8161e+12     1,412  2.7026e+09   Root MSE        =     44405
------------------------------------------------------------------------------
      inctot | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      nurses |  -3576.166   18184.46    -0.20   0.844    -39247.68    32095.35
     lawyers |   54879.25   18251.16     3.01   0.003     19076.91    90681.59
       _cons |   44363.33   18128.25     2.45   0.015     8802.086    79924.58
------------------------------------------------------------------------------
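* In the regression above, the “if” excludes everyone outside these 3 occupations, so nurses and lawyers are each compared to sociology instructors, and _cons (44363.33) is the sociology instructors' mean income. With only 6 sociology instructors in the sample, the standard errors are very large.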
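* A minimal sketch, not run in class (so no output is shown): the same comparison can be set up without hand-made dummies. ib125.occ1990 asks Stata to build the occupation dummies and to use code 125 (sociology instructors) as the base category, so the coefficients for codes 95 and 178 should match the ones on nurses and lawyers above.
. regress inctot ib125.occ1990 if inlist(occ1990, 95, 125, 178)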
. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, stat(freq) stat(p25 inctot) stat(p50 inctot) stat(p75 inctot) stat(mean inctot)
-----------------------------------------------------------------------------------------------------
                        |  Frequency   25th percentile   50th percentile   75th percentile       Mean
------------------------+----------------------------------------------------------------------------
Occupation, 1990 basis  |
  Registered nurses     |        966             27194             39144             50100   40787.17
  Sociology instructors |          6             39360             45326             49162   44363.33
  Lawyers               |        441             47133             82515            125725   99242.58
  Total                 |      1,413             30081             44840             67639    59046.4
-----------------------------------------------------------------------------------------------------
. table occ1990 if occ1990==178| occ1990==95 | occ1990==125, stat(mean inctot) stat(freq) stat(p25 inctot) stat(p50 inctot) stat(p75 inctot)
-----------------------------------------------------------------------------------------------------
                        |       Mean  Frequency   25th percentile   50th percentile   75th percentile
------------------------+----------------------------------------------------------------------------
Occupation, 1990 basis  |
  Registered nurses     |   40787.17        966             27194             39144             50100
  Sociology instructors |   44363.33          6             39360             45326             49162
  Lawyers               |   99242.58        441             47133             82515            125725
  Total                 |    59046.4      1,413             30081             44840             67639
-----------------------------------------------------------------------------------------------------
* For reference, the two tables above (same numbers, different column order) give the mean of inctot for each of the 3 occupations. In the homework you will be using incwage, which is a little different.
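* A minimal sketch, not run in class: the regression coefficients can be recovered from these means by subtraction. Against sociology instructors (mean 44363.33), the differences should come to about -3576.16 for nurses and 54879.25 for lawyers, matching the second regression up to rounding.
. display 40787.17 - 44363.33
. display 99242.58 - 44363.33
* Against everyone else (mean 25554.04, the _cons of the first regression), they should come to about 15233.13 and 73688.54.
. display 40787.17 - 25554.04
. display 99242.58 - 25554.04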
. log close
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2021_logs\class4.l
> og
log type: text
closed on: 4 Oct 2021, 14:05:13
--------------------------------------------------------------------------------------------