-----------------------------------------------------------------------------------------------------
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class9.log
log type: text
opened on: 22 Oct 2018, 10:24:23
. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear
· Let’s start with a regression using categorical metro as the predictor, since this is a regression we are familiar with.
. regress incwage i.metro if age>29 & age<65 & sex==1 & metro~=0
      Source |       SS       df       MS              Number of obs =   29241
-------------+------------------------------           F(  3, 29237) =  252.70
       Model |  1.1296e+12     3  3.7652e+11           Prob > F      =  0.0000
    Residual |  4.3563e+13 29237  1.4900e+09           R-squared     =  0.0253
-------------+------------------------------           Adj R-squared =  0.0252
       Total |  4.4692e+13 29240  1.5285e+09           Root MSE      =   38600

----------------------------------------------------------------------------------------------
                     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
                       metro |
                Central city |   7255.712   668.0533    10.86   0.000     5946.297    8565.127
        Outside central city |   16013.39   593.9852    26.96   0.000     14849.15    17177.63
 Central city status unknown |   8368.313   758.7058    11.03   0.000     6881.216    9855.411
                             |
                       _cons |   27189.65   474.1327    57.35   0.000     26260.33    28118.97
----------------------------------------------------------------------------------------------
· Running the regression stores its results in Stata’s memory as e() results, including the coefficient vector e(b) and the variance-covariance matrix e(V). We can access them until we run another estimation command.
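· As a minimal sketch of what is stored (these commands are not part of the original session), the lines below list the e() results and pull out one coefficient and its standard error. They assume they are run right after the regress command above, before any other estimation command.

```stata
* Assumption: the regress command above was the last estimation command run.

* List everything regress left behind in e()
ereturn list

* Coefficient and standard error for Central city (metro==2)
display _b[2.metro]
display _se[2.metro]

* The coefficient vector and the variance-covariance matrix
matrix list e(b)
matrix list e(V)
```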
. codebook metro
-----------------------------------------------------------------------------------------------------
metro Metropolitan central city status
-----------------------------------------------------------------------------------------------------
                  type:  numeric (byte)
                 label:  metrolbl

                 range:  [0,4]                        units:  1
         unique values:  5                        missing .:  0/133710

            tabulation:  Freq.   Numeric  Label
                           340         0  Not identifiable
                         29658         1  Not in metro area
                         32481         2  Central city
                         51468         3  Outside central city
                         19763         4  Central city status unknown
. lincom 3.metro-2.metro
 ( 1)  - 2.metro + 3.metro = 0

------------------------------------------------------------------------------
     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   8757.676   591.1938    14.81   0.000      7598.91    9916.443
------------------------------------------------------------------------------
* As a reminder, the income difference between suburbs (outside central city) and central cities is about $8,758. But how does Stata calculate the SE of that difference?
. matrix VC=e(V)
· e(V) is the variance-covariance matrix of the coefficients, stored in memory after the regress command. Here we are creating a matrix equal to e(V) and calling it VC.
. matrix list VC
symmetric VC[5,5]
                   1b.          2.          3.          4.
                metro       metro       metro       metro       _cons
1b.metro            0
 2.metro            0    446295.2
 3.metro            0   224801.78   352818.47
 4.metro            0   224801.78   224801.78   575634.42
   _cons            0  -224801.78  -224801.78  -224801.78   224801.78
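· Because the variance-covariance matrix is symmetric, that is, Cov(X1,X2)=Cov(X2,X1), Stata only shows us the main diagonal and the lower triangle. The row and column of zeros belong to the omitted base category, 1b.metro (Not in metro area), so row and column 2 correspond to 2.metro, row and column 3 to 3.metro, and so on.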
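· A quick way to see the symmetry, sketched here under the assumption that VC still holds the matrix created by matrix VC=e(V) above: the upper-triangle element is stored even though matrix list does not print it.

```stata
* Assumption: VC was created earlier with: matrix VC=e(V)

* Covariance of the 2.metro and 3.metro coefficients, read both ways
display VC[3,2]
display VC[2,3]

* issymmetric() returns 1 when the matrix is symmetric
display issymmetric(VC)
```

Both display commands should show the same covariance that appears in row 3.metro, column 2.metro of the listing.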
. display 446295^0.5
668.05314
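* The SE of each coefficient is the square root of its variance, which sits on the main diagonal of the variance-covariance matrix. Here 668.05 matches the Std. Err. reported for Central city in the regression output (the last digits differ slightly because we typed the variance rounded to 446295).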
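* To avoid retyping a rounded variance, the same check can be sketched by reading the diagonal of VC directly. This assumes VC and the regress results are still in memory; the Mata line is an optional extra that is not part of the original session.

```stata
* Assumption: VC was created with matrix VC=e(V), and regress is still
* the most recent estimation command.

* Central city (2.metro): square root of the variance vs. the stored SE
display sqrt(VC[2,2])
display _se[2.metro]

* Outside central city (3.metro)
display sqrt(VC[3,3])
display _se[3.metro]

* All standard errors at once: square roots of the main diagonal
mata: sqrt(diagonal(st_matrix("VC")))
```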
. display (VC[2,2]+ VC[3,3]-(2*VC[2,3]))^0.5
591.19379
* The SE of the difference between two coefficients, b1 and b2, is sqrt(Var(b1) + Var(b2) - 2*Cov(b1,b2)); see my pdf file on the means and variance. Here b1 and b2 are the coefficients on 2.metro and 3.metro. Note that this 591.19 is the same SE that Stata reported for the difference when lincom was invoked.
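* The same calculation can be written in matrix form, which extends to any linear combination of coefficients. This is a sketch, not part of the original session: the names c and v are made up for illustration, and the contrast vector assumes the column order shown in the VC listing above.

```stata
* Assumption: VC was created with matrix VC=e(V), and regress is still
* the most recent estimation command.

* Contrast vector in the column order of VC:
* 1b.metro, 2.metro, 3.metro, 4.metro, _cons
matrix c = (0, -1, 1, 0, 0)

* Variance of the combination is c*VC*c', a 1 x 1 matrix
matrix v = c*VC*c'
display sqrt(v[1,1])

* Cross-checks using Stata's built-in commands
lincom 3.metro - 2.metro
test 3.metro = 2.metro
```

With the -1 and 1 entries, c*VC*c' works out to Var(b1) + Var(b2) - 2*Cov(b1,b2), so the displayed value should agree with the 591.19 above. Because test imposes a single restriction here, its F statistic should equal the square of the t statistic from lincom.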
. log close
name: <unnamed>
log: C:\Users\mexmi\Documents\newer web pages\soc_meth_proj3\fall_2018_logs\class9.log
log type: text
closed on: 22 Oct 2018, 12:46:02
-----------------------------------------------------------------------------------------------------