. use "C:\Users\mexmi\Desktop\cps_mar_2000_new.dta", clear

·        Let’s start with a regression using categorical metro as the predictor, since this is a regression we are familiar with.

. regress incwage i.metro if age>29 & age<65 & sex==1 & metro~=0

      Source |       SS       df       MS              Number of obs =   29241
-------------+------------------------------           F(  3, 29237) =  252.70
       Model |  1.1296e+12     3  3.7652e+11           Prob > F      =  0.0000
    Residual |  4.3563e+13 29237  1.4900e+09           R-squared     =  0.0253
       Total |  4.4692e+13 29240  1.5285e+09           Root MSE      =   38600

-----------------------------------------------------------------------------------------------
                      incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------------+----------------------------------------------------------------
                        metro |
                 Central city |   7255.712   668.0533    10.86   0.000     5946.297    8565.127
         Outside central city |   16013.39   593.9852    26.96   0.000     14849.15    17177.63
  Central city status unknown |   8368.313   758.7058    11.03   0.000     6881.216    9855.411
                              |
                        _cons |   27189.65   474.1327    57.35   0.000     26260.33    28118.97
-----------------------------------------------------------------------------------------------

·        Running the regression stores its results (the coefficients, their variance-covariance matrix, the sample size, and so on) in Stata’s memory, where we can access them until we run another estimation command.
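To see what is stored, something like the following could be typed right after the regression. This is only a sketch: it assumes the regress command above is still the most recent estimation command, and it uses the standard stored results e(N), e(rmse), and e(b).

```stata
* Assumes the regress command above is the most recent estimation command.

* List every scalar, macro, and matrix that regress left behind
ereturn list

* Number of observations used in the regression
display e(N)

* Root mean squared error (the "Root MSE" in the output header)
display e(rmse)

* Row vector of coefficients, including the omitted base category
matrix list e(b)
```

The values shown by e(N) and e(rmse) should agree with the "Number of obs" and "Root MSE" entries in the regression header.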

. codebook metro

metro                                          Metropolitan central city status

                  type:  numeric (byte)
                 label:  metrolbl

                 range:  [0,4]                        units:  1
         unique values:  5                        missing .:  0/133710

            tabulation:  Freq.   Numeric  Label
                           340         0  Not identifiable
                         29658         1  Not in metro area
                         32481         2  Central city
                         51468         3  Outside central city
                         19763         4  Central city status unknown

. lincom 3.metro-2.metro

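 ( 1)  - 2.metro + 3.metro = 0

------------------------------------------------------------------------------
     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   8757.676   591.1938    14.81   0.000      7598.91    9916.443
------------------------------------------------------------------------------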

* As a reminder, the income difference between suburbs and central cities (3.metro minus 2.metro) is about $8,758. But how does Stata compute the SE of that difference?

. matrix VC=e(V)

·        e(V) is the variance-covariance matrix of the coefficient estimates, which Stata keeps in memory after the regress command. (Stata is case sensitive, so it must be written e(V), not E(V).) Here we copy e(V) into a new matrix and call it VC.

. matrix list VC

symmetric VC[5,5]
                  1b.          2.          3.          4.
               metro       metro       metro       metro       _cons
 1b.metro          0
  2.metro          0    446295.2
  3.metro          0   224801.78   352818.47
  4.metro          0   224801.78   224801.78   575634.42
    _cons          0  -224801.78  -224801.78  -224801.78   224801.78

·        Because Cov(X1,X2)=Cov(X2,X1), the variance-covariance matrix is symmetric, so Stata shows only the main diagonal and the lower triangle.

. display 446295^0.5

668.05314

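* The SE of each coefficient is the square root of its variance, which sits on the main diagonal of the variance-covariance matrix. Here 668.05 matches the Std. Err. that regress reported for Central city.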
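The same check can be made without retyping the variance by hand. A minimal sketch, assuming the regression above is still in memory and that VC was created with matrix VC=e(V) as shown earlier; _se[] is Stata's accessor for a coefficient's standard error.

```stata
* Assumes: regress results still in memory, and matrix VC=e(V) already run.

* Central city: square root of the [2,2] diagonal element vs. the stored SE
display sqrt(VC[2,2])
display _se[2.metro]

* Outside central city: square root of the [3,3] diagonal element vs. the stored SE
display sqrt(VC[3,3])
display _se[3.metro]
```

Each pair of lines should display the same number, matching the Std. Err. column of the regression table.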

. display (VC[2,2]+ VC[3,3]-(2*VC[2,3]))^0.5

591.19379

* The SE of the difference between two coefficients, b1 and b2, is sqrt(var(b1) + var(b2) - 2*cov(b1,b2)); see my pdf file on means and variances. Note that this 591.19 is the same SE that Stata reported for the difference when lincom was invoked.
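The formula is a special case of a more general rule: for any linear combination c*b of the coefficients, the variance is c*V*c'. The sketch below is an illustration, not what lincom is documented to do internally. The names Vhand, c, and vdiff are made up for this example, and the matrix is typed in from the matrix list output above so the lines can be run from a do-file without the data set loaded.

```stata
* Variance-covariance matrix typed in by hand from the matrix list output.
* Row/column order: 1b.metro, 2.metro, 3.metro, 4.metro, _cons
matrix Vhand = (0,          0,          0,          0,          0 \ ///
                0,   446295.2,  224801.78,  224801.78, -224801.78 \ ///
                0,  224801.78,  352818.47,  224801.78, -224801.78 \ ///
                0,  224801.78,  224801.78,  575634.42, -224801.78 \ ///
                0, -224801.78, -224801.78, -224801.78,  224801.78)

* Contrast vector picking out 3.metro - 2.metro
matrix c = (0, -1, 1, 0, 0)

* Variance of the contrast: c*V*c' is a 1x1 matrix
matrix vdiff = c*Vhand*c'

* Its square root is the SE of the difference
display sqrt(vdiff[1,1])
```

With this contrast vector, c*V*c' reduces to VC[2,2] + VC[3,3] - 2*VC[2,3], so the displayed value should come out near 591.19. Changing c gives the SE of any other comparison, for example (0, -1, 0, 1, 0) for 4.metro minus 2.metro.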

