---------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Documents and Settings\Michael Rosenfeld\My Documents\newer web p

> ages\soc_meth_proj3\2010_logs\class_eleven.log

  log type:  text

 opened on:   2 Mar 2010, 14:59:04

 

. use "C:\Documents and Settings\Michael Rosenfeld\Desktop\cps_mar_2000_new.dta",

>  clear

 

. regress incwage vietnam_vet male age  age_sq yrsed if age>=25 & age<=64 [aweight= perwt_rounded]

(sum of wgt is   1.4261e+08)

 

      Source |       SS       df       MS              Number of obs =   69305

-------------+------------------------------           F(  5, 69299) = 3127.96

       Model |  1.3427e+13     5  2.6853e+12           Prob > F      =  0.0000

    Residual |  5.9492e+13 69299   858488914           R-squared     =  0.1841

-------------+------------------------------           Adj R-squared =  0.1841

       Total |  7.2919e+13 69304  1.0522e+09           Root MSE      =   29300

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

 vietnam_vet |    1035.18   532.7493     1.94   0.052    -9.007979    2079.367

        male |   16607.58   228.9415    72.54   0.000     16158.85     17056.3

         age |   2848.096   87.34381    32.61   0.000     2676.902     3019.29

      age_sq |  -31.92762   .9924702   -32.17   0.000    -33.87286   -29.98238

       yrsed |   3540.933   38.50133    91.97   0.000      3465.47    3616.395

       _cons |   -88294.8   1901.336   -46.44   0.000    -92021.42   -84568.19

------------------------------------------------------------------------------

 

. * This is M5 from HW3, which should look familiar. I bring it up because several students in the class made the following mistake when adding their own variables: they treated a categorical variable (in this case race) as if it were a continuous variable whose numeric values really meant something. Stata doesn't know the difference, so you get the output below.

 

. regress incwage vietnam_vet male age  age_sq yrsed race if age>=25 & age<=64 [aweight= perwt_rounded]

(sum of wgt is   1.4261e+08)

 

      Source |       SS       df       MS              Number of obs =   69305

-------------+------------------------------           F(  6, 69298) = 2612.57

       Model |  1.3452e+13     6  2.2419e+12           Prob > F      =  0.0000

    Residual |  5.9467e+13 69298   858139311           R-squared     =  0.1845

-------------+------------------------------           Adj R-squared =  0.1844

       Total |  7.2919e+13 69304  1.0522e+09           Root MSE      =   29294

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

 vietnam_vet |   945.3285      532.9     1.77   0.076    -99.15456    1989.812

        male |   16596.75   228.9036    72.51   0.000      16148.1     17045.4

         age |   2844.023   87.32927    32.57   0.000     2672.858    3015.189

      age_sq |   -31.9051   .9922768   -32.15   0.000    -33.84996   -29.96024

       yrsed |   3546.516   38.50734    92.10   0.000     3471.042     3621.99

        race |  -5.353826   .9902242    -5.41   0.000    -7.294664   -3.412988

       _cons |   -87499.2   1906.635   -45.89   0.000     -91236.2    -83762.2

------------------------------------------------------------------------------

 

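. * The key error that many people made was treating a categorical variable as if it were a continuous variable whose numbers really meant something. The coefficient of -5.35 on race above has no sensible interpretation, because the race codes are arbitrary labels, not quantities.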

 

. tabulate race

 

                                 Race |      Freq.     Percent        Cum.

--------------------------------------+-----------------------------------

                                White |    113,475       84.87       84.87

                          Black/Negro |     13,626       10.19       95.06

         American Indian/Aleut/Eskimo |      1,894        1.42       96.47

            Asian or Pacific Islander |      4,715        3.53      100.00

--------------------------------------+-----------------------------------

                                Total |    133,710      100.00

 

. tabulate race, nolab

 

       Race |      Freq.     Percent        Cum.

------------+-----------------------------------

        100 |    113,475       84.87       84.87

        200 |     13,626       10.19       95.06

        300 |      1,894        1.42       96.47

        650 |      4,715        3.53      100.00

------------+-----------------------------------

      Total |    133,710      100.00
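
To see why the race coefficient in the second regression is not interpretable, it may help to work out what it implies. The short sketch below is not part of the original session. It multiplies the estimated slope (-5.353826, copied from the output above) by the distance between the numeric codes that tabulate race, nolab reports. The scalar name b_race is made up for this illustration, and nothing is re-estimated.

```stata
* Illustration only: what a linear "race" slope would imply about income gaps.
* Slope copied from the regression output above (dollars per unit of the race code).
scalar b_race = -5.353826

* Implied gap relative to White (code 100) = slope * (code - 100).
display "Black/Negro (200):                  " b_race*(200-100)
display "American Indian/Aleut/Eskimo (300): " b_race*(300-100)
display "Asian or Pacific Islander (650):    " b_race*(650-100)
```

The implied gaps are roughly -535, -1,071, and -2,945 dollars. They are forced to grow in proportion to the codes 100, 200, 300, and 650, so their ordering and spacing are set by an arbitrary numbering scheme rather than by the data.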

 

*The proper syntax for a categorical variable is to put "i." in front of the variable name, which tells Stata to make the dummy variables. There are 4 categories, so we get 3 dummy variables; the omitted category, White (code 100), is the reference group, and each race coefficient is a difference from it. The syntax below is correct. If a variable is categorical and is not already coded 0-1, you need to tell Stata to make dummy variables.

 

. regress incwage vietnam_vet male age  age_sq yrsed i.race if age>=25 & age<=64 [aweight= perwt_rounded]

(sum of wgt is   1.4261e+08)

 

      Source |       SS       df       MS              Number of obs =   69305

-------------+------------------------------           F(  8, 69296) = 1982.42

       Model |  1.3580e+13     8  1.6976e+12           Prob > F      =  0.0000

    Residual |  5.9339e+13 69296   856305835           R-squared     =  0.1862

-------------+------------------------------           Adj R-squared =  0.1861

       Total |  7.2919e+13 69304  1.0522e+09           Root MSE      =   29263

 

------------------------------------------------------------------------------

     incwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

 vietnam_vet |   1009.395   532.4176     1.90   0.058     -34.1431    2052.932

        male |    16524.6   228.7384    72.24   0.000     16076.28    16972.93

         age |   2844.569   87.23672    32.61   0.000     2673.585    3015.553

      age_sq |   -31.9709   .9912431   -32.25   0.000    -33.91373   -30.02807

       yrsed |     3510.6   38.58087    90.99   0.000     3434.981    3586.218

             |

        race |

        200  |  -4288.975   342.9176   -12.51   0.000    -4961.093   -3616.857

        300  |  -6117.347   1183.335    -5.17   0.000    -8436.682   -3798.013

        650  |  -1226.535   562.7613    -2.18   0.029    -2329.546   -123.5237

             |

       _cons |  -86985.25   1901.461   -45.75   0.000    -90712.11   -83258.39

------------------------------------------------------------------------------
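
To make explicit what "i." is doing, the dummy variables can also be built by hand. The following is a minimal sketch, not something run in this session. The variable names black, amind, and asian_pi are made up for illustration, and the /// line continuation assumes the commands are run from a do-file rather than typed interactively.

```stata
* Build the three dummies by hand; White (race == 100) is the omitted reference category.
* The "if !missing(race)" qualifier keeps missing race values missing instead of coding them 0.
generate byte black    = (race == 200) if !missing(race)
generate byte amind    = (race == 300) if !missing(race)
generate byte asian_pi = (race == 650) if !missing(race)

* Same model as above, with the hand-made dummies in place of i.race.
regress incwage vietnam_vet male age age_sq yrsed black amind asian_pi ///
    if age>=25 & age<=64 [aweight=perwt_rounded]
```

This should reproduce the three race coefficients from the i.race regression, since it is the same model written out longhand. The "i." factor-variable notation requires Stata 11 or later; in older versions, prefixing the command with xi: (as in xi: regress ... i.race) serves the same purpose.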

 

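*Alternatively, we could use desmat, which treats variables as categorical unless they are marked continuous with the @ prefix (here age, age_sq, and yrsed). The coefficients and standard errors match the i.race regression above: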

 

 

. desmat: regress incwage vietnam_vet male @age  @age_sq @yrsed race if age>=25 & age<=64 [aweight= perwt_rounded]

---------------------------------------------------------------------------------

   Linear regression

---------------------------------------------------------------------------------

   Dependent variable                                                    incwage

   Number of observations:                                                 69305

   aweight:                                                        perwt_rounded

   F statistic:                                                         1982.417

   Model degrees of freedom:                                                   8

   Residual degrees of freedom:                                            69296

   R-squared:                                                              0.186

   Adjusted R-squared:                                                     0.186

   Root MSE                                                            29262.704

   Prob:                                                                   0.000

---------------------------------------------------------------------------------

nr Effect                                                      Coeff        s.e.

---------------------------------------------------------------------------------

   vietnam_vet

1    1                                                      1009.395     532.418

   male

2    male                                                  16524.603**   228.738

3  Age                                                      2844.569**    87.237

4  age_sq                                                    -31.971**     0.991

5  based on educrec                                         3510.600**    38.581

   race

6    Black/Negro                                           -4288.975**   342.918

7    American Indian/Aleut/Eskimo                          -6117.347**  1183.335

8    Asian or Pacific Islander                             -1226.535*    562.761

9  _cons                                                  -86985.250**  1901.461

---------------------------------------------------------------------------------

*  p < .05

** p < .01

 

.

 

. exit, clear