--------------------------------------------------------------------------------------------

       log:  C:\AAA Miker Files\newer web pages\soc_388_notes\soc_388_2003\clogg and eliason

>  log.log

  log type:  text

 opened on:  17 Nov 2003, 11:11:08

 

. use "C:\AAA Miker Files\newer web pages\soc_388_notes\clogg and eliason data.dta", clear

 

. table color labor, contents(sum uwcount sum wtcount) by(sex)

 

----------------------------------------------

sex and   |               labor              

color     | unemployed   part-time       other

----------+-----------------------------------

male      |

    white |       3511        4227       31467

          |    5024241     5951616    4.43e+07

          |

    black |        604         356        2245

          |    1160284      658244     3960180

          |

    other |        165         157         924

          |     169785      176468     1134672

----------+-----------------------------------

female    |

    white |       2281        7833       18945

          |    3179714    1.08e+07    2.66e+07

          |

    black |        545         563        2132

          |     929225      916001     3556176

          |

    other |         89         216         725

          |      91581      231120      817075

----------------------------------------------

 

. table color labor, contents(sum uwcount sum weight sum wtcount) by(sex)

 

----------------------------------------------

sex and   |               labor              

color     | unemployed   part-time       other

----------+-----------------------------------

male      |

    white |       3511        4227       31467

          |       1431        1408        1408

          |    5024241     5951616    4.43e+07

          |

    black |        604         356        2245

          |       1921        1849        1764

          |    1160284      658244     3960180

          |

    other |        165         157         924

          |       1029        1124        1228

          |     169785      176468     1134672

----------+-----------------------------------

female    |

    white |       2281        7833       18945

          |       1394        1373        1405

          |    3179714    1.08e+07    2.66e+07

          |

    black |        545         563        2132

          |       1705        1627        1668

          |     929225      916001     3556176

          |

    other |         89         216         725

          |       1029        1070        1127

          |      91581      231120      817075

----------------------------------------------

 

. table color, contents (mean weight)

 

------------------------

    color | mean(weight)

----------+-------------

    white |      1403.17

    black |      1755.67

    other |      1101.17

------------------------

 

. *The weights are not uniform.

. *Uniform weights would have no effect on the model.

. *The reason that the CPS has weights is to reflect the fact that some populations have lower

> rates of response to the survey than other populations.

. *The weight that we're using here is something like the inverse of the sampling frequency.

> Black respondents are effectively sampled at a lower rate because they are less likely to

> respond to the survey, so those who do respond to the CPS get a higher weight.
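
Judging from the two tables above, the weighted count in each cell looks like the unweighted count multiplied by the cell weight (for example, 3511 x 1431 = 5,024,241 for unemployed white males). The following is a minimal sketch of how that could be checked; `check_wt` is a scratch variable introduced only for this illustration, and the check assumes `weight` is stored unrounded.

```stata
* Sketch: confirm that wtcount is uwcount inflated by the cell weight.
* check_wt is a hypothetical scratch variable, not part of the original data.
generate double check_wt = uwcount * weight

* Stops with an error if any cell disagrees by more than a tiny relative amount.
assert reldif(check_wt, wtcount) < 1e-6

* Show the three quantities side by side, grouped by sex.
list sex color labor uwcount weight wtcount, sepby(sex) noobs
```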

. desmat: poisson uwcount labor*sex labor*color sex*color, desmat(zval)

option desmat() not allowed

r(198);

 

. desmat: poisson uwcount labor*sex labor*color sex*color, desrep(zval)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             uwcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -81627.074

   Log likelihood:                                                               -123.390

   LR chi square:                                                              163007.367

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.998

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   uwcount

     labor

1      part-time                                              0.210**     0.022     9.583

2      other                                                  2.182**     0.017   127.658

     sex

3      female                                                -0.446**     0.025   -18.192

     labor.sex

4      part-time.female                                       1.017**     0.030    33.621

5      other.female                                          -0.049       0.026    -1.891

     color

6      black                                                 -1.771**     0.035   -51.143

7      other                                                 -3.178**     0.067   -47.667

     labor.color

8      part-time.black                                       -1.043**     0.048   -21.928

9      part-time.other                                       -0.380**     0.084    -4.550

10     other.black                                           -0.822**     0.036   -22.834

11     other.other                                           -0.292**     0.069    -4.238

     sex.color

12     female.black                                           0.354**     0.027    13.279

13     female.other                                           0.126**     0.044     2.891

14   _cons                                                    8.170**     0.016   502.506

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  86.53056

         Prob > chi2(4)        =    0.0000

 

. poisgof, pearson

 

         Goodness-of-fit chi2  =  89.79915

         Prob > chi2(4)        =    0.0000

 

. *From the summary statistics, you can tell this is C+E's model 1, but the z values don't

> match the ones they report. That's because they use deviation coding and omit the

> highest-numbered category.

. desmat: poisson uwcount labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             uwcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -81627.074

   Log likelihood:                                                               -123.390

   LR chi square:                                                              163007.367

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.998

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   uwcount

     labor

1      unemployed                                            -0.677**     0.018   -38.575

2      part-time                                             -0.433**     0.017   -26.205

     sex

3      male                                                  -0.018*      0.009    -2.004

     labor.sex

4      unemployed.male                                        0.161**     0.009    18.482

5      part-time.male                                        -0.347**     0.007   -46.843

     color

6      white                                                  1.852**     0.011   162.348

7      black                                                 -0.364**     0.014   -25.659

     labor.color

8      unemployed.white                                      -0.282**     0.018   -15.386

9      unemployed.black                                       0.340**     0.022    15.425

10     part-time.white                                        0.193**     0.017    11.300

11     part-time.black                                       -0.229**     0.022   -10.465

     sex.color

12     male.white                                             0.080**     0.009     9.162

13     male.black                                            -0.097**     0.011    -8.679

14   _cons                                                    7.053**     0.011   640.281

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *It's the same model, but with the dummy variable coding that is the same as the coding

> that C+E use.  That's deviation coding, see the desmat help file.

. poisgof

 

         Goodness-of-fit chi2  =  86.53056

         Prob > chi2(4)        =    0.0000

 

. poisgof, pearson.

option pearson. not allowed

r(198);

 

. poisgof, pearson

 

         Goodness-of-fit chi2  =  89.79915

         Prob > chi2(4)        =    0.0000

 

. *Same model, #1
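
As a rough check that the two parameterizations describe the same fitted model, the indicator-coded coefficient for part-time (0.210, the part-time versus unemployed contrast for white males) can be rebuilt from the deviation-coded effects. The superscripts L, S, C (labor, sex, color) are notation introduced here, and the arithmetic uses the rounded coefficients printed in the log, so agreement is only to about the third decimal.

```latex
% Deviation coding: effects within a term sum to zero, so the omitted
% (third) labor category is minus the sum of the two printed effects.
\[
\lambda^{L}_{\text{other}} = -(-0.677 - 0.433) = 1.110
\]
% Part-time vs. unemployed for white males; the sex, color and sex.color
% terms cancel because both cells share the same sex and color.
\begin{align*}
\beta_{\text{part-time}}
 &= (\lambda^{L}_{\text{pt}} - \lambda^{L}_{\text{un}})
  + (\lambda^{LS}_{\text{pt,male}} - \lambda^{LS}_{\text{un,male}})
  + (\lambda^{LC}_{\text{pt,white}} - \lambda^{LC}_{\text{un,white}}) \\
 &= (-0.433 + 0.677) + (-0.347 - 0.161) + (0.193 + 0.282) \\
 &= 0.244 - 0.508 + 0.475 = 0.211 \approx 0.210
\end{align*}
```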

. *The second model, which is not reported in Clogg and Eliason, inflates the counts by the

> weight and uses the new, huge weighted counts as the dependent variable.

. desmat: poisson wtcount labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             wtcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -1.137e+08

   Log likelihood:                                                             -71806.645

   LR chi square:                                                               2.273e+08

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   wtcount

     labor

1      unemployed                                            -0.689**     0.001 -1329.498

2      part-time                                             -0.435**     0.000  -915.000

     sex

3      male                                                   0.013**     0.000    50.973

     labor.sex

4      unemployed.male                                        0.165**     0.000   717.857

5      part-time.male                                        -0.340**     0.000 -1735.469

     color

6      white                                                  1.858**     0.000  5625.057

7      black                                                 -0.133**     0.000  -345.910

     labor.color

8      unemployed.white                                      -0.263**     0.001  -490.386

9      unemployed.black                                       0.383**     0.001   631.891

10     part-time.white                                        0.186**     0.000   380.579

11     part-time.black                                       -0.231**     0.001  -394.068

     sex.color

12     male.white                                             0.059**     0.000   240.222

13     male.black                                            -0.083**     0.000  -280.576

14   _cons                                                   14.294**     0.000 44573.248

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  143357.3

         Prob > chi2(4)        =    0.0000

 

. poisgof, pearson

 

         Goodness-of-fit chi2  =  148750.5

         Prob > chi2(4)        =    0.0000

 

. *Use of the weighted counts (here the weights are huge) does 2 things.

. *It shrinks the SEs to minuscule values.

. *It inflates the GOF statistics far beyond any sensible value.

. *Using the weights in this way is the most wrong of the approaches tried here.

. desmat, sigcut (0.001 0.0000001) sigsym(*** *&*)

varlist not allowed

r(101);

 

. desrep, sigcut (0.001 0.0000001) sigsym(*** *&*)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             wtcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -1.137e+08

   Log likelihood:                                                             -71806.645

   LR chi square:                                                               2.273e+08

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                              Coeff         s.e.

------------------------------------------------------------------------------------------

   wtcount

1    _x_1                                                             -0.689*&*     0.001

2    _x_2                                                             -0.435*&*     0.000

3    _x_3                                                              0.013*&*     0.000

4    _x_4                                                              0.165*&*     0.000

5    _x_5                                                             -0.340*&*     0.000

6    _x_6                                                              1.858*&*     0.000

7    _x_7                                                             -0.133*&*     0.000

8    _x_8                                                             -0.263*&*     0.001

9    _x_9                                                              0.383*&*     0.001

10   _x_10                                                             0.186*&*     0.000

11   _x_11                                                            -0.231*&*     0.001

12   _x_12                                                             0.059*&*     0.000

13   _x_13                                                            -0.083*&*     0.000

14   _cons                                                            14.294*&*     0.000

------------------------------------------------------------------------------------------

*** p < .001

*&* p < 1.00000000e-07

 

. *You can do whatever you want with the cutoffs and symbols using desrep

. desrep, zval

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             wtcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -1.137e+08

   Log likelihood:                                                             -71806.645

   LR chi square:                                                               2.273e+08

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   wtcount

1    _x_1                                                    -0.689**     0.001 -1329.498

2    _x_2                                                    -0.435**     0.000  -915.000

3    _x_3                                                     0.013**     0.000    50.973

4    _x_4                                                     0.165**     0.000   717.857

5    _x_5                                                    -0.340**     0.000 -1735.469

6    _x_6                                                     1.858**     0.000  5625.057

7    _x_7                                                    -0.133**     0.000  -345.910

8    _x_8                                                    -0.263**     0.001  -490.386

9    _x_9                                                     0.383**     0.001   631.891

10   _x_10                                                    0.186**     0.000   380.579

11   _x_11                                                   -0.231**     0.001  -394.068

12   _x_12                                                    0.059**     0.000   240.222

13   _x_13                                                   -0.083**     0.000  -280.576

14   _cons                                                   14.294**     0.000 44573.248

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *The model that uses wtcount as the dependent variable actually has the correct coefficients

. desmat: poisson wtcount labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             wtcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -1.137e+08

   Log likelihood:                                                             -71806.645

   LR chi square:                                                               2.273e+08

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   wtcount

     labor

1      unemployed                                            -0.689**     0.001 -1329.498

2      part-time                                             -0.435**     0.000  -915.000

     sex

3      male                                                   0.013**     0.000    50.973

     labor.sex

4      unemployed.male                                        0.165**     0.000   717.857

5      part-time.male                                        -0.340**     0.000 -1735.469

     color

6      white                                                  1.858**     0.000  5625.057

7      black                                                 -0.133**     0.000  -345.910

     labor.color

8      unemployed.white                                      -0.263**     0.001  -490.386

9      unemployed.black                                       0.383**     0.001   631.891

10     part-time.white                                        0.186**     0.000   380.579

11     part-time.black                                       -0.231**     0.001  -394.068

     sex.color

12     male.white                                             0.059**     0.000   240.222

13     male.black                                            -0.083**     0.000  -280.576

14   _cons                                                   14.294**     0.000 44573.248

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *If you look at this disastrous model, the coefficients are actually correct; compare them

> to the final column of C+E table 6.

. *That might lead a person to simply rescale the weights and use (rescaled weight * uwcount)

> as the new depvar.  That would be Clogg and Eliason's model 2.

. desmat: poisson  wt_count_rescale labor*sex labor*color sex*color, defcon(dev(3)) desrep

> (zval)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                    wt_count_rescale

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -80153.975

   Log likelihood:                                                               -130.414

   LR chi square:                                                              160047.123

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.998

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   wt_count_rescale

     labor

1      unemployed                                            -0.689**     0.020   -35.281

2      part-time                                             -0.435**     0.018   -24.282

     sex

3      male                                                   0.013       0.010     1.353

     labor.sex

4      unemployed.male                                        0.165**     0.009    19.050

5      part-time.male                                        -0.340**     0.007   -46.055

     color

6      white                                                  1.858**     0.012   149.274

7      black                                                 -0.133**     0.015    -9.180

     labor.color

8      unemployed.white                                      -0.263**     0.020   -13.013

9      unemployed.black                                       0.383**     0.023    16.769

10     part-time.white                                        0.186**     0.018    10.100

11     part-time.black                                       -0.231**     0.022   -10.457

     sex.color

12     male.white                                             0.059**     0.009     6.375

13     male.black                                            -0.083**     0.011    -7.446

14   _cons                                                    7.036**     0.012   582.209

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *Okay.  This model has the correct coefficients, because it takes the weights into

> account.  And it has standard errors that are not crazy, because the scale of the dataset

> reflects the actual scale of the unweighted counts.

. *But the SE are still not quite right.
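
The log does not show how `wt_count_rescale` was built. The constants of the `wtcount` and `wt_count_rescale` fits (14.294 and 7.036) differ by about ln(1420), and 1420 is also the unweighted mean of the 18 cell weights in the second table above, so one guess is that the weighted counts were divided by the mean cell weight. The sketch below follows that guess; `wt_rescale_sketch` is a hypothetical name, and the construction is an assumption rather than the author's documented method.

```stata
* Hypothetical reconstruction of a rescaled weighted count.
* Assumes the scale factor is the mean of weight over the 18 cells.
summarize weight, meanonly
generate double wt_rescale_sketch = wtcount / r(mean)

* If the guess is right, the two variables should agree closely.
compare wt_rescale_sketch wt_count_rescale
```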

. poisgof

 

         Goodness-of-fit chi2  =  100.9525

         Prob > chi2(4)        =    0.0000

 

. poisgof, pearson

 

         Goodness-of-fit chi2  =  104.7538

         Prob > chi2(4)        =    0.0000
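
The gap between the weighted-count fit and the rescaled fit is what one would expect from multiplying every count by a constant k. The derivation below is a sketch under that assumption (same design matrix, y replaced by ky), where D is the deviance and X^2 the Pearson statistic; the value k of about 1420 is inferred from the printed output and is not stated in the log.

```latex
% Scaling y_i -> k y_i moves only the constant (by ln k), so the
% fitted values become k * mu_i and every other coefficient is unchanged.
\begin{align*}
D(ky) &= 2\sum_i \left[ k y_i \ln\frac{k y_i}{k\hat\mu_i}
         - (k y_i - k\hat\mu_i) \right] = k\,D(y), \\
X^2(ky) &= \sum_i \frac{(k y_i - k\hat\mu_i)^2}{k\hat\mu_i} = k\,X^2(y), \\
\operatorname{se}(\hat\beta \mid ky) &= \operatorname{se}(\hat\beta \mid y)\,/\sqrt{k}.
\end{align*}
% Check against the log:
%   143357.3 / 100.9525 \approx 1420    (deviance)
%   148750.5 / 104.7538 \approx 1420    (Pearson)
%   14.294 - 7.036 = 7.258 \approx \ln 1420
%   1329.498 / 35.281 \approx 37.7 \approx \sqrt{1420}   (z for unemployed)
```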

 

. *Stata's poisson takes the weights into account through the exposure() option, which enters ln(invweight) as an offset.
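
Written out, the exposure model is the equation below. It assumes that `invweight` is the reciprocal of the cell weight, which the log does not show but which fits the constant of the fit that follows (14.292, against 14.294 for the weighted-count model). Here x_i is the row of the design matrix for cell i.

```latex
% exposure(invweight) adds ln(invweight) with its coefficient fixed at 1.
\begin{align*}
\ln E[\texttt{uwcount}_i] &= \ln(\texttt{invweight}_i) + x_i'\beta \\
\intertext{and, assuming $\texttt{invweight}_i = 1/\texttt{weight}_i$,}
\ln\bigl(\texttt{weight}_i \, E[\texttt{uwcount}_i]\bigr) &= x_i'\beta
\end{align*}
```

So the coefficients describe the weighted scale, while the likelihood (and hence the standard errors) is built from the unweighted counts.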

. desmat: poisson  uwcount labor*sex labor*color sex*color, exposure (invweight) defcon(de

> v(3)) desrep(zval)

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                             uwcount

   Optimization:                                                                       ml

   Number of observations:                                                             18

   Initial log likelihood:                                                     -84619.027

   Log likelihood:                                                               -124.919

   LR chi square:                                                              168988.216

   Model degrees of freedom:                                                           13

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                     Coeff        s.e.       z 

------------------------------------------------------------------------------------------

   uwcount

     labor

1      unemployed                                            -0.688**     0.018   -39.155

2      part-time                                             -0.440**     0.017   -26.614

     sex

3      male                                                   0.013       0.009     1.381

     labor.sex

4      unemployed.male                                        0.166**     0.009    18.987

5      part-time.male                                        -0.343**     0.007   -46.302

     color

6      white                                                  1.860**     0.011   163.037

7      black                                                 -0.136**     0.014    -9.590

     labor.color

8      unemployed.white                                      -0.265**     0.018   -14.436

9      unemployed.black                                       0.386**     0.022    17.495

10     part-time.white                                        0.191**     0.017    11.175

11     part-time.black                                       -0.238**     0.022   -10.861

     sex.color

12     male.white                                             0.058**     0.009     6.705

13     male.black                                            -0.085**     0.011    -7.611

14   _cons                                                   14.292**     0.011  1296.842

     ln(invweight)                                                     (offset)

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  89.58815

         Prob > chi2(4)        =    0.0000

 

. poisgof, pearson

 

         Goodness-of-fit chi2  =  93.54537

         Prob > chi2(4)        =    0.0000

 

. *This model has approximately the coefficients of models 2 and 3, but the standard errors of model 1.
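
For readers without `desmat`, the exposure model can probably be fitted with Stata's built-in factor-variable notation (Stata 11 or later). This is a sketch under assumptions: `labor`, `sex`, and `color` are taken to be numeric variables with value labels, and `invweight_sketch` is a hypothetical stand-in for the log's `invweight`, assumed to be 1/weight. Factor variables use indicator coding, so the coefficients will not match the deviation-coded table above, although the fit statistics should.

```stata
* Sketch: exposure model with built-in factor variables instead of desmat.
* invweight_sketch is a hypothetical name; its definition is an assumption.
generate double invweight_sketch = 1 / weight

* All two-way interactions among labor, sex, and color; no three-way term.
poisson uwcount i.labor##i.sex i.labor##i.color i.sex##i.color, ///
    exposure(invweight_sketch)

* Deviance and Pearson goodness-of-fit tests (successor to poisgof).
estat gof
```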

. exit, clear