--------------------------------------------------------------------------------------------
       log:  C:\AAA Miker Files\newer web pages\soc_388_notes\soc_388_2003\clogg and eliason log.log
  log type:  text
 opened on:
. use "C:\AAA Miker Files\newer web pages\soc_388_notes\clogg and eliason data.dta", clear
. table color labor, contents(sum uwcount sum wtcount) by(sex)
----------------------------------------------
sex and   |              labor
color     | unemployed   part-time       other
----------+-----------------------------------
male      |
    white |       3511        4227       31467
          |    5024241     5951616    4.43e+07
          |
    black |        604         356        2245
          |    1160284      658244     3960180
          |
    other |        165         157         924
          |     169785      176468     1134672
----------+-----------------------------------
female    |
    white |       2281        7833       18945
          |    3179714    1.08e+07    2.66e+07
          |
    black |        545         563        2132
          |     929225      916001     3556176
          |
    other |         89         216         725
          |      91581      231120      817075
----------------------------------------------
. table color labor, contents(sum uwcount sum weight sum wtcount) by(sex)
----------------------------------------------
sex and   |              labor
color     | unemployed   part-time       other
----------+-----------------------------------
male      |
    white |       3511        4227       31467
          |       1431        1408        1408
          |    5024241     5951616    4.43e+07
          |
    black |        604         356        2245
          |       1921        1849        1764
          |    1160284      658244     3960180
          |
    other |        165         157         924
          |       1029        1124        1228
          |     169785      176468     1134672
----------+-----------------------------------
female    |
    white |       2281        7833       18945
          |       1394        1373        1405
          |    3179714    1.08e+07    2.66e+07
          |
    black |        545         563        2132
          |       1705        1627        1668
          |     929225      916001     3556176
          |
    other |         89         216         725
          |       1029        1070        1127
          |      91581      231120      817075
----------------------------------------------
. table color, contents (mean weight)
------------------------
color | mean(weight)
----------+-------------
white | 1403.17
black | 1755.67
other | 1101.17
------------------------
. *The weights are not uniform.
. *Uniform weights would have no effect on the model.
. *The CPS has weights because some populations respond to the survey at lower rates than others.
. *The weight we're using here is something like the inverse of the sampling frequency. Blacks are less likely to respond to the survey, so they end up in the sample at a lower rate, and the Blacks who do respond get a higher weight in the CPS.
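A quick arithmetic check of how the weight enters, using numbers copied from the two tables above: in each cell, wtcount is just uwcount times the cell's weight, so one respondent stands in for somewhere between about 1,000 and 1,900 people. This is only a sketch with display; it was not part of the original session.

```stata
* Cell values copied from the tables above (white men).
* Each product should reproduce the wtcount entry for that cell.
* unemployed: table shows 5024241
display 3511*1431
* part-time: table shows 5951616
display 4227*1408
* other: table shows 4.43e+07
display 31467*1408

* Mean of the six white cell weights; -table color, contents (mean weight)-
* reports 1403.17 for white.
display (1431 + 1408 + 1408 + 1394 + 1373 + 1405)/6
```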
. desmat: poisson uwcount labor*sex labor*color sex*color, desmat(zval)
option desmat() not allowed
r(198);
. desmat: poisson uwcount labor*sex labor*color sex*color, desrep(zval)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            uwcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -81627.074
Log likelihood:               -123.390
LR chi square:                163007.367
Model degrees of freedom:     13
Pseudo R-squared:             0.998
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   uwcount
   labor
 1   part-time                    0.210**      0.022         9.583
 2   other                        2.182**      0.017       127.658
   sex
 3   female                      -0.446**      0.025       -18.192
   labor.sex
 4   part-time.female             1.017**      0.030        33.621
 5   other.female                -0.049        0.026        -1.891
   color
 6   black                       -1.771**      0.035       -51.143
 7   other                       -3.178**      0.067       -47.667
   labor.color
 8   part-time.black             -1.043**      0.048       -21.928
 9   part-time.other             -0.380**      0.084        -4.550
10   other.black                 -0.822**      0.036       -22.834
11   other.other                 -0.292**      0.069        -4.238
   sex.color
12   female.black                 0.354**      0.027        13.279
13   female.other                 0.126**      0.044         2.891
14 _cons                          8.170**      0.016       502.506
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. poisgof
         Goodness-of-fit chi2  =  86.53056
         Prob > chi2(4)        =    0.0000
. poisgof, pearson
         Goodness-of-fit chi2  =  89.79915
         Prob > chi2(4)        =    0.0000
. *From the summary statistics you can tell this is C+E's model #1, but the z values don't match the z values they report. That's because C+E use deviation coding and omit the highest-numbered category of each variable.
. desmat: poisson uwcount labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            uwcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -81627.074
Log likelihood:               -123.390
LR chi square:                163007.367
Model degrees of freedom:     13
Pseudo R-squared:             0.998
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   uwcount
   labor
 1   unemployed                  -0.677**      0.018       -38.575
 2   part-time                   -0.433**      0.017       -26.205
   sex
 3   male                        -0.018*       0.009        -2.004
   labor.sex
 4   unemployed.male              0.161**      0.009        18.482
 5   part-time.male              -0.347**      0.007       -46.843
   color
 6   white                        1.852**      0.011       162.348
 7   black                       -0.364**      0.014       -25.659
   labor.color
 8   unemployed.white            -0.282**      0.018       -15.386
 9   unemployed.black             0.340**      0.022        15.425
10   part-time.white              0.193**      0.017        11.300
11   part-time.black             -0.229**      0.022       -10.465
   sex.color
12   male.white                   0.080**      0.009         9.162
13   male.black                  -0.097**      0.011        -8.679
14 _cons                          7.053**      0.011       640.281
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. *It's the same model, but with the same coding of the categorical variables that C+E use. That's deviation coding; see the desmat help file.
. poisgof
         Goodness-of-fit chi2  =  86.53056
         Prob > chi2(4)        =    0.0000
. poisgof, pearson.
option pearson. not allowed
r(198);
. poisgof, pearson
         Goodness-of-fit chi2  =  89.79915
         Prob > chi2(4)        =    0.0000
. *Same model, #1
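To see why the two codings give the same fit but different coefficients, here is a minimal sketch that builds deviation-coded columns by hand for a single three-category variable. It assumes Stata 11 or later (for the i. prefix) and uses only the three white-male counts from the first table as toy data; the names d1 and d2 are made up for the illustration, and this is not how desmat builds its columns internally.

```stata
* Toy data: labor-force counts for white men, copied from the first table.
clear
input labor uwcount
1  3511
2  4227
3 31467
end

* Deviation (effect) coding with category 3 omitted:
* the omitted category is scored -1 on every column.
generate d1 = (labor == 1) - (labor == 3)
generate d2 = (labor == 2) - (labor == 3)

* Indicator coding: effects are contrasts with the first category.
poisson uwcount i.labor

* Deviation coding: effects are deviations from the mean of the log counts.
* The log likelihood should match the indicator-coded fit.
poisson uwcount d1 d2

* The deviation effects sum to zero, so the omitted category's effect
* is minus the sum of the two reported ones.
display -(_b[d1] + _b[d2])
```

Same fitted counts either way; only the meaning of each coefficient (and therefore each z) changes.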
. *Number two, which is not reported in Clogg and Eliason, is to inflate each count by its weight and use the new, huge weighted counts as the dependent variable.
. desmat: poisson wtcount labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            wtcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -1.137e+08
Log likelihood:               -71806.645
LR chi square:                2.273e+08
Model degrees of freedom:     13
Pseudo R-squared:             0.999
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   wtcount
   labor
 1   unemployed                  -0.689**      0.001     -1329.498
 2   part-time                   -0.435**      0.000      -915.000
   sex
 3   male                         0.013**      0.000        50.973
   labor.sex
 4   unemployed.male              0.165**      0.000       717.857
 5   part-time.male              -0.340**      0.000     -1735.469
   color
 6   white                        1.858**      0.000      5625.057
 7   black                       -0.133**      0.000      -345.910
   labor.color
 8   unemployed.white            -0.263**      0.001      -490.386
 9   unemployed.black             0.383**      0.001       631.891
10   part-time.white              0.186**      0.000       380.579
11   part-time.black             -0.231**      0.001      -394.068
   sex.color
12   male.white                   0.059**      0.000       240.222
13   male.black                  -0.083**      0.000      -280.576
14 _cons                         14.294**      0.000     44573.248
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. poisgof
         Goodness-of-fit chi2  =  143357.3
         Prob > chi2(4)        =    0.0000
. poisgof, pearson
         Goodness-of-fit chi2  =  148750.5
         Prob > chi2(4)        =    0.0000
. *Using the weighted counts (here the weights are huge) does 2 things:
. *It shrinks the SEs to minuscule values.
. *It inflates the GOF test way out of town.
. *Using the weights in this way is the most wrong.
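The size of both distortions follows from the size of the weights. Multiplying every count by roughly k leaves the coefficients alone, divides each standard error by about sqrt(k), and multiplies the chi-square statistics by about k. A rough check with numbers copied from this log, where k is assumed to be the average of the three per-color mean weights, and the comparison figures come from the rescaled-count model fit later in the session:

```stata
* k: assumed inflation factor, the average of the three mean weights
* reported by -table color, contents (mean weight)-.
scalar k = (1403.17 + 1755.67 + 1101.17)/3
display "k       = " k
display "sqrt(k) = " sqrt(k)

* |z| for 'unemployed' is 35.281 with rescaled counts and 1329.498 with
* wtcount; the first times sqrt(k) should land close to the second.
display "35.281 * sqrt(k) = " 35.281*sqrt(k)

* Deviance GOF is 100.9525 with rescaled counts and 143357.3 with wtcount;
* the first times k should land close to the second.
display "100.9525 * k     = " 100.9525*k
```

The match is approximate, not exact, because the weights vary from cell to cell.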
. desmat, sigcut (0.001 0.0000001) sigsym(*** *&*)
varlist not allowed
r(101);
. desrep, sigcut (0.001 0.0000001) sigsym(*** *&*)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            wtcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -1.137e+08
Log likelihood:               -71806.645
LR chi square:                2.273e+08
Model degrees of freedom:     13
Pseudo R-squared:             0.999
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.
------------------------------------------------------------------------------------------
   wtcount
 1   _x_1                        -0.689*&*     0.001
 2   _x_2                        -0.435*&*     0.000
 3   _x_3                         0.013*&*     0.000
 4   _x_4                         0.165*&*     0.000
 5   _x_5                        -0.340*&*     0.000
 6   _x_6                         1.858*&*     0.000
 7   _x_7                        -0.133*&*     0.000
 8   _x_8                        -0.263*&*     0.001
 9   _x_9                         0.383*&*     0.001
10   _x_10                        0.186*&*     0.000
11   _x_11                       -0.231*&*     0.001
12   _x_12                        0.059*&*     0.000
13   _x_13                       -0.083*&*     0.000
14 _cons                         14.294*&*     0.000
------------------------------------------------------------------------------------------
*** p < .001
*&* p < 1.00000000e-07
. *You can do whatever you want with the cutoffs and symbols using desrep.
. desrep, zval
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            wtcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -1.137e+08
Log likelihood:               -71806.645
LR chi square:                2.273e+08
Model degrees of freedom:     13
Pseudo R-squared:             0.999
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   wtcount
 1   _x_1                        -0.689**      0.001     -1329.498
 2   _x_2                        -0.435**      0.000      -915.000
 3   _x_3                         0.013**      0.000        50.973
 4   _x_4                         0.165**      0.000       717.857
 5   _x_5                        -0.340**      0.000     -1735.469
 6   _x_6                         1.858**      0.000      5625.057
 7   _x_7                        -0.133**      0.000      -345.910
 8   _x_8                        -0.263**      0.001      -490.386
 9   _x_9                         0.383**      0.001       631.891
10   _x_10                        0.186**      0.000       380.579
11   _x_11                       -0.231**      0.001      -394.068
12   _x_12                        0.059**      0.000       240.222
13   _x_13                       -0.083**      0.000      -280.576
14 _cons                         14.294**      0.000     44573.248
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. *The model that uses wtcount as the dependent variable actually has the correct coefficients.
. desmat: poisson wtcount labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            wtcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -1.137e+08
Log likelihood:               -71806.645
LR chi square:                2.273e+08
Model degrees of freedom:     13
Pseudo R-squared:             0.999
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   wtcount
   labor
 1   unemployed                  -0.689**      0.001     -1329.498
 2   part-time                   -0.435**      0.000      -915.000
   sex
 3   male                         0.013**      0.000        50.973
   labor.sex
 4   unemployed.male              0.165**      0.000       717.857
 5   part-time.male              -0.340**      0.000     -1735.469
   color
 6   white                        1.858**      0.000      5625.057
 7   black                       -0.133**      0.000      -345.910
   labor.color
 8   unemployed.white            -0.263**      0.001      -490.386
 9   unemployed.black             0.383**      0.001       631.891
10   part-time.white              0.186**      0.000       380.579
11   part-time.black             -0.231**      0.001      -394.068
   sex.color
12   male.white                   0.059**      0.000       240.222
13   male.black                  -0.083**      0.000      -280.576
14 _cons                         14.294**      0.000     44573.248
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. *If you look at this disastrous model, the coefficients are actually correct; compare them to the final column of C+E table 6.
. *That might lead a person to simply rescale the weights and use (rescaled weight * uwcount) as the new depvar. That would be Clogg + Eliason #2.
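The log never shows how wt_count_rescale was built. One construction that is consistent with the output is to divide each weighted count by the mean cell weight, so the weights average 1 and the total stays near the unweighted sample size. A small sketch of that assumed construction, using three cells (white men) copied from the tables above; 1420 is the mean of the 18 cell weights in the second table:

```stata
* Assumption: wt_count_rescale = uwcount * weight / (mean cell weight).
* The log itself does not show this step.
clear
input uwcount weight
 3511 1431
 4227 1408
31467 1408
end

generate double wtcount = uwcount*weight
generate double wt_count_rescale = wtcount/1420

* Rescaled counts should sit close to the unweighted counts.
list uwcount weight wtcount wt_count_rescale
```

Dividing by a constant changes only the intercept, which is why the coefficients other than _cons carry over from the wtcount model.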
. desmat: poisson wt_count_rescale labor*sex labor*color sex*color, defcon(dev(3)) desrep(zval)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            wt_count_rescale
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -80153.975
Log likelihood:               -130.414
LR chi square:                160047.123
Model degrees of freedom:     13
Pseudo R-squared:             0.998
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   wt_count_rescale
   labor
 1   unemployed                  -0.689**      0.020       -35.281
 2   part-time                   -0.435**      0.018       -24.282
   sex
 3   male                         0.013        0.010         1.353
   labor.sex
 4   unemployed.male              0.165**      0.009        19.050
 5   part-time.male              -0.340**      0.007       -46.055
   color
 6   white                        1.858**      0.012       149.274
 7   black                       -0.133**      0.015        -9.180
   labor.color
 8   unemployed.white            -0.263**      0.020       -13.013
 9   unemployed.black             0.383**      0.023        16.769
10   part-time.white              0.186**      0.018        10.100
11   part-time.black             -0.231**      0.022       -10.457
   sex.color
12   male.white                   0.059**      0.009         6.375
13   male.black                  -0.083**      0.011        -7.446
14 _cons                          7.036**      0.012       582.209
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. *Okay. This model has the correct coefficients, because it takes the weights into account. And it has standard errors that are not crazy, because the scale of the dataset reflects the actual scale of the unweighted counts.
. *But the SEs are still not quite right.
. poisgof
         Goodness-of-fit chi2  =  100.9525
         Prob > chi2(4)        =    0.0000
. poisgof, pearson
         Goodness-of-fit chi2  =  104.7538
         Prob > chi2(4)        =    0.0000
. *The way Stata takes the weights into account is through an option called exposure.
. desmat: poisson uwcount labor*sex labor*color sex*color, exposure (invweight) defcon(dev(3)) desrep(zval)
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable            uwcount
Optimization:                 ml
Number of observations:       18
Initial log likelihood:       -84619.027
Log likelihood:               -124.919
LR chi square:                168988.216
Model degrees of freedom:     13
Pseudo R-squared:             0.999
Prob:                         0.000
------------------------------------------------------------------------------------------
nr Effect                         Coeff         s.e.             z
------------------------------------------------------------------------------------------
   uwcount
   labor
 1   unemployed                  -0.688**      0.018       -39.155
 2   part-time                   -0.440**      0.017       -26.614
   sex
 3   male                         0.013        0.009         1.381
   labor.sex
 4   unemployed.male              0.166**      0.009        18.987
 5   part-time.male              -0.343**      0.007       -46.302
   color
 6   white                        1.860**      0.011       163.037
 7   black                       -0.136**      0.014        -9.590
   labor.color
 8   unemployed.white            -0.265**      0.018       -14.436
 9   unemployed.black             0.386**      0.022        17.495
10   part-time.white              0.191**      0.017        11.175
11   part-time.black             -0.238**      0.022       -10.861
   sex.color
12   male.white                   0.058**      0.009         6.705
13   male.black                  -0.085**      0.011        -7.611
14 _cons                         14.292**      0.011      1296.842
   ln(invweight)               (offset)
------------------------------------------------------------------------------------------
*  p < .05
** p < .01
. poisgof
         Goodness-of-fit chi2  =  89.58815
         Prob > chi2(4)        =    0.0000
. poisgof, pearson
         Goodness-of-fit chi2  =  93.54537
         Prob > chi2(4)        =    0.0000
. *Here what you have is a model that has (very nearly) the coefficients of models 2 and 3, the wtcount and wt_count_rescale models, but the standard errors of model #1.
. exit, clear
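For anyone who wants to rerun the comparison without the original .dta file or desmat, here is a sketch that keys the 18 cells in from the tables in this log and fits the four models side by side. Several things are assumptions rather than part of the original session: Stata 11 or later (factor-variable notation), the constructions of wt_count_rescale and invweight (the log never shows them, but these definitions are consistent with its constants), and the stored-estimate names. It uses Stata's default indicator coding, so the coefficients line up with the first, indicator-coded table for model #1 rather than with the dev(3) tables; the pattern across models is the same.

```stata
* sex: 1 male, 2 female; color: 1 white, 2 black, 3 other;
* labor: 1 unemployed, 2 part-time, 3 other.
clear
input sex color labor uwcount weight
1 1 1  3511 1431
1 1 2  4227 1408
1 1 3 31467 1408
1 2 1   604 1921
1 2 2   356 1849
1 2 3  2245 1764
1 3 1   165 1029
1 3 2   157 1124
1 3 3   924 1228
2 1 1  2281 1394
2 1 2  7833 1373
2 1 3 18945 1405
2 2 1   545 1705
2 2 2   563 1627
2 2 3  2132 1668
2 3 1    89 1029
2 3 2   216 1070
2 3 3   725 1127
end

* Assumed constructions (not shown in the log).
generate double wtcount = uwcount*weight
summarize weight, meanonly
generate double wt_count_rescale = wtcount/r(mean)
generate double invweight = 1/weight

* Main effects plus all two-way interactions.
local rhs i.labor i.sex i.color i.labor#i.sex i.labor#i.color i.sex#i.color

* Model #1: unweighted counts.
poisson uwcount `rhs'
estimates store m_unweighted

* Inflated counts: right coefficients, absurd standard errors.
poisson wtcount `rhs'
estimates store m_inflated

* Rescaled weighted counts: right coefficients, plausible standard errors.
poisson wt_count_rescale `rhs'
estimates store m_rescaled

* Unweighted counts with ln(1/weight) as an offset.
poisson uwcount `rhs', exposure(invweight)
estimates store m_exposure

* Coefficients and standard errors side by side.
estimates table m_unweighted m_inflated m_rescaled m_exposure, b(%9.3f) se(%9.3f)
```

Run it as a do-file so the local macro is defined when the poisson lines execute. Stata will print a note about noninteger counts for wt_count_rescale; that is expected here.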