log type: text
opened on: 12 Oct 2005, 10:58:52
. *First, let me show you the index of dissimilarity (ID) and BIC
. *We start with our simple dataset A
. edit
(3 vars, 4 obs pasted into editor)
- preserve
. desmat: poisson count race occ
------------------------------------------------------------------------------------------
Poisson regression
------------------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 4
Initial log likelihood: -26656.550
Log likelihood: -59.074
LR chi square: 53194.953
Model degrees of freedom: 2
Pseudo R-squared: 0.998
Prob: 0.000
------------------------------------------------------------------------------------------
nr Effect Coeff s.e.
------------------------------------------------------------------------------------------
count
race
1 w 1.829** 0.011
occ
2 WC -0.921** 0.008
3 _cons 8.825** 0.011
------------------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 73.77235
Prob > chi2(1) = 0.0000
. set linesize 79
. tabulate race occ [fweight=count]
| occ
race | Oth WC | Total
-----------+----------------------+----------
n | 7,146 2,361 | 9,507
w | 42,012 17,216 | 59,228
-----------+----------------------+----------
Total | 49,158 19,577 | 68,735
. *BIC = GOF chi2 - df*ln(N)
. display 73.77-(1*(ln(68735)))
62.631986
. *A positive BIC means the model is rejected in comparison to the saturated model, whereas a negative BIC means the model is preferred to the saturated model
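. *A minimal do-file sketch of the BIC arithmetic above; it is not part of the original session. The scalar names are made up for illustration, and the values are copied from the poisgof and tabulate output.

```stata
* BIC relative to the saturated model: deviance - df*ln(N)
scalar gof_chi2 = 73.77235   // goodness-of-fit chi2 from poisgof
scalar gof_df   = 1          // residual degrees of freedom from poisgof
scalar n_total  = 68735      // table total from tabulate
display "BIC = " gof_chi2 - gof_df*ln(n_total)
```

. *Using the unrounded chi2 gives roughly 62.63, in line with the hand calculation above.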
. predict A_indep
(option n assumed; predicted number of events)
. generate ID_parts= 50*(abs((A_indep/68735)-(count/68735)))
. table race occ, contents(sum ID_parts) row col
----------------------------------------
| occ
race | Oth WC Total
----------+-----------------------------
n | .2522511 .2522511 .5045021
w | .2522511 .2522511 .5045021
|
Total | .5045021 .5045021 1.009004
----------------------------------------
. *The ID for this model is the cell statistic summed over all cells. I generate the cell statistic, table it with sums, and read off the grand total, which is 1.009
. *This is just a brief demonstration of what I mean by BIC and ID, and how to calculate them.
. *The interpretation of ID is the percentage of the actual data that would have to move to fit the model, or vice versa. Smaller numbers mean a smaller percentage needs to move, which means better fit. In this case roughly 1% of the dataset would have to move from one cell to another to transform the actual data into the data expected under the hypothesis of independence. Equivalently, the same 1% would have to move from the expected values under independence to recreate the actual data. 1% may not seem like a lot, but in this case that 1% is not only statistically significant but also substantial when applied to the whole US labor market of 100 million persons.
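. *A do-file sketch that computes the same ID as a single number instead of reading it off the table. It is not part of the original session; it assumes count and A_indep (from predict above) are still in memory, and the names n_total and id_cell are made up for illustration.

```stata
* Index of dissimilarity: 50 * sum over cells of |fitted/N - observed/N|
quietly summarize count
scalar n_total = r(sum)                       // total count over all cells
generate double id_cell = 50*abs(A_indep - count)/n_total
quietly summarize id_cell
display "ID = " %5.3f r(sum) " percent"       // sum of the per-cell parts
```

. *The factor of 50 is 100 (to express the result in percent) times 1/2 (so each misplaced case is counted once rather than once in the cell it leaves and once in the cell it enters).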
. clear all
. use "C:\AAA Miker Files\newer web pages\soc_388_notes\ed intermar.dta", clear
. *reminder, now back to the educational intermarriage dataset.
. desmat: poisgof count hed wed endog
varlist not allowed
r(101);
. desmat: poisson count hed wed endog
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 16
Initial log likelihood: -221501.223
Log likelihood: -24059.274
LR chi square: 394883.898
Model degrees of freedom: 10
Pseudo R-squared: 0.891
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
hed
1 2 1.134** 0.007
2 3 0.819** 0.006
3 4 -0.017* 0.007
wed
4 2 1.372** 0.007
5 3 1.020** 0.007
6 4 -0.278** 0.008
endog
7 1 1.722** 0.009
8 2 0.676** 0.007
9 3 0.537** 0.008
10 4 2.487** 0.009
11 _cons 8.652** 0.008
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 47932.55
Prob > chi2(5) = 0.0000
. *in order to fit the data better, we had to account for some of the off-diagonal interactions.
. desmat: poisson count hed wed endog eddiff3 eddiff2
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 16
Initial log likelihood: -221501.223
Log likelihood: -145.628
LR chi square: 442711.189
Model degrees of freedom: 12
Pseudo R-squared: 0.999
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
hed
1 2 0.627** 0.008
2 3 0.355** 0.007
3 4 0.180** 0.008
wed
4 2 0.817** 0.008
5 3 0.461** 0.007
6 4 -0.142** 0.009
endog
7 1 0.763** 0.011
8 2 0.779** 0.007
9 3 0.601** 0.008
10 4 1.195** 0.011
eddiff3
11 1 -2.749** 0.024
eddiff2
12 1 -1.068** 0.006
13 _cons 9.611** 0.009
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 105.2568
Prob > chi2(3) = 0.0000
. *My examination of the Pearson residuals showed that the two most extreme cells in terms of educational difference were fit poorly
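. *The residual check itself is not shown in this log. One way it could be done is sketched below as a do-file fragment, assuming the model just fitted is still the active estimate. The variable names fit_count, pearson, and abs_pearson are made up for illustration, and gsort reorders the data in memory.

```stata
* Pearson residual for a Poisson fit: (observed - fitted)/sqrt(fitted)
predict double fit_count, n
generate double pearson = (count - fit_count)/sqrt(fit_count)
generate double abs_pearson = abs(pearson)
gsort -abs_pearson                            // largest misfit first
list hed wed count fit_count pearson in 1/5
```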
. desmat: poisson count hed wed endog eddiff3 eddiff2 eddiff3m
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 16
Initial log likelihood: -221501.223
Log likelihood: -117.905
LR chi square: 442766.636
Model degrees of freedom: 13
Pseudo R-squared: 0.999
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
hed
1 2 0.630** 0.008
2 3 0.360** 0.007
3 4 0.188** 0.008
wed
4 2 0.813** 0.008
5 3 0.456** 0.007
6 4 -0.153** 0.009
endog
7 1 0.762** 0.011
8 2 0.779** 0.007
9 3 0.601** 0.008
10 4 1.197** 0.011
eddiff3
11 1 -2.563** 0.033
eddiff2
12 1 -1.068** 0.006
eddiff3m
13 1 -0.346** 0.046
14 _cons 9.612** 0.009
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 49.81037
Prob > chi2(2) = 0.0000
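. *A do-file sketch of the nested-model comparison implied by the last two fits; it is not part of the original session. The scalar names are made up for illustration, and the values are copied from the two poisgof results.

```stata
* Likelihood-ratio test for adding eddiff3m: difference in deviances, 1 df
scalar dev_without = 105.2568    // poisgof, model without eddiff3m (3 df)
scalar dev_with    = 49.81037    // poisgof, model with eddiff3m (2 df)
scalar lr_chi2     = dev_without - dev_with
display "LR chi2(1) = " lr_chi2 "   p = " chi2tail(1, lr_chi2)
```

. *The difference is about 55.45, which agrees with the change in the reported LR chi square (442766.636 - 442711.189) and with twice the change in log likelihood.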
. log close
log type: text
closed on: 12 Oct 2005, 11:55:29
-------------------------------------------------------------------------------