opened on: 26 Oct 2005, 10:58:46
. edit
(4 vars, 8 obs pasted into editor)
- preserve
. *Agresti's death penalty data
. table victim defendant [fweight=count], contents (freq mean death_penalty) row col
-------------------------------------
          |        defendant
   victim |   black    white    Total
----------+--------------------------
    black |     103        9      112
          | .058252        0  .053571
          |
    white |      63      151      214
          | .174603  .125828  .140187
          |
    Total |     166      160      326
          |  .10241   .11875  .110429
-------------------------------------
. *Notice that in the data overall, 11.9% of white defendants got the death penalty compared to 10.2% of black defendants. The difference is small, and it runs in the opposite direction from what we might expect.
. poisson count
Iteration 0:   log likelihood = -215.79845
Iteration 1:   log likelihood = -215.79845
Poisson regression                                Number of obs   =          8
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood = -215.79845                       Pseudo R2       =     0.0000
------------------------------------------------------------------------------
       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   3.707456   .0553849    66.94   0.000     3.598903    3.816008
------------------------------------------------------------------------------
. poisgof
Goodness-of-fit chi2 = 395.9153
Prob > chi2(7) = 0.0000
. *If the constant-only model fit the data well, it would mean that none of the variables matter at all, so we're not surprised that this model doesn't fit.
. set linesize 75
. display exp(3.7075)
40.7518
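. *A quick check on what the constant means: in the constant-only model, exp(_cons) is just the mean cell count, i.e. the 326 cases spread evenly over the 8 cells.
. display 326/8
40.75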
. *The first interesting model is the mutually independent model
. desmat: poisson count defendant victim death_penalty
---------------------------------------------------------------------------
Poisson regression
---------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -86.805
LR chi square: 257.986
Model degrees of freedom: 3
Pseudo R-squared: 0.598
Prob: 0.000
---------------------------------------------------------------------------
nr Effect Coeff s.e.
---------------------------------------------------------------------------
count
defendant
1 white -0.037 0.111
victim
2 white 0.647** 0.117
death_penalty
3 1 -2.086** 0.177
4 _cons 3.927** 0.111
---------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 137.9293
Prob > chi2(4) = 0.0000
. *This model also doesn't fit, because the 3 variables are not mutually independent.
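. *A sketch of where _cons comes from (not run in this session; the margins 166, 112 and 290 are read off the first table, with 290 = 326 minus the 36 death sentences). Under mutual independence the expected count in the reference cell (black defendant, black victim, no death penalty) is the product of the three margins divided by N squared. The two numbers below should agree up to rounding of the reported coefficient.
. display 166*112*290/326^2
. display exp(3.927)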
. *Let me go back for a moment to show you a loglinear model and a logistic model that are the same.
. desmat: poisson count death_penalty
---------------------------------------------------------------------------
Poisson regression
---------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -103.089
LR chi square: 225.419
Model degrees of freedom: 1
Pseudo R-squared: 0.522
Prob: 0.000
---------------------------------------------------------------------------
nr Effect Coeff s.e.
---------------------------------------------------------------------------
count
death_penalty
1 1 -2.086** 0.177
2 _cons 4.284** 0.059
---------------------------------------------------------------------------
* p < .05
** p < .01
. set linesize 79
. logistic death_penalty [fweight=count], coef
Logistic regression                               Number of obs   =        326
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood =  -113.2564                       Pseudo R2       =     0.0000
------------------------------------------------------------------------------
death_pena~y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -2.086362    .176709   -11.81   0.000    -2.432705   -1.740019
------------------------------------------------------------------------------
. *Even though the logistic and loglinear models here are equivalent (the same coefficient and standard error for death_penalty), their log likelihoods are different. Why?
. lfit, table
Logistic model for death_penalty, goodness-of-fit test
+--------------------------------------------------------+
| Group | Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |
|-------+--------+-------+-------+-------+-------+-------|
| 1 | 0.1104 | 36 | 36.0 | 290 | 290.0 | 326 |
+--------------------------------------------------------+
+----------------+
| Group | Prob |
|-------+--------|
| 1 | 0.1104 |
+----------------+
number of observations = 326
number of covariate patterns = 1
Pearson chi2(0) = 0.00
Prob > chi2 = .
. *See how the logistic regression collapsed the 8 cells of data into 2 cells (36 death sentences and 290 others, a single covariate pattern); that is why the log likelihoods and fit statistics are different.
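. *A sketch of why the coefficients still match (not run in this session; 36 and 290 are taken from the lfit table above). Both models estimate the log odds of a death sentence, ln(36/290), which should reproduce the -2.086 reported by each. The Poisson constant is the log of the expected count in a death_penalty=0 cell, i.e. 290 cases spread over 4 cells, which should reproduce the 4.284.
. display ln(36/290)
. display ln(290/4)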
. desmat: poisson count defendant*death_penalty
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -102.923
LR chi square: 225.751
Model degrees of freedom: 3
Pseudo R-squared: 0.523
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
defendant
1 white -0.055 0.117
death_penalty
2 1 -2.171** 0.256
defendant.death_penalty
3 white.1 0.166 0.354
4 _cons 4.311** 0.082
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 170.1642
Prob > chi2(4) = 0.0000
. *So at first blush there doesn't seem to be a strong relationship between defendant's race and getting the death penalty (and whites seem slightly more likely to get it), which is exactly what the simple descriptive statistics showed. Always start an analysis with the simple descriptive statistics.
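. *A sketch tying the interaction coefficient back to the descriptive table (not run in this session). The counts are implied by the table rather than listed in it: .11875 of 160 white defendants is 19 death sentences, and .10241 of 166 black defendants is 17. The defendant.death_penalty coefficient is the log of the marginal odds ratio, so the first line should come out near 0.166 and the second near 1.18.
. display ln((19/141)/(17/149))
. display exp(0.166)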
. desmat: poisson count defendant*death_penalty victim
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -86.695
LR chi square: 258.207
Model degrees of freedom: 4
Pseudo R-squared: 0.598
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
defendant
1 white -0.055 0.117
death_penalty
2 1 -2.171** 0.256
defendant.death_penalty
3 white.1 0.166 0.354
victim
4 white 0.647** 0.117
5 _cons 3.936** 0.112
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 137.7078
Prob > chi2(3) = 0.0000
. *Still, the defendant*death_penalty interaction is not statistically significant.
. desmat: poisson count defendant*death_penalty victim*defendant
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -21.796
LR chi square: 388.005
Model degrees of freedom: 5
Pseudo R-squared: 0.899
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
defendant
1 white -2.456** 0.350
death_penalty
2 1 -2.171** 0.256
defendant.death_penalty
3 white.1 0.166 0.354
victim
4 white -0.492** 0.160
victim.defendant
5 white.white 3.312** 0.379
6 _cons 4.527** 0.102
-------------------------------------------------------------------------------
* p < .05
** p < .01
. *There is a strong association between race of victim and race of defendant
. poisgof
Goodness-of-fit chi2 = 7.910102
Prob > chi2(2) = 0.0192
. desmat: poisson count death_penalty victim*defendant
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -21.907
LR chi square: 387.784
Model degrees of freedom: 4
Pseudo R-squared: 0.898
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
death_penalty
1 1 -2.086** 0.177
victim
2 white -0.492** 0.160
defendant
3 white -2.438** 0.348
victim.defendant
4 white.white 3.312** 0.379
5 _cons 4.518** 0.100
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 8.131552
Prob > chi2(3) = 0.0434
. *This model fits the data far better than the earlier ones (though at p = 0.043 the fit is still marginal). It includes only the interaction between defendant's race and victim's race, and no association between race (of either defendant or victim) and the death penalty, so we have to consider the possibility that race plays no role in who gets the death penalty, at least based on these data.
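. *A sketch of the likelihood-ratio comparison behind dropping defendant*death_penalty (not run in this session). The two models are nested, so the difference between their poisgof statistics, 8.131552 - 7.910102 or about 0.22, is a chi-square on 1 df. A large p-value here would say the extra term buys essentially nothing.
. display 8.131552 - 7.910102
. display chi2tail(1, 8.131552 - 7.910102)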
. desmat: poisson count victim*defendant victim*death_penalty
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -18.782
LR chi square: 394.033
Model degrees of freedom: 5
Pseudo R-squared: 0.913
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
victim
1 white -0.588** 0.164
defendant
2 white -2.438** 0.348
victim.defendant
3 white.white 3.312** 0.379
death_penalty
4 1 -2.872** 0.420
victim.death_penalty
5 white.1 1.058* 0.464
6 _cons 4.580** 0.101
-------------------------------------------------------------------------------
* p < .05
** p < .01
. poisgof
Goodness-of-fit chi2 = 1.881837
Prob > chi2(2) = 0.3903
. *This model is preferred by the LRT because it improves the goodness of fit by about 6.3 (8.13 - 1.88 = 6.25) on 1 df:
. display chi2tail(1,6.3)
.0120738
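. *The same test without rounding, as a sketch (not run in this session): take the difference of the two poisgof statistics directly. The p-value should land very close to the .012 shown above.
. display chi2tail(1, 8.131552 - 1.881837)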
. *So from a goodness of fit perspective these last two models both have something to recommend them.
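. *A sketch of how to read the victim.death_penalty coefficient (not run in this session). It is a log odds ratio, so exp(1.058) is roughly 2.9: the odds of a death sentence are nearly three times higher when the victim is white. The counts below are implied by the first table rather than listed in it: .140187 of 214 white-victim cases is 30, and .053571 of 112 black-victim cases is 6.
. display exp(1.058)
. display (30/184)/(6/106)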
. *neither of these last two models, however, can be replicated by logistic regression.
. *Let's take a look at the saturated models from both logistic and loglinear,
> because that's the only logistic model that would account for D*V
. desmat: poisson count victim*defendant*death_penalty
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count
Optimization: ml
Number of observations: 8
Initial log likelihood: -215.798
Log likelihood: -17.841
LR chi square: 395.915
Model degrees of freedom: 7
Pseudo R-squared: 0.917
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count
victim
1 white -0.623** 0.172
defendant
2 white -2.377** 0.348
victim.defendant
3 white.white 3.309** 0.385
death_penalty
4 1 -2.783** 0.421
victim.death_penalty
5 white.1 1.230* 0.536
defendant.death_penalty
6 white.1 -14.711 2096.899
victim.defendant.death_penalty
7 white.white.1 14.326 2096.899
8 _cons 4.575** 0.102
-------------------------------------------------------------------------------
* p < .05
** p < .01
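. *Here is where the zero cell in our data (none of the 9 white defendants with black victims got the death penalty) starts to choke us a bit: the defendant.death_penalty and three-way coefficients blow up, with standard errors near 2097.
. *One way around this problem, even if it seems a bit odd, is simply to add a constant to every cell.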
. gen count_plus1=count+1
. desmat: poisson count_plus1 victim*defendant*death_penalty
-------------------------------------------------------------------------------
Poisson regression
-------------------------------------------------------------------------------
Dependent variable count_plus1
Optimization: ml
Number of observations: 8
Initial log likelihood: -209.216
Log likelihood: -19.054
LR chi square: 380.324
Model degrees of freedom: 7
Pseudo R-squared: 0.909
Prob: 0.000
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
count_plus1
victim
1 white -0.615** 0.171
defendant
2 white -2.282** 0.332
victim.defendant
3 white.white 3.202** 0.370
death_penalty
4 1 -2.639** 0.391
victim.death_penalty
5 white.1 1.154* 0.505
defendant.death_penalty
6 white.1 0.336 1.119
victim.defendant.death_penalty
7 white.white.1 -0.746 1.189
8 _cons 4.585** 0.101
-------------------------------------------------------------------------------
* p < .05
** p < .01
. *Replacing the count with count plus one makes the model look a lot nicer, and doesn't change any of the substantive interpretations.
. desmat: logistic death_penalty victim*defendant [fweight= count_plus1], coef
-------------------------------------------------------------------------------
logistic
-------------------------------------------------------------------------------
Dependent variable death_penalty
Number of observations: 334
fweight: count_plus1
Initial log likelihood: -122.393
Log likelihood: -119.485
LR chi square: 5.816
Model degrees of freedom: 3
Pseudo R-squared: 0.024
Prob: 0.121
-------------------------------------------------------------------------------
nr Effect Coeff s.e.
-------------------------------------------------------------------------------
victim
1 white 1.154* 0.505
defendant
2 white 0.336 1.119
victim.defendant
3 white.white -0.746 1.189
4 _cons -2.639** 0.391
-------------------------------------------------------------------------------
* p < .05
** p < .01
. lfit, table
Logistic model for death_penalty, goodness-of-fit test
+--------------------------------------------------------+
| Group | Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |
|-------+--------+-------+-------+-------+-------+-------|
| 1 | 0.0667 | 7 | 7.0 | 98 | 98.0 | 105 |
| 2 | 0.0909 | 1 | 1.0 | 10 | 10.0 | 11 |
| 3 | 0.1307 | 20 | 20.0 | 133 | 133.0 | 153 |
| 4 | 0.1846 | 12 | 12.0 | 53 | 53.0 | 65 |
+--------------------------------------------------------+
+-------------------------------------+
| Group | Prob | _x_1 | _x_2 | _x_3 |
|-------+--------+------+------+------|
| 1 | 0.0667 | 0 | 0 | 0 |
| 2 | 0.0909 | 0 | 1 | 0 |
| 3 | 0.1307 | 1 | 1 | 1 |
| 4 | 0.1846 | 1 | 0 | 0 |
+-------------------------------------+
number of observations = 334
number of covariate patterns = 4
Pearson chi2(0) = 0.00
Prob > chi2 = .
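. *A sketch of how the saturated logistic coefficients fall out of the lfit table (not run in this session). With one parameter per covariate pattern, _cons is the observed log odds in the reference group (black victim, black defendant: 7 vs 98), and the victim coefficient is the log odds ratio against group 4 (white victim, black defendant: 12 vs 53). These should reproduce the -2.639 and 1.154 above, which are also the death_penalty and victim.death_penalty terms of the saturated loglinear model on count_plus1.
. display ln(7/98)
. display ln((12/53)/(7/98))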
. exit, clear