opened on:  26 Oct 2005, 10:58:46

 

. edit

(4 vars, 8 obs pasted into editor)

- preserve

 

. *Agresti's death penalty data

. table  victim defendant [fweight=count], contents (freq mean death_penalty) row col

 

-------------------------------------

          |         defendant       

   victim |   black    white    Total

----------+--------------------------

    black |     103        9      112

          | .058252        0  .053571

          |

    white |      63      151      214

          | .174603  .125828  .140187

          |

    Total |     166      160      326

          |  .10241   .11875  .110429

-------------------------------------

 

 

*Notice that in the data overall, 11.9% of white defendants got the death penalty compared to 10.2% of black defendants. It is a small difference, and it is in the opposite direction from what we expect.
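*A minimal Python sketch of the same descriptive check. The death counts are not printed in the log; they are reconstructed here as freq times mean death_penalty from the table above, which is an assumption of this sketch. The helper name rate is hypothetical.

```python
# Cell counts reconstructed from the table above (freq x mean death_penalty).
# Keys are (victim, defendant); values are (death sentences, total cases).
cells = {
    ("black", "black"): (6, 103),
    ("black", "white"): (0, 9),
    ("white", "black"): (11, 63),
    ("white", "white"): (19, 151),
}

def rate(defendant, victim=None):
    """Death-penalty rate for one defendant race, optionally within one victim race."""
    picked = [c for (v, d), c in cells.items()
              if d == defendant and victim in (None, v)]
    deaths = sum(x for x, _ in picked)
    total = sum(n for _, n in picked)
    return deaths / total

for d in ("black", "white"):
    by_victim = {v: round(rate(d, v), 4) for v in ("black", "white")}
    print(d, "overall:", round(rate(d), 4), "by victim race:", by_victim)
```

*The overall rates match the Total row of the table; the by-victim rates match its interior rows, where the comparison between defendants runs the other way.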

. poisson count

 

Iteration 0:   log likelihood = -215.79845 

Iteration 1:   log likelihood = -215.79845 

 

Poisson regression                                Number of obs   =          8

                                                  LR chi2(0)      =       0.00

                                                  Prob > chi2     =          .

Log likelihood = -215.79845                       Pseudo R2       =     0.0000

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

       _cons |   3.707456   .0553849    66.94   0.000     3.598903    3.816008

------------------------------------------------------------------------------

 

. poisgof

 

         Goodness-of-fit chi2  =  395.9153

         Prob > chi2(7)        =    0.0000

 

. *The constant-only model, were it to fit the data well, would mean that none of the variables matter at all, so we're not surprised that this model doesn't fit.

. set linesize 75

 

. display exp(3.7075)

40.7518
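*The exponentiated constant is just the mean cell count. A short Python check, assuming only the totals shown in the log (326 weighted cases spread over 8 cells):

```python
import math

total_cases = 326   # weighted number of cases from the table above
n_cells = 8         # observations in the Poisson data set
mean_count = total_cases / n_cells

cons = 3.707456     # _cons from the constant-only poisson output
print(mean_count, math.exp(cons))
```

*Both printed values should agree to about three decimals, which is why the constant-only model predicts the same count for every cell.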

 

. *The first interesting model is the mutual independence model

. desmat: poisson count  defendant victim death_penalty

---------------------------------------------------------------------------

   Poisson regression

---------------------------------------------------------------------------

   Dependent variable                                                count

   Optimization:                                                        ml

   Number of observations:                                               8

   Initial log likelihood:                                        -215.798

   Log likelihood:                                                 -86.805

   LR chi square:                                                  257.986

   Model degrees of freedom:                                             3

   Pseudo R-squared:                                                 0.598

   Prob:                                                             0.000

---------------------------------------------------------------------------

nr Effect                                                Coeff        s.e.

---------------------------------------------------------------------------

   count

     defendant

1      white                                            -0.037       0.111

     victim

2      white                                             0.647**     0.117

     death_penalty

3      1                                                -2.086**     0.177

4    _cons                                               3.927**     0.111

---------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  137.9293

         Prob > chi2(4)        =    0.0000

 

. *this model also doesn't fit, because the 3 variables are not mutually independent.
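*A sketch of what the mutual independence model is doing, in Python. The eight cell counts are reconstructed from the table at the top of the log (an assumption of this sketch), and the function names margin and fitted are hypothetical. Under mutual independence each fitted count is the product of the three one-way margins divided by n squared, and the deviance compares those fitted counts with the observed ones.

```python
import math

# (victim, defendant, death_penalty) -> count, reconstructed from the table
obs = {
    ("black", "black", 0): 97,  ("black", "black", 1): 6,
    ("black", "white", 0): 9,   ("black", "white", 1): 0,
    ("white", "black", 0): 52,  ("white", "black", 1): 11,
    ("white", "white", 0): 132, ("white", "white", 1): 19,
}
n = sum(obs.values())

def margin(axis, level):
    """One-way marginal total for the given variable (0=victim, 1=defendant, 2=penalty)."""
    return sum(c for key, c in obs.items() if key[axis] == level)

def fitted(cell):
    """Fitted count under mutual independence."""
    v, d, p = cell
    return margin(0, v) * margin(1, d) * margin(2, p) / n ** 2

# Deviance G2 = 2 * sum(obs * log(obs / fitted)); empty cells contribute zero.
deviance = 2 * sum(c * math.log(c / fitted(k)) for k, c in obs.items() if c > 0)
print(round(deviance, 4))
print(round(math.log(fitted(("black", "black", 0))), 3))  # compare with _cons
```

*The deviance should be close to the poisgof value of 137.93, and the log of the fitted reference cell should be close to the reported _cons.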

. *let me go back for just a second, to show you a loglinear model and a logistic model that are the same.

. desmat: poisson count  death_penalty

---------------------------------------------------------------------------

   Poisson regression

---------------------------------------------------------------------------

   Dependent variable                                                count

   Optimization:                                                        ml

   Number of observations:                                               8

   Initial log likelihood:                                        -215.798

   Log likelihood:                                                -103.089

   LR chi square:                                                  225.419

   Model degrees of freedom:                                             1

   Pseudo R-squared:                                                 0.522

   Prob:                                                             0.000

---------------------------------------------------------------------------

nr Effect                                                Coeff        s.e.

---------------------------------------------------------------------------

   count

     death_penalty

1      1                                                -2.086**     0.177

2    _cons                                               4.284**     0.059

---------------------------------------------------------------------------

*  p < .05

** p < .01

 

. set linesize 79

 

. logistic  death_penalty [fweight=count], coef

 

Logistic regression                               Number of obs   =        326

                                                  LR chi2(0)      =       0.00

                                                  Prob > chi2     =          .

Log likelihood =  -113.2564                       Pseudo R2       =     0.0000

 

------------------------------------------------------------------------------

death_pena~y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

       _cons |  -2.086362    .176709   -11.81   0.000    -2.432705   -1.740019

------------------------------------------------------------------------------

 

. *Even though the logistic and loglinear models here are equivalent (the same coefficient and standard error), the log likelihoods are different. Why?

. lfit, table

 

Logistic model for death_penalty, goodness-of-fit test

 

  +--------------------------------------------------------+

  | Group |   Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |

  |-------+--------+-------+-------+-------+-------+-------|

  |     1 | 0.1104 |    36 |  36.0 |   290 | 290.0 |   326 |

  +--------------------------------------------------------+

 

  +----------------+

  | Group |   Prob |

  |-------+--------|

  |     1 | 0.1104 |

  +----------------+

 

       number of observations =       326

 number of covariate patterns =         1

              Pearson chi2(0) =         0.00

                  Prob > chi2 =              .

 

. *See here how the logistic regression model collapsed our 8 cells of data into 2 cells (36 death sentences and 290 others). That is why the log likelihoods and fit statistics are different.
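*To make the collapsing concrete, here is a small Python sketch that rebuilds the constant-only logistic results from the two collapsed cells alone. It uses only the 36 and 290 shown in the lfit table; the variable names are illustrative.

```python
import math

deaths, no_deaths = 36, 290      # Obs_1 and Obs_0 from the lfit table
n = deaths + no_deaths
p = deaths / n

# Binomial (Bernoulli) log likelihood of the constant-only logistic model
loglik = deaths * math.log(p) + no_deaths * math.log(1 - p)

# Constant and its standard error from the same two numbers
logit = math.log(deaths / no_deaths)
se = math.sqrt(1 / deaths + 1 / no_deaths)

print(round(loglik, 4), round(logit, 6), round(se, 6))
```

*The log likelihood here is built from 2 cells, while the Poisson log likelihood is built from all 8 counts, so the two are not comparable even though the coefficient and standard error agree.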

. desmat: poisson count defendant*death_penalty

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -215.798

   Log likelihood:                                                    -102.923

   LR chi square:                                                      225.751

   Model degrees of freedom:                                                 3

   Pseudo R-squared:                                                     0.523

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     defendant

1      white                                                -0.055       0.117

     death_penalty

2      1                                                    -2.171**     0.256

     defendant.death_penalty

3      white.1                                               0.166       0.354

4    _cons                                                   4.311**     0.082

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  170.1642

         Prob > chi2(4)        =    0.0000

 

. *So it doesn't seem, at first blush, that there is a strong relationship between defendant's race and getting the death penalty (whites seem slightly more likely to get it, but the interaction is not significant). That is exactly what the simple descriptive statistics showed. Always start analyses with the simple descriptive stats.
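*The defendant.death_penalty coefficient can be checked by hand as a log odds ratio from the 2x2 defendant by death_penalty table. A Python sketch, with the death counts reconstructed from the Total row of the table at the top of the log (an assumption of this sketch):

```python
import math

# (death sentences, no death sentence) by defendant race, victim races pooled
black_def = (17, 149)   # 166 cases, rate .10241
white_def = (19, 141)   # 160 cases, rate .11875

# Log odds ratio of death penalty, white vs black defendants
log_or = math.log((white_def[0] / white_def[1]) / (black_def[0] / black_def[1]))

# Usual large-sample standard error of a log odds ratio
se = math.sqrt(sum(1 / c for c in black_def + white_def))

print(round(log_or, 3), round(se, 3), round(log_or / se, 2))
```

*The ratio of estimate to standard error is well under 2, which is the sense in which the association is weak.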

. desmat: poisson count defendant*death_penalty  victim

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -215.798

   Log likelihood:                                                     -86.695

   LR chi square:                                                      258.207

   Model degrees of freedom:                                                 4

   Pseudo R-squared:                                                     0.598

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     defendant

1      white                                                -0.055       0.117

     death_penalty

2      1                                                    -2.171**     0.256

     defendant.death_penalty

3      white.1                                               0.166       0.354

     victim

4      white                                                 0.647**     0.117

5    _cons                                                   3.936**     0.112

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  137.7078

         Prob > chi2(3)        =    0.0000

 

. *Still, defendant*death_penalty is not statistically significant.

. desmat: poisson count defendant*death_penalty  victim*defendant

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -215.798

   Log likelihood:                                                     -21.796

   LR chi square:                                                      388.005

   Model degrees of freedom:                                                 5

   Pseudo R-squared:                                                     0.899

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     defendant

1      white                                                -2.456**     0.350

     death_penalty

2      1                                                    -2.171**     0.256

     defendant.death_penalty

3      white.1                                               0.166       0.354

     victim

4      white                                                -0.492**     0.160

     victim.defendant

5      white.white                                           3.312**     0.379

6    _cons                                                   4.527**     0.102

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *There is a strong association between race of victim and race of defendant
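*A quick Python check of how strong that association is, using the four frequencies printed in the victim by defendant table at the top of the log. The variable names are illustrative.

```python
import math

# victim x defendant frequencies from the table
bb, bw = 103, 9     # black victim: black defendant, white defendant
wb, ww = 63, 151    # white victim: black defendant, white defendant

odds_ratio = (bb * ww) / (bw * wb)
log_or = math.log(odds_ratio)
se = math.sqrt(1 / bb + 1 / bw + 1 / wb + 1 / ww)

print(round(odds_ratio, 1), round(log_or, 3), round(se, 3))
```

*The log odds ratio and its standard error should line up with the victim.defendant row of the output above.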

. poisgof

 

         Goodness-of-fit chi2  =  7.910102

         Prob > chi2(2)        =    0.0192

 

. desmat: poisson count death_penalty victim*defendant

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -215.798

   Log likelihood:                                                     -21.907

   LR chi square:                                                      387.784

   Model degrees of freedom:                                                 4

   Pseudo R-squared:                                                     0.898

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     death_penalty

1      1                                                    -2.086**     0.177

     victim

2      white                                                -0.492**     0.160

     defendant

3      white                                                -2.438**     0.348

     victim.defendant

4      white.white                                           3.312**     0.379

5    _cons                                                   4.518**     0.100

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  8.131552

         Prob > chi2(3)        =    0.0434

 

. *This model fits the data reasonably well (goodness-of-fit chi2 = 8.13 on 3 df, p = 0.043, borderline at the 5% level), yet it includes only the interaction between defendant's race and victim's race and says nothing about any association between race (of either defendant or victim) and the death penalty. So we can consider the possibility that race doesn't play any role in who gets the death penalty, at least based on these data.
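*The poisgof p-values can be reproduced from the deviances with the closed-form chi-square tail formulas for small degrees of freedom. A Python sketch; the function chi2_sf is a hypothetical helper written for this note and only covers df = 1, 2, 3.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1, 2, 3 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("this sketch only implements df = 1, 2, 3")

# Deviances reported by poisgof for the two models just fitted
print(round(chi2_sf(7.910102, 2), 4))   # defendant*death_penalty victim*defendant
print(round(chi2_sf(8.131552, 3), 4))   # death_penalty victim*defendant
```

*Both models sit close to the conventional 5% cutoff, so the fit is borderline rather than comfortable.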

. desmat: poisson count victim*defendant victim*death_penalty

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -215.798

   Log likelihood:                                                     -18.782

   LR chi square:                                                      394.033

   Model degrees of freedom:                                                 5

   Pseudo R-squared:                                                     0.913

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     victim

1      white                                                -0.588**     0.164

     defendant

2      white                                                -2.438**     0.348

     victim.defendant

3      white.white                                           3.312**     0.379

     death_penalty

4      1                                                    -2.872**     0.420

     victim.death_penalty

5      white.1                                               1.058*      0.464

6    _cons                                                   4.580**     0.101

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  1.881837

         Prob > chi2(2)        =    0.3903

 

. *This model is preferred by the LRT because adding victim*death_penalty lowers the goodness-of-fit chi2 from 8.13 to 1.88, an improvement of about 6.25 on 1 df.

. display chi2tail(1,6.3)

.0120738
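*The likelihood-ratio test spelled out in Python, using the two poisgof deviances from the log. For 1 df the chi-square tail probability has a closed form in terms of erfc, which is what this sketch relies on.

```python
import math

dev_restricted = 8.131552   # death_penalty victim*defendant (3 df)
dev_full = 1.881837         # victim*defendant victim*death_penalty (2 df)

lr_stat = dev_restricted - dev_full          # LR chi-square on 3 - 2 = 1 df
p_exact = math.erfc(math.sqrt(lr_stat / 2))  # tail probability of the exact difference
p_log = math.erfc(math.sqrt(6.3 / 2))        # same formula at the rounded 6.3 used above

print(round(lr_stat, 6), round(p_exact, 4), round(p_log, 7))
```

*Using the exact difference rather than the rounded 6.3 moves the p-value only slightly and does not change the conclusion.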

 

. *So from a goodness of fit perspective these last two models both have something to recommend them.

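. *Neither of these last two models, however, can be fully replicated by logistic regression: a logistic model conditions on the predictors, so it never estimates the victim*defendant association.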

. *Let's take a look at the saturated models from both logistic and loglinear, because that's the only logistic model that would account for D*V

. desmat: poisson count victim*defendant*death_penalty

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -215.798

   Log likelihood:                                                     -17.841

   LR chi square:                                                      395.915

   Model degrees of freedom:                                                 7

   Pseudo R-squared:                                                     0.917

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     victim

1      white                                                -0.623**     0.172

     defendant

2      white                                                -2.377**     0.348

     victim.defendant

3      white.white                                           3.309**     0.385

     death_penalty

4      1                                                    -2.783**     0.421

     victim.death_penalty

5      white.1                                               1.230*      0.536

     defendant.death_penalty

6      white.1                                             -14.711    2096.899

     victim.defendant.death_penalty

7      white.white.1                                        14.326    2096.899

8    _cons                                                   4.575**     0.102

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *Here is where the zero in our data starts to choke us a bit.

. *One way around this problem, even if it seems a bit odd, is simply to add a constant (here 1) to every cell.
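*Why the zero causes trouble, as a small Python sketch. The empty cell is white defendant, black victim, death penalty (0 of 9 cases in the table at the top of the log). The helper log_odds and its add argument are hypothetical names for this note.

```python
import math

def log_odds(deaths, others, add=0):
    """Sample log odds of a death sentence, optionally adding a constant to each cell."""
    d, o = deaths + add, others + add
    if d == 0:
        return -math.inf   # log(0): no finite estimate exists
    return math.log(d / o)

# White defendant, black victim: 0 death sentences, 9 others
print(log_odds(0, 9))           # the saturated model tries to reach this value
print(log_odds(0, 9, add=1))    # finite once 1 is added to each cell
```

*With a true zero the maximum likelihood estimate is not finite, so the optimizer stops at a large value with a huge standard error, which is the -14.711 (s.e. 2096.899) pattern in the saturated output above.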

. gen count_plus1=count+1

 

. desmat: poisson  count_plus1 victim*defendant*death_penalty

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                              count_plus1

   Optimization:                                                            ml

   Number of observations:                                                   8

   Initial log likelihood:                                            -209.216

   Log likelihood:                                                     -19.054

   LR chi square:                                                      380.324

   Model degrees of freedom:                                                 7

   Pseudo R-squared:                                                     0.909

   Prob:                                                                 0.000

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count_plus1

     victim

1      white                                                -0.615**     0.171

     defendant

2      white                                                -2.282**     0.332

     victim.defendant

3      white.white                                           3.202**     0.370

     death_penalty

4      1                                                    -2.639**     0.391

     victim.death_penalty

5      white.1                                               1.154*      0.505

     defendant.death_penalty

6      white.1                                               0.336       1.119

     victim.defendant.death_penalty

7      white.white.1                                        -0.746       1.189

8    _cons                                                   4.585**     0.101

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *Replacing the count by count plus one makes the model look a lot nicer, and doesn't change any of the substantive interpretations.
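*Because the model is saturated, its death_penalty terms can be rebuilt directly from the add-one cell counts. A Python sketch under that assumption; the original counts are reconstructed from the table at the top of the log, and the dictionary layout is illustrative.

```python
import math

# (victim, defendant) -> (deaths + 1, non-deaths + 1), i.e. count_plus1
plus1 = {
    ("black", "black"): (7, 98),
    ("black", "white"): (1, 10),
    ("white", "black"): (12, 53),
    ("white", "white"): (20, 133),
}
logit = {k: math.log(d / o) for k, (d, o) in plus1.items()}

cons = logit[("black", "black")]                   # death_penalty main effect
victim = logit[("white", "black")] - cons          # victim.death_penalty
defendant = logit[("black", "white")] - cons       # defendant.death_penalty
three_way = logit[("white", "white")] - cons - victim - defendant

print([round(x, 3) for x in (cons, victim, defendant, three_way)])
```

*These four values are the log odds contrasts among the four victim by defendant groups, so a saturated logistic model fitted to the same add-one counts would be expected to return them as well.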

. desmat: logistic  death_penalty victim*defendant [fweight= count_plus1], coef

-------------------------------------------------------------------------------

   logistic

-------------------------------------------------------------------------------

   Dependent variable                                            death_penalty

   Number of observations:                                                 334

   fweight:                                                        count_plus1

   Initial log likelihood:                                            -122.393

   Log likelihood:                                                    -119.485

   LR chi square:                                                        5.816

   Model degrees of freedom:                                                 3

   Pseudo R-squared:                                                     0.024

   Prob:                                                                 0.121

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   victim

1    white                                                   1.154*      0.505

   defendant

2    white                                                   0.336       1.119

   victim.defendant

3    white.white                                            -0.746       1.189

4  _cons                                                    -2.639**     0.391

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. lfit, table

 

Logistic model for death_penalty, goodness-of-fit test

 

  +--------------------------------------------------------+

  | Group |   Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |

  |-------+--------+-------+-------+-------+-------+-------|

  |     1 | 0.0667 |     7 |   7.0 |    98 |  98.0 |   105 |

  |     2 | 0.0909 |     1 |   1.0 |    10 |  10.0 |    11 |

  |     3 | 0.1307 |    20 |  20.0 |   133 | 133.0 |   153 |

  |     4 | 0.1846 |    12 |  12.0 |    53 |  53.0 |    65 |

  +--------------------------------------------------------+

 

  +-------------------------------------+

  | Group |   Prob | _x_1 | _x_2 | _x_3 |

  |-------+--------+------+------+------|

  |     1 | 0.0667 |    0 |    0 |    0 |

  |     2 | 0.0909 |    0 |    1 |    0 |

  |     3 | 0.1307 |    1 |    1 |    1 |

  |     4 | 0.1846 |    1 |    0 |    0 |

  +-------------------------------------+

 

       number of observations =       334

 number of covariate patterns =         4

              Pearson chi2(0) =         0.00

                  Prob > chi2 =              .

 

. exit, clear