class 10 log

opened on: 26 Oct 2005, 10:58:46

. edit

(4 vars, 8 obs pasted into editor)

- preserve

. *Agresti's death penalty data

. table victim defendant [fweight=count], contents (freq mean death_penalty) row col

----------------------------------------
          |          defendant
   victim |    black     white     Total
----------+-----------------------------
    black |      103         9       112
          |  .058252         0   .053571
    white |       63       151       214
          |  .174603   .125828   .140187
    Total |      166       160       326
          |   .10241    .11875   .110429
----------------------------------------

*Notice that in the data overall, 11.9% of white defendants got the death penalty compared to 10.2% of black defendants. The difference is small, and it runs in the opposite direction from what we expect.
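
*A minimal sketch of how the 8 pasted observations could be rebuilt and the marginal rates checked. The cell counts are reconstructed from the frequencies and means in the table above (for example, 103 x .058252 = 6 death sentences); the 0/1 coding and the value label name `race` are assumptions, not taken from the original data file.

```stata
clear
input victim defendant death_penalty count
0 0 0  97
0 0 1   6
0 1 0   9
0 1 1   0
1 0 0  52
1 0 1  11
1 1 0 132
1 1 1  19
end
* Assumed coding: 0 = black, 1 = white; death_penalty 1 = sentenced to death
label define race 0 "black" 1 "white"
label values victim defendant race

* Death-penalty rate by defendant's race, ignoring the victim's race
tabstat death_penalty [fweight=count], by(defendant) statistics(n mean)
```

*The two means should agree with the Total row of the table (.10241 and .11875).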

. poisson count

Iteration 0: log likelihood = -215.79845

Iteration 1: log likelihood = -215.79845

Poisson regression                                Number of obs   =          8
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood = -215.79845                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   3.707456   .0553849    66.94   0.000     3.598903    3.816008
------------------------------------------------------------------------------

. poisgof

Goodness-of-fit chi2 = 395.9153

Prob > chi2(7) = 0.0000

. *The constant-only model, were it to fit the data well, would mean that none of the variables matters at all (every cell would have the same expected count), so we're not surprised that this model doesn't fit.

. set linesize 75

. display exp(3.7075)

40.7518
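
*A constant-only Poisson model assigns the same expected count to every cell, so exp(_cons) should simply be the average cell count. A quick check, using only the totals shown above (326 cases in 8 cells):

```stata
* Mean count per cell: 326 cases spread over 8 cells
display 326/8
* Its log should reproduce the _cons estimate (3.707456) up to rounding
display ln(326/8)
```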

. *The first interesting model is the mutually independent model

. desmat: poisson count defendant victim death_penalty

---------------------------------------------------------------------------

Poisson regression

---------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -86.805

LR chi square: 257.986

Model degrees of freedom: 3

Pseudo R-squared: 0.598

Prob: 0.000

---------------------------------------------------------------------------

nr Effect Coeff s.e.

---------------------------------------------------------------------------

count

defendant

1 white -0.037 0.111

victim

2 white 0.647** 0.117

death_penalty

3 1 -2.086** 0.177

4 _cons 3.927** 0.111

---------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 137.9293

Prob > chi2(4) = 0.0000

. *This model also doesn't fit, because the 3 variables are not mutually independent.
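
*The two poisgof deviances and the LR chi-square reported by desmat are linked by simple subtraction. A sketch using only numbers printed above:

```stata
* Deviance of the constant-only model minus deviance of the independence model
display 395.9153 - 137.9293
* This should match the "LR chi square" of 257.986 on 3 model df

* The independence model's own lack of fit: upper tail of chi2 with 4 df
display chi2tail(4, 137.9293)
```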

. *let me go back for just a second, to show you a loglinear model and a logistic model that are the same.

. desmat: poisson count death_penalty

---------------------------------------------------------------------------

Poisson regression

---------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -103.089

LR chi square: 225.419

Model degrees of freedom: 1

Pseudo R-squared: 0.522

Prob: 0.000

---------------------------------------------------------------------------

nr Effect Coeff s.e.

---------------------------------------------------------------------------

count

death_penalty

1 1 -2.086** 0.177

2 _cons 4.284** 0.059

---------------------------------------------------------------------------

* p < .05

** p < .01

. set linesize 79

. logistic death_penalty [fweight=count], coef

Logistic regression                               Number of obs   =        326
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood =  -113.2564                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
death_pena~y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -2.086362    .176709   -11.81   0.000    -2.432705   -1.740019
------------------------------------------------------------------------------

. *Even though the logistic and loglinear models here are equivalent (the same death_penalty coefficient and standard error), their log likelihoods differ. Why?
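
*Why the coefficients agree: with no covariates, both models estimate the log odds of a death sentence from the marginal split of 36 death sentences (.110429 x 326) versus 290 other outcomes. A sketch of the arithmetic:

```stata
* Log odds of a death sentence: 36 yes versus 290 no
display ln(36/290)
* Standard error of a log odds: sqrt(1/yes + 1/no)
display sqrt(1/36 + 1/290)
* The Poisson _cons is the log of the average "no" cell count (290 over 4 cells)
display ln(290/4)
```

*Compare with -2.086362 and .176709 from the logistic output, and 4.284 from the Poisson output.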

. lfit, table

Logistic model for death_penalty, goodness-of-fit test

+--------------------------------------------------------+

| Group | Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |

|-------+--------+-------+-------+-------+-------+-------|

| 1 | 0.1104 | 36 | 36.0 | 290 | 290.0 | 326 |

+--------------------------------------------------------+

+----------------+

| Group | Prob |

|-------+--------|

| 1 | 0.1104 |

+----------------+

number of observations = 326

number of covariate patterns = 1

Pearson chi2(0) = 0.00

Prob > chi2 = .

. *See here how the logistic regression model collapsed the 8 cells of data into 2 cells (death penalty yes or no); that is why the log likelihoods and fit statistics are different.
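
*The logistic log likelihood can be reproduced by hand from the two collapsed cells, which shows that it is a binomial likelihood over 326 cases and not a Poisson likelihood over 8 cell counts. A sketch, using the 36/290 split from the lfit table:

```stata
* Binomial log likelihood evaluated at the fitted probability 36/326
display 36*ln(36/326) + 290*ln(290/326)
* Compare with "Log likelihood = -113.2564" in the logistic output
```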

. desmat: poisson count defendant*death_penalty

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -102.923

LR chi square: 225.751

Model degrees of freedom: 3

Pseudo R-squared: 0.523

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count

defendant

1 white -0.055 0.117

death_penalty

2 1 -2.171** 0.256

defendant.death_penalty

3 white.1 0.166 0.354

4 _cons 4.311** 0.082

-------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 170.1642

Prob > chi2(4) = 0.0000

. *So it doesn't seem, at first blush, that there is a strong relationship between defendant's race and getting the death penalty (if anything, whites seem slightly more likely to get it), which is exactly what the simple descriptive statistics showed. Always start analyses with the simple descriptive stats.
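
*The interaction coefficient is just the log odds ratio from the marginal defendant by death_penalty table. A sketch of the check, with the death-sentence counts reconstructed from the first table (17 of 166 black defendants, 19 of 160 white defendants):

```stata
* Odds of a death sentence: white defendants 19/141, black defendants 17/149
display (19/141)/(17/149)
* Log odds ratio; compare with the white.1 coefficient of 0.166
display ln((19/141)/(17/149))
* Its standard error from the four cell counts; compare with 0.354
display sqrt(1/19 + 1/141 + 1/17 + 1/149)
```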

. desmat: poisson count defendant*death_penalty victim

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -86.695

LR chi square: 258.207

Model degrees of freedom: 4

Pseudo R-squared: 0.598

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count

defendant

1 white -0.055 0.117

death_penalty

2 1 -2.171** 0.256

defendant.death_penalty

3 white.1 0.166 0.354

victim

4 white 0.647** 0.117

5 _cons 3.936** 0.112

-------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 137.7078

Prob > chi2(3) = 0.0000

. *Still, the defendant*death_penalty interaction is insignificant.

. desmat: poisson count defendant*death_penalty victim*defendant

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -21.796

LR chi square: 388.005

Model degrees of freedom: 5

Pseudo R-squared: 0.899

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count

defendant

1 white -2.456** 0.350

death_penalty

2 1 -2.171** 0.256

defendant.death_penalty

3 white.1 0.166 0.354

victim

4 white -0.492** 0.160

victim.defendant

5 white.white 3.312** 0.379

6 _cons 4.527** 0.102

-------------------------------------------------------------------------------

* p < .05

** p < .01

. *There is a strong association between race of victim and race of defendant
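
*The size of that association can be read off the coefficient as an odds ratio. A sketch using the four victim by defendant frequencies from the first table:

```stata
* Cross-product ratio of the victim x defendant table
display (103*151)/(9*63)
* Its log should be close to the white.white coefficient of 3.312
display ln((103*151)/(9*63))
```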

. poisgof

Goodness-of-fit chi2 = 7.910102

Prob > chi2(2) = 0.0192

. desmat: poisson count death_penalty victim*defendant

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -21.907

LR chi square: 387.784

Model degrees of freedom: 4

Pseudo R-squared: 0.898

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count

death_penalty

1 1 -2.086** 0.177

victim

2 white -0.492** 0.160

defendant

3 white -2.438** 0.348

victim.defendant

4 white.white 3.312** 0.379

5 _cons 4.518** 0.100

-------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 8.131552

Prob > chi2(3) = 0.0434

. *This model fits the data reasonably well (deviance 8.13 on 3 df, down from 137.93 for the independence model, though p = 0.043 is still borderline). It includes only the interaction between defendant's race and victim's race, and no association between race (of either defendant or victim) and the death penalty, so we can consider the possibility that race doesn't play any role in who gets the death penalty, at least based on these data.
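
*The last two models are nested, so dropping defendant*death_penalty can be tested by differencing their deviances. A sketch with the poisgof values printed above:

```stata
* Change in deviance from dropping defendant*death_penalty (1 df)
display 8.131552 - 7.910102
* p-value for that change; a large value means the term can be dropped
display chi2tail(1, 8.131552 - 7.910102)
```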

. desmat: poisson count victim*defendant victim*death_penalty

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -18.782

LR chi square: 394.033

Model degrees of freedom: 5

Pseudo R-squared: 0.913

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count

victim

1 white -0.588** 0.164

defendant

2 white -2.438** 0.348

victim.defendant

3 white.white 3.312** 0.379

death_penalty

4 1 -2.872** 0.420

victim.death_penalty

5 white.1 1.058* 0.464

6 _cons 4.580** 0.101

-------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.881837

Prob > chi2(2) = 0.3903

. *This model is preferred by the LRT because it improves the goodness of fit by about 6.25 (8.13 - 1.88) on 1 df; the next command rounds that to 6.3.

. display chi2tail(1,6.3)

.0120738

. *So from a goodness of fit perspective these last two models both have something to recommend them.
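
*The same comparison can be made without hand arithmetic. A sketch assuming Stata 11 or later, where factor-variable notation (i. and ##) stands in for the desmat prefix used in this log; the data block and its 0/1 coding are reconstructed from the first table and are assumptions, as are the stored-estimate names m_vd and m_vd_vp.

```stata
clear
input victim defendant death_penalty count
0 0 0  97
0 0 1   6
0 1 0   9
0 1 1   0
1 0 0  52
1 0 1  11
1 1 0 132
1 1 1  19
end
* Model with victim*defendant but no race by death_penalty term
quietly poisson count i.death_penalty i.victim##i.defendant
estimates store m_vd
* Add the victim by death_penalty interaction
quietly poisson count i.victim##i.defendant i.victim##i.death_penalty
estimates store m_vd_vp
* Likelihood-ratio test of the added term (1 df)
lrtest m_vd m_vd_vp
```

*The reported chi2 should equal the difference between the two poisgof deviances.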

. *neither of these last two models, however, can be replicated by logistic regression.

. *Let's take a look at the saturated models from both logistic and loglinear,

> because that's the only logistic model that would account for D*V

. desmat: poisson count victim*defendant*death_penalty

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 8

Initial log likelihood: -215.798

Log likelihood: -17.841

LR chi square: 395.915

Model degrees of freedom: 7

Pseudo R-squared: 0.917

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count

victim

1 white -0.623** 0.172

defendant

2 white -2.377** 0.348

victim.defendant

3 white.white 3.309** 0.385

death_penalty

4 1 -2.783** 0.421

victim.death_penalty

5 white.1 1.230* 0.536

defendant.death_penalty

6 white.1 -14.711 2096.899

victim.defendant.death_penalty

7 white.white.1 14.326 2096.899

8 _cons 4.575** 0.102

-------------------------------------------------------------------------------

* p < .05

** p < .01

. *Here is where the zero in our data starts to choke us a bit.
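
*Why the estimates blow up: no white defendant with a black victim received a death sentence, so the saturated model has to push one fitted count to zero, which needs a coefficient heading toward minus infinity (hence -14.711 with a standard error near 2097). A sketch using the rounded coefficients printed above:

```stata
* Baseline cell (black victim, black defendant, no death sentence): about 97
display exp(4.575)
* Fitted count for black victim, white defendant, death sentence:
* _cons + defendant + death_penalty + defendant.death_penalty
display exp(4.575 - 2.377 - 2.783 - 14.711)
```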

. *One way around this problem, even if it seems a bit odd, is simply to add a constant to every cell.

. gen count_plus1=count+1

. desmat: poisson count_plus1 victim*defendant*death_penalty

-------------------------------------------------------------------------------

Poisson regression

-------------------------------------------------------------------------------

Dependent variable count_plus1

Optimization: ml

Number of observations: 8

Initial log likelihood: -209.216

Log likelihood: -19.054

LR chi square: 380.324

Model degrees of freedom: 7

Pseudo R-squared: 0.909

Prob: 0.000

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

count_plus1

victim

1 white -0.615** 0.171

defendant

2 white -2.282** 0.332

victim.defendant

3 white.white 3.202** 0.370

death_penalty

4 1 -2.639** 0.391

victim.death_penalty

5 white.1 1.154* 0.505

defendant.death_penalty

6 white.1 0.336 1.119

victim.defendant.death_penalty

7 white.white.1 -0.746 1.189

8 _cons 4.585** 0.101

-------------------------------------------------------------------------------

* p < .05

** p < .01

. *Replacing the count by count plus one makes the model look a lot nicer, and doesn't change any of the substantive interpretations.

. desmat: logistic death_penalty victim*defendant [fweight= count_plus1], coef

-------------------------------------------------------------------------------

logistic

-------------------------------------------------------------------------------

Dependent variable death_penalty

Number of observations: 334

fweight: count_plus1

Initial log likelihood: -122.393

Log likelihood: -119.485

LR chi square: 5.816

Model degrees of freedom: 3

Pseudo R-squared: 0.024

Prob: 0.121

-------------------------------------------------------------------------------

nr Effect Coeff s.e.

-------------------------------------------------------------------------------

victim

1 white 1.154* 0.505

defendant

2 white 0.336 1.119

victim.defendant

3 white.white -0.746 1.189

4 _cons -2.639** 0.391

-------------------------------------------------------------------------------

* p < .05

** p < .01

. lfit, table

Logistic model for death_penalty, goodness-of-fit test

+--------------------------------------------------------+

| Group | Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |

|-------+--------+-------+-------+-------+-------+-------|

| 1 | 0.0667 | 7 | 7.0 | 98 | 98.0 | 105 |

| 2 | 0.0909 | 1 | 1.0 | 10 | 10.0 | 11 |

| 3 | 0.1307 | 20 | 20.0 | 133 | 133.0 | 153 |

| 4 | 0.1846 | 12 | 12.0 | 53 | 53.0 | 65 |

+--------------------------------------------------------+

+-------------------------------------+

| Group | Prob | _x_1 | _x_2 | _x_3 |

|-------+--------+------+------+------|

| 1 | 0.0667 | 0 | 0 | 0 |

| 2 | 0.0909 | 0 | 1 | 0 |

| 3 | 0.1307 | 1 | 1 | 1 |

| 4 | 0.1846 | 1 | 0 | 0 |

+-------------------------------------+

number of observations = 334

number of covariate patterns = 4

Pearson chi2(0) = 0.00

Prob > chi2 = .
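
*The four fitted probabilities in the lfit table can be recovered from the logistic coefficients, which are the same numbers as the death_penalty terms of the saturated loglinear model fit to count_plus1. A sketch using the rounded coefficients printed above:

```stata
* Group 1: black victim, black defendant (7 of 105)
display invlogit(-2.639)
* Group 4: white victim, black defendant (12 of 65)
display invlogit(-2.639 + 1.154)
* Group 2: black victim, white defendant (1 of 11)
display invlogit(-2.639 + 0.336)
* Group 3: white victim, white defendant (20 of 153)
display invlogit(-2.639 + 1.154 + 0.336 - 0.746)
```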

. exit, clear