class 11 log

log type: text

opened on: 3 Nov 2003, 11:10:18

. set linesize 79

. use "HW2 with QS.dta", clear

. *Today we take a somewhat more systematic

. *approach to fitting these data.

. *One approach to this kind of data is the quasi-symmetry (QS) model.

. *This is described in Hout.

. *QS models suit two-way tables whose rows and columns share the same categories.
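For reference, a common way to write the quasi-symmetry model for a square husband-by-wife table is sketched below. The notation (m, lambda, delta) is our own shorthand and is not taken from the log or from Hout.

```latex
\log m_{ij} = \lambda + \lambda^{H}_{i} + \lambda^{W}_{j} + \delta_{ij},
\qquad \delta_{ij} = \delta_{ji}
```

Here $m_{ij}$ is the expected count of couples with husband in category $i$ and wife in category $j$. The row and column effects are free to differ, so the margins are not constrained; only the association terms $\delta_{ij}$ are forced to be symmetric. For an $I \times I$ table the model leaves $(I-1)(I-2)/2$ residual degrees of freedom, which is 6 when $I = 5$.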

. table husb wife, contents (mean QS)

-----------------------------------------------------------------------
           |                           wife
      husb |      black     mexican    oth hisp  all others       white
-----------+-----------------------------------------------------------
     black |          1          21          31          41          51
   mexican |         21           2          32          42          52
  oth hisp |         31          32           3          43          53
all others |         41          42          43           4          54
     white |         51          52          53          54           5
-----------------------------------------------------------------------

. *This is the full set of symmetric interactions.

. *There are 15 of them, but they are not all mutually independent.
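The log does not show how QS was created. The sketch below is one way a variable with this coding could be built, assuming husb and wife are coded 1 to 5 in the order shown in the table; the 25-cell grid is generated here only so the example runs on its own.

```stata
* Hypothetical reconstruction, not the course's own code.
* Build a 5 x 5 grid of cells (husb, wife each coded 1-5).
clear
set obs 25
gen byte husb = ceil(_n / 5)
gen byte wife = mod(_n - 1, 5) + 1

* Diagonal cells keep their own code (1-5); off-diagonal cells get a
* two-digit code that is identical for cell (i,j) and cell (j,i).
gen byte QS = cond(husb == wife, husb, 10 * max(husb, wife) + min(husb, wife))

* 15 distinct values: 5 diagonal cells plus 10 symmetric pairs
tabulate QS
```

Putting the larger category first in the two-digit code is what makes the cells above and below the diagonal share one value.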

. codebook QS

------------------------------------------------------------------------------------------

QS (unlabeled)

------------------------------------------------------------------------------------------

                  type:  numeric (byte)

                 range:  [1,54]                       units:  1
         unique values:  15                       missing .:  0/25

                  mean:      34.2
              std. dev:   18.6123

           percentiles:        10%       25%       50%       75%       90%
                                 3        21        41        51        53

. desmat husb wife QS

Desmat generated the following design matrix:

   nr  Variables        Term          Parameterization
       First   Last
    1  _x_1    _x_4     husb          ind(1)
    2  _x_5    _x_8     wife          ind(1)
    3  _x_9    _x_18    QS            ind(1)

. *desmat drops the collinear terms, leaving 10 QS terms (_x_9 to _x_18).

. desmat: poisson count husb wife QS

------------------------------------------------------------------------------------------

Poisson regression

------------------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 25

Initial log likelihood: -80138.505

Log likelihood: -89.596

LR chi square: 160097.818

Model degrees of freedom: 18

Pseudo R-squared: 0.999

Prob: 0.000

------------------------------------------------------------------------------------------

nr Effect Coeff s.e.

------------------------------------------------------------------------------------------

count

husb

1 mexican -0.431** 0.051

2 oth hisp -1.866** 0.065

3 all others -1.190** 0.057

4 white 0.625** 0.049

wife

5 mexican 0.399** 0.051

6 oth hisp -0.970** 0.065

7 all others -0.193** 0.057

8 white 1.319** 0.049

9 21 -4.596** 0.109

10 31 -3.814** 0.150

11 32 -1.956** 0.069

12 41 -4.323** 0.132

13 42 -3.148** 0.078

14 43 -3.314** 0.171

15 51 -4.274** 0.059

16 52 -2.284** 0.023

17 53 -2.047** 0.050

18 54 -2.550** 0.038

19 _cons 8.312** 0.016

------------------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.379208

Prob > chi2(6) = 0.9671

. *The quasi-symmetry model fits the data very well (chi2 = 1.38 on 6 df).

. *But we've lost our favorite endogamy terms; they've been replaced

. *by a full complement of off-diagonal symmetric interactions.
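The fit statistics can be checked by hand. The sketch below uses only numbers printed in the log; chi2tail() is Stata's upper-tail chi-squared function, and the parameter count (1 constant, 4 + 4 main effects, 10 symmetric interactions) is read off the coefficient table.

```stata
* Residual degrees of freedom: 25 cells minus 19 estimated parameters
display 25 - (1 + 4 + 4 + 10)

* Upper-tail p-value for the deviance reported by poisgof
display chi2tail(6, 1.379208)
```

In newer Stata releases, estat gof after poisson reports the same deviance test that poisgof gives here.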

. table husb wife, contents (mean QS)

-----------------------------------------------------------------------
           |                           wife
      husb |      black     mexican    oth hisp  all others       white
-----------+-----------------------------------------------------------
     black |          1          21          31          41          51
   mexican |         21           2          32          42          52
  oth hisp |         31          32           3          43          53
all others |         41          42          43           4          54
     white |         51          52          53          54           5
-----------------------------------------------------------------------

. table husb wife, contents (mean QS2)

-----------------------------------------------------------------------
           |                           wife
      husb |      black     mexican    oth hisp  all others       white
-----------+-----------------------------------------------------------
     black |          0          21          31           0          51
   mexican |         21           0          32           0          52
  oth hisp |         31          32           0           0           0
all others |          0           0           0           0           0
     white |         51          52           0           0           0
-----------------------------------------------------------------------

. desmat: poisson count husb wife race_endog QS2

------------------------------------------------------------------------------------------

Poisson regression

------------------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 25

Initial log likelihood: -80138.505

Log likelihood: -89.596

LR chi square: 160097.818

Model degrees of freedom: 18

Pseudo R-squared: 0.999

Prob: 0.000

------------------------------------------------------------------------------------------

nr Effect Coeff s.e.

------------------------------------------------------------------------------------------

count

husb

1 mexican 0.743** 0.152

2 oth hisp -0.858** 0.214

3 all others -0.684** 0.219

4 white 2.397** 0.136

wife

5 mexican 1.573** 0.165

6 oth hisp 0.039 0.223

7 all others 0.313 0.230

8 white 3.092** 0.150

race_endog

9 1 4.828** 0.314

10 2 2.480** 0.232

11 3 2.811** 0.186

12 4 3.817** 0.177

13 5 1.283** 0.175

QS2

14 21 -0.942** 0.253

15 31 0.006 0.200

16 32 0.690** 0.110

17 51 -1.219** 0.221

18 52 -0.402* 0.188

19 _cons 3.484** 0.313

------------------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.379208

Prob > chi2(6) = 0.9671

. *This presentation of the interactions is a bit more consistent

. *with the set of models from HW2, but the actual model is simply QS:

. *the log likelihood, chi2, and df match the quasi-symmetry model we ran above.
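The log does not show how race_endog and QS2 were defined. Judging from the QS2 table and the coefficient labels, they could be derived from QS roughly as follows; this is a hypothetical reconstruction that again assumes husb and wife are coded 1 to 5, and it rebuilds the grid so that it runs on its own.

```stata
* Hypothetical reconstruction, not the course's own code.
clear
set obs 25
gen byte husb = ceil(_n / 5)
gen byte wife = mod(_n - 1, 5) + 1
gen byte QS = cond(husb == wife, husb, 10 * max(husb, wife) + min(husb, wife))

* Endogamy: one level per diagonal cell, 0 everywhere off the diagonal
gen byte race_endog = cond(husb == wife, husb, 0)

* QS2: keep five of the ten off-diagonal pairs, set every other cell to 0
gen byte QS2 = cond(inlist(QS, 21, 31, 32, 51, 52), QS, 0)
```

Five endogamy terms plus five QS2 terms give the same 10 association parameters as the 10 QS terms that desmat kept, which is why the two fits are identical.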

. desmat husb wife race_endog QS2

Desmat generated the following design matrix:

   nr  Variables        Term          Parameterization
       First   Last
    1  _x_1    _x_4     husb          ind(1)
    2  _x_5    _x_8     wife          ind(1)
    3  _x_9    _x_13    race_endog    ind(0)
    4  _x_14   _x_18    QS2           ind(0)

. sw poisson count (_x_1-_x_8) _x_9-_x_18, forward pe(.01) pr(.1)

begin with empty model

p = 0.0000 < 0.0100 adding _x_1 _x_2 _x_3 _x_4 _x_5 _x_6 _x_7 _x_8

p = 0.0000 < 0.0100 adding _x_9

p = 0.0000 < 0.0100 adding _x_10

p = 0.0000 < 0.0100 adding _x_12

p = 0.0000 < 0.0100 adding _x_13

p = 0.0000 < 0.0100 adding _x_11

p = 0.0000 < 0.0100 adding _x_16

p = 0.0000 < 0.0100 adding _x_17

p = 0.0002 < 0.0100 adding _x_14

Poisson regression Number of obs = 25

LR chi2(16) = 160092.89

Prob > chi2 = 0.0000

Log likelihood = -92.058605 Pseudo R2 = 0.9989

------------------------------------------------------------------------------

count | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

_x_1 | .6580387 .1237163 5.32 0.000 .4155592 .9005183

_x_2 | -.5675102 .1336998 -4.24 0.000 -.829557 -.3054634

_x_3 | -.3744226 .1266318 -2.96 0.003 -.6226164 -.1262289

_x_4 | 2.387004 .1030538 23.16 0.000 2.185023 2.588986

_x_5 | 1.48623 .1386654 10.72 0.000 1.214451 1.758009

_x_6 | .3329503 .1481206 2.25 0.025 .0426393 .6232614

_x_7 | .6221063 .1419782 4.38 0.000 .3438341 .9003785

_x_8 | 3.080996 .1205447 25.56 0.000 2.844733 3.317259

_x_9 | 5.127894 .210949 24.31 0.000 4.714442 5.541347

_x_10 | 2.951956 .0833704 35.41 0.000 2.788553 3.115359

_x_12 | 3.497347 .0853671 40.97 0.000 3.330031 3.664663

_x_13 | 1.603523 .0796469 20.13 0.000 1.447418 1.759628

_x_11 | 2.526537 .1204165 20.98 0.000 2.290525 2.762549

_x_16 | .7836307 .1017227 7.70 0.000 .5842579 .9830034

_x_17 | -.9086142 .1332381 -6.82 0.000 -1.169756 -.6474723

_x_14 | -.5558246 .1472988 -3.77 0.000 -.844525 -.2671243

_cons | 3.184486 .2103664 15.14 0.000 2.772176 3.596797

------------------------------------------------------------------------------

. poisgof

Goodness-of-fit chi2 = 6.304709

Prob > chi2(8) = 0.6131

. desrep

------------------------------------------------------------------------------------------

Poisson regression

------------------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 25

Initial log likelihood: -80138.505

Log likelihood: -92.059

LR chi square: 160092.893

Model degrees of freedom: 16

Pseudo R-squared: 0.999

Prob: 0.000

------------------------------------------------------------------------------------------

nr Effect Coeff s.e.

------------------------------------------------------------------------------------------

count

husb

1 mexican 0.658** 0.124

2 oth hisp -0.568** 0.134

3 all others -0.374** 0.127

4 white 2.387** 0.103

wife

5 mexican 1.486** 0.139

6 oth hisp 0.333* 0.148

7 all others 0.622** 0.142

8 white 3.081** 0.121

race_endog

9 1 5.128** 0.211

10 2 2.952** 0.083

11 4 3.497** 0.085

12 5 1.604** 0.080

13 3 2.527** 0.120

QS2

14 32 0.784** 0.102

15 51 -0.909** 0.133

16 21 -0.556** 0.147

17 _cons 3.184** 0.210

------------------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 6.304709

Prob > chi2(8) = 0.6131

. *This is the result of a forward stepwise process with pe(.01): all five

. *endogamy terms and three QS2 terms (32, 51, 21) enter the model.

. sw poisson count (_x_1-_x_8) _x_9-_x_18, forward pe(.05) pr(.1)

begin with empty model

p = 0.0000 < 0.0500 adding _x_1 _x_2 _x_3 _x_4 _x_5 _x_6 _x_7 _x_8

p = 0.0000 < 0.0500 adding _x_9

p = 0.0000 < 0.0500 adding _x_10

p = 0.0000 < 0.0500 adding _x_12

p = 0.0000 < 0.0500 adding _x_13

p = 0.0000 < 0.0500 adding _x_11

p = 0.0000 < 0.0500 adding _x_16

p = 0.0000 < 0.0500 adding _x_17

p = 0.0002 < 0.0500 adding _x_14

p = 0.0326 < 0.0500 adding _x_18

Poisson regression Number of obs = 25

LR chi2(17) = 160097.82

Prob > chi2 = 0.0000

Log likelihood = -89.596292 Pseudo R2 = 0.9989

------------------------------------------------------------------------------

count | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

_x_1 | .7410304 .1282705 5.78 0.000 .4896249 .9924359

_x_2 | -.8602365 .196971 -4.37 0.000 -1.246293 -.4741804

_x_3 | -.6866007 .1978931 -3.47 0.001 -1.074464 -.2987374

_x_4 | 2.394867 .1030996 23.23 0.000 2.192795 2.596938

_x_5 | 1.570528 .1428993 10.99 0.000 1.290451 1.850606

_x_6 | .0367311 .2079401 0.18 0.860 -.370824 .4442863

_x_7 | .3096806 .2080673 1.49 0.137 -.0981238 .7174851

_x_8 | 3.089169 .1205587 25.62 0.000 2.852878 3.325459

_x_9 | 4.823029 .258398 18.67 0.000 4.316578 5.329479

_x_10 | 2.4798 .2316559 10.70 0.000 2.025763 2.933838

_x_12 | 3.817085 .1769228 21.57 0.000 3.470322 4.163847

_x_13 | 1.282621 .1746304 7.34 0.000 .940352 1.624891

_x_11 | 2.810617 .1856191 15.14 0.000 2.44681 3.174424

_x_16 | .6896763 .1091149 6.32 0.000 .475815 .9035376

_x_17 | -1.22155 .2023955 -6.04 0.000 -1.618238 -.8248617

_x_14 | -.9445918 .234645 -4.03 0.000 -1.404488 -.484696

_x_18 | -.4024037 .1883423 -2.14 0.033 -.7715477 -.0332596

_cons | 3.489352 .2579226 13.53 0.000 2.983833 3.994871

------------------------------------------------------------------------------

. desrep

------------------------------------------------------------------------------------------

Poisson regression

------------------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 25

Initial log likelihood: -80138.505

Log likelihood: -89.596

LR chi square: 160097.817

Model degrees of freedom: 17

Pseudo R-squared: 0.999

Prob: 0.000

------------------------------------------------------------------------------------------

nr Effect Coeff s.e.

------------------------------------------------------------------------------------------

count

husb

1 mexican 0.741** 0.128

2 oth hisp -0.860** 0.197

3 all others -0.687** 0.198

4 white 2.395** 0.103

wife

5 mexican 1.571** 0.143

6 oth hisp 0.037 0.208

7 all others 0.310 0.208

8 white 3.089** 0.121

race_endog

9 1 4.823** 0.258

10 2 2.480** 0.232

11 4 3.817** 0.177

12 5 1.283** 0.175

13 3 2.811** 0.186

QS2

14 32 0.690** 0.109

15 51 -1.222** 0.202

16 21 -0.945** 0.235

17 52 -0.402* 0.188

18 _cons 3.489** 0.258

------------------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.380085

Prob > chi2(7) = 0.9862

. *If we loosen the threshold for entry into the model, from pe(.01) to pe(.05),

. *one more term enters: _x_18 (QS2 = 52, p = 0.0326).
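A likelihood-ratio comparison of the two forward models can be sketched from the deviances printed by poisgof (6.304709 on 8 df versus 1.380085 on 7 df). This is a hand calculation added for illustration, and the scalar name dev_diff is our own; sw itself uses Wald tests unless the lr option is given.

```stata
* Deviance drop from adding the QS2 = 52 term (_x_18), on 1 df
scalar dev_diff = 6.304709 - 1.380085
display dev_diff

* Upper-tail p-value for that drop
display chi2tail(1, dev_diff)
```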

. sw poisson count (_x_1-_x_8) _x_9-_x_18, backward pe(.05) pr(.1)

backward not allowed

r(198);

. sw poisson count (_x_1-_x_8) _x_9-_x_18, pr(.1)

begin with full model

p = 0.9764 >= 0.1000 removing _x_15

Poisson regression Number of obs = 25

LR chi2(17) = 160097.82

Prob > chi2 = 0.0000

Log likelihood = -89.596292 Pseudo R2 = 0.9989

------------------------------------------------------------------------------

count | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

_x_1 | .7410304 .1282705 5.78 0.000 .4896249 .9924359

_x_2 | -.8602365 .196971 -4.37 0.000 -1.246293 -.4741804

_x_3 | -.6866007 .1978931 -3.47 0.001 -1.074464 -.2987374

_x_4 | 2.394867 .1030996 23.23 0.000 2.192795 2.596938

_x_5 | 1.570528 .1428993 10.99 0.000 1.290451 1.850606

_x_6 | .0367311 .2079401 0.18 0.860 -.370824 .4442863

_x_7 | .3096806 .2080673 1.49 0.137 -.0981238 .7174851

_x_8 | 3.089169 .1205587 25.62 0.000 2.852878 3.325459

_x_9 | 4.823029 .258398 18.67 0.000 4.316578 5.329479

_x_10 | 2.4798 .2316559 10.70 0.000 2.025763 2.933838

_x_11 | 2.810617 .1856191 15.14 0.000 2.44681 3.174424

_x_12 | 3.817085 .1769228 21.57 0.000 3.470322 4.163847

_x_13 | 1.282621 .1746304 7.34 0.000 .940352 1.624891

_x_14 | -.9445918 .234645 -4.03 0.000 -1.404488 -.484696

_x_18 | -.4024037 .1883423 -2.14 0.033 -.7715477 -.0332596

_x_16 | .6896763 .1091149 6.32 0.000 .475815 .9035376

_x_17 | -1.22155 .2023955 -6.04 0.000 -1.618238 -.8248617

_cons | 3.489352 .2579226 13.53 0.000 2.983833 3.994871

------------------------------------------------------------------------------

. desrep

------------------------------------------------------------------------------------------

Poisson regression

------------------------------------------------------------------------------------------

Dependent variable count

Optimization: ml

Number of observations: 25

Initial log likelihood: -80138.505

Log likelihood: -89.596

LR chi square: 160097.817

Model degrees of freedom: 17

Pseudo R-squared: 0.999

Prob: 0.000

------------------------------------------------------------------------------------------

nr Effect Coeff s.e.

------------------------------------------------------------------------------------------

count

husb

1 mexican 0.741** 0.128

2 oth hisp -0.860** 0.197

3 all others -0.687** 0.198

4 white 2.395** 0.103

wife

5 mexican 1.571** 0.143

6 oth hisp 0.037 0.208

7 all others 0.310 0.208

8 white 3.089** 0.121

race_endog

9 1 4.823** 0.258

10 2 2.480** 0.232

11 3 2.811** 0.186

12 4 3.817** 0.177

13 5 1.283** 0.175

QS2

14 21 -0.945** 0.235

15 52 -0.402* 0.188

16 32 0.690** 0.109

17 51 -1.222** 0.202

18 _cons 3.489** 0.258

------------------------------------------------------------------------------------------

* p < .05

** p < .01

. poisgof

Goodness-of-fit chi2 = 1.380085

Prob > chi2(7) = 0.9862

. *In this case the backward and forward searches led to the same model.

*Backward (pr only) means starting with the full model and dropping nonsignificant terms.

*Forward stepwise means starting with some base model and adding terms that are significant.
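The option combinations that select each search strategy are summarized below, followed by a small runnable sketch. The sketch is an illustration under assumptions: it uses Stata's bundled auto dataset and the prefix syntax of newer releases (stepwise, ...:) rather than the sw form used in this log, and the regression itself is arbitrary.

```stata
* Search strategy       Options
* forward selection     pe(#)
* backward selection    pr(#)
* forward stepwise      pr(#) pe(#) forward
* backward stepwise     pr(#) pe(#)

sysuse auto, clear

* Backward selection: start from the full model, drop terms with p >= .1
stepwise, pr(.1): regress price mpg weight length foreign

* Forward stepwise: variables in parentheses are treated as one term, and
* lockterm1 forces that first term to stay in the model
stepwise, pr(.1) pe(.05) forward lockterm1: regress price (weight) mpg length foreign
```

The commands in the log group _x_1-_x_8 in parentheses but do not use lockterm1, so the main-effects block was tested like any other term; it entered first because its p-value was effectively zero.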

. exit, clear