log type:  text

 opened on:   3 Nov 2003, 11:10:18

 

. set linesize 79

 

. use "HW2 with QS.dta", clear

 

. *Today we cover a more systematic approach to fitting these data.

. *One approach to this kind of table is the quasi-symmetry (QS) model,

. *which is described in Hout.

. *QS models suit two-dimensional tables whose row and column variables

. *share the same categories, as husb and wife do here.

. table husb wife, contents (mean QS)

 

-----------------------------------------------------------------------

           |                            wife                          

      husb |      black     mexican    oth hisp  all others       white

-----------+-----------------------------------------------------------

     black |          1          21          31          41          51

   mexican |         21           2          32          42          52

  oth hisp |         31          32           3          43          53

all others |         41          42          43           4          54

     white |         51          52          53          54           5

-----------------------------------------------------------------------

 

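. *This is the full set of symmetric interactions: cells (i,j) and (j,i)

. *share one QS value. There are 15 values (5 diagonal, 10 off-diagonal),

. *but they are not all mutually independent once husb and wife are included.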
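
The QS coding tabulated above can be reproduced from the row and column codes. This is a minimal sketch, assuming husb and wife are stored as integers 1 to 5 in the order shown in the table (the log does not show how QS was actually built). The 5x5 grid and the name QS_chk are illustrative only.

```stata
* Build an illustrative 5x5 grid of husb/wife codes (assumed coding 1-5)
clear
set obs 25
generate byte husb = ceil(_n/5)
generate byte wife = mod(_n - 1, 5) + 1

* Diagonal cells keep the category code; off-diagonal cells (i,j) and (j,i)
* share the value 10*larger + smaller, e.g. 21 for the black/mexican pair
generate byte QS_chk = cond(husb == wife, husb, 10*max(husb, wife) + min(husb, wife))

* Display in the same layout as the table above
tabdisp husb wife, cellvar(QS_chk)
```

Because the code depends only on the larger and smaller of the two categories, the resulting matrix is symmetric by construction.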

. codebook QS

 

------------------------------------------------------------------------------------------

QS                                                                             (unlabeled)

------------------------------------------------------------------------------------------

 

                  type:  numeric (byte)

 

                 range:  [1,54]                       units:  1

         unique values:  15                       missing .:  0/25

 

                  mean:      34.2

              std. dev:   18.6123

 

           percentiles:        10%       25%       50%       75%       90%

                                 3        21        41        51        53

 

. desmat husb wife QS

 

Desmat generated the following design matrix:

 

nr   Variables       Term                        Parameterization

     First    Last

 

 1    _x_1    _x_4   husb                        ind(1)

 2    _x_5    _x_8   wife                        ind(1)

 3    _x_9   _x_18   QS                          ind(1)

 

. *desmat drops the collinear terms, so QS contributes only 10 columns

. *(_x_9 to _x_18).

. desmat: poisson count husb wife QS

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                               count

   Optimization:                                                                       ml

   Number of observations:                                                             25

   Initial log likelihood:                                                     -80138.505

   Log likelihood:                                                                -89.596

   LR chi square:                                                              160097.818

   Model degrees of freedom:                                                           18

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                               Coeff        s.e.

------------------------------------------------------------------------------------------

   count

     husb

1      mexican                                                         -0.431**     0.051

2      oth hisp                                                        -1.866**     0.065

3      all others                                                      -1.190**     0.057

4      white                                                            0.625**     0.049

     wife

5      mexican                                                          0.399**     0.051

6      oth hisp                                                        -0.970**     0.065

7      all others                                                      -0.193**     0.057

8      white                                                            1.319**     0.049

     QS

9      21                                                              -4.596**     0.109

10     31                                                              -3.814**     0.150

11     32                                                              -1.956**     0.069

12     41                                                              -4.323**     0.132

13     42                                                              -3.148**     0.078

14     43                                                              -3.314**     0.171

15     51                                                              -4.274**     0.059

16     52                                                              -2.284**     0.023

17     53                                                              -2.047**     0.050

18     54                                                              -2.550**     0.038

19   _cons                                                              8.312**     0.016

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  1.379208

         Prob > chi2(6)        =    0.9671

 

. *The quasi-symmetry model fits the data very well (chi2 = 1.38 on 6 df,

. *p = 0.97), but we've lost our favorite endogamy terms: they've been

. *replaced by a full complement of off-diagonal symmetric interactions.
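
The 6 degrees of freedom reported by poisgof follow from counting parameters in a 5x5 table. Here is a small sketch of the arithmetic; the local macro I is just a convenience name, not something from the session.

```stata
* Number of categories on each side of the square table
local I = 5

* Cells in the table
display "cells       = " `I'^2
* Constant + husb and wife main effects + off-diagonal symmetric pairs
display "parameters  = " 1 + 2*(`I' - 1) + `I'*(`I' - 1)/2
* Residual degrees of freedom for quasi-symmetry: (I-1)(I-2)/2
display "residual df = " (`I' - 1)*(`I' - 2)/2
```

With I = 5 this gives 25 cells, 19 parameters (the 18 model degrees of freedom plus the constant) and 6 residual degrees of freedom, matching the output above.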

. table husb wife, contents (mean QS)

 

-----------------------------------------------------------------------

           |                            wife                          

      husb |      black     mexican    oth hisp  all others       white

-----------+-----------------------------------------------------------

     black |          1          21          31          41          51

   mexican |         21           2          32          42          52

  oth hisp |         31          32           3          43          53

all others |         41          42          43           4          54

     white |         51          52          53          54           5

-----------------------------------------------------------------------

 

. table husb wife, contents (mean QS2)

 

-----------------------------------------------------------------------

           |                            wife                          

      husb |      black     mexican    oth hisp  all others       white

-----------+-----------------------------------------------------------

     black |          0          21          31           0          51

   mexican |         21           0          32           0          52

  oth hisp |         31          32           0           0           0

all others |          0           0           0           0           0

     white |         51          52           0           0           0

-----------------------------------------------------------------------

 

. desmat: poisson count husb wife  race_endog QS2

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                               count

   Optimization:                                                                       ml

   Number of observations:                                                             25

   Initial log likelihood:                                                     -80138.505

   Log likelihood:                                                                -89.596

   LR chi square:                                                              160097.818

   Model degrees of freedom:                                                           18

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                               Coeff        s.e.

------------------------------------------------------------------------------------------

   count

     husb

1      mexican                                                          0.743**     0.152

2      oth hisp                                                        -0.858**     0.214

3      all others                                                      -0.684**     0.219

4      white                                                            2.397**     0.136

     wife

5      mexican                                                          1.573**     0.165

6      oth hisp                                                         0.039       0.223

7      all others                                                       0.313       0.230

8      white                                                            3.092**     0.150

     race_endog

9      1                                                                4.828**     0.314

10     2                                                                2.480**     0.232

11     3                                                                2.811**     0.186

12     4                                                                3.817**     0.177

13     5                                                                1.283**     0.175

     QS2

14     21                                                              -0.942**     0.253

15     31                                                               0.006       0.200

16     32                                                               0.690**     0.110

17     51                                                              -1.219**     0.221

18     52                                                              -0.402*      0.188

19   _cons                                                              3.484**     0.313

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  1.379208

         Prob > chi2(6)        =    0.9671

 

. *This presentation of the interactions is more consistent with the set

. *of models from HW2, but the model itself is still QS: the log likelihood

. *(-89.596) and goodness of fit match the quasi-symmetry model run above.
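
The equivalence rests on how the two interaction variables split up the QS codes. Below is a sketch of one way they could be derived. It assumes race_endog holds the category code on the diagonal and 0 elsewhere; the log never tabulates race_endog, so that part is a guess. The QS2 values are taken from the table above. The grid and the _chk names are illustrative.

```stata
* Illustrative 5x5 grid with the QS coding (assumed husb/wife coding 1-5)
clear
set obs 25
generate byte husb = ceil(_n/5)
generate byte wife = mod(_n - 1, 5) + 1
generate byte QS_chk = cond(husb == wife, husb, 10*max(husb, wife) + min(husb, wife))

* Assumed endogamy coding: category code on the diagonal, 0 off the diagonal
generate byte race_endog_chk = cond(husb == wife, husb, 0)

* QS2 keeps only the five symmetric pairs shown in the QS2 table, 0 elsewhere
generate byte QS2_chk = cond(inlist(QS_chk, 21, 31, 32, 51, 52), QS_chk, 0)

tabdisp husb wife, cellvar(QS2_chk)
```

Five endogamy terms plus five QS2 terms give 10 interaction parameters, the same number as the 10 off-diagonal QS terms, which is consistent with the identical fit.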

. desmat husb wife race_endog QS2

 

Desmat generated the following design matrix:

 

nr   Variables       Term                        Parameterization

     First    Last

 

 1    _x_1    _x_4   husb                        ind(1)

 2    _x_5    _x_8   wife                        ind(1)

 3    _x_9   _x_13   race_endog                  ind(0)

 4   _x_14   _x_18   QS2                         ind(0)

 

. sw poisson count (_x_1-_x_8) _x_9-_x_18, forward pe(.01) pr(.1)

                      begin with empty model

p = 0.0000 <  0.0100  adding   _x_1 _x_2 _x_3 _x_4 _x_5 _x_6 _x_7 _x_8

p = 0.0000 <  0.0100  adding   _x_9

p = 0.0000 <  0.0100  adding   _x_10

p = 0.0000 <  0.0100  adding   _x_12

p = 0.0000 <  0.0100  adding   _x_13

p = 0.0000 <  0.0100  adding   _x_11

p = 0.0000 <  0.0100  adding   _x_16

p = 0.0000 <  0.0100  adding   _x_17

p = 0.0002 <  0.0100  adding   _x_14

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(16)     =  160092.89

                                                  Prob > chi2     =     0.0000

Log likelihood = -92.058605                       Pseudo R2       =     0.9989

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        _x_1 |   .6580387   .1237163     5.32   0.000     .4155592    .9005183

        _x_2 |  -.5675102   .1336998    -4.24   0.000     -.829557   -.3054634

        _x_3 |  -.3744226   .1266318    -2.96   0.003    -.6226164   -.1262289

        _x_4 |   2.387004   .1030538    23.16   0.000     2.185023    2.588986

        _x_5 |    1.48623   .1386654    10.72   0.000     1.214451    1.758009

        _x_6 |   .3329503   .1481206     2.25   0.025     .0426393    .6232614

        _x_7 |   .6221063   .1419782     4.38   0.000     .3438341    .9003785

        _x_8 |   3.080996   .1205447    25.56   0.000     2.844733    3.317259

        _x_9 |   5.127894    .210949    24.31   0.000     4.714442    5.541347

       _x_10 |   2.951956   .0833704    35.41   0.000     2.788553    3.115359

       _x_12 |   3.497347   .0853671    40.97   0.000     3.330031    3.664663

       _x_13 |   1.603523   .0796469    20.13   0.000     1.447418    1.759628

       _x_11 |   2.526537   .1204165    20.98   0.000     2.290525    2.762549

       _x_16 |   .7836307   .1017227     7.70   0.000     .5842579    .9830034

       _x_17 |  -.9086142   .1332381    -6.82   0.000    -1.169756   -.6474723

       _x_14 |  -.5558246   .1472988    -3.77   0.000     -.844525   -.2671243

       _cons |   3.184486   .2103664    15.14   0.000     2.772176    3.596797

------------------------------------------------------------------------------

 

. poisgof

 

         Goodness-of-fit chi2  =  6.304709

         Prob > chi2(8)        =    0.6131

 

. desrep

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                               count

   Optimization:                                                                       ml

   Number of observations:                                                             25

   Initial log likelihood:                                                     -80138.505

   Log likelihood:                                                                -92.059

   LR chi square:                                                              160092.893

   Model degrees of freedom:                                                           16

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                               Coeff        s.e.

------------------------------------------------------------------------------------------

   count

     husb

1      mexican                                                          0.658**     0.124

2      oth hisp                                                        -0.568**     0.134

3      all others                                                      -0.374**     0.127

4      white                                                            2.387**     0.103

     wife

5      mexican                                                          1.486**     0.139

6      oth hisp                                                         0.333*      0.148

7      all others                                                       0.622**     0.142

8      white                                                            3.081**     0.121

     race_endog

9      1                                                                5.128**     0.211

10     2                                                                2.952**     0.083

11     4                                                                3.497**     0.085

12     5                                                                1.604**     0.080

13     3                                                                2.527**     0.120

     QS2

14     32                                                               0.784**     0.102

15     51                                                              -0.909**     0.133

16     21                                                              -0.556**     0.147

17   _cons                                                              3.184**     0.210

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  6.304709

         Prob > chi2(8)        =    0.6131

 

. *This is the result of a forward stepwise process with entry threshold

. *pe(.01): only three of the QS2 terms (32, 51 and 21) enter the model.

. sw poisson count (_x_1-_x_8) _x_9-_x_18, forward pe(.05) pr(.1)

                      begin with empty model

p = 0.0000 <  0.0500  adding   _x_1 _x_2 _x_3 _x_4 _x_5 _x_6 _x_7 _x_8

p = 0.0000 <  0.0500  adding   _x_9

p = 0.0000 <  0.0500  adding   _x_10

p = 0.0000 <  0.0500  adding   _x_12

p = 0.0000 <  0.0500  adding   _x_13

p = 0.0000 <  0.0500  adding   _x_11

p = 0.0000 <  0.0500  adding   _x_16

p = 0.0000 <  0.0500  adding   _x_17

p = 0.0002 <  0.0500  adding   _x_14

p = 0.0326 <  0.0500  adding   _x_18

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(17)     =  160097.82

                                                  Prob > chi2     =     0.0000

Log likelihood = -89.596292                       Pseudo R2       =     0.9989

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        _x_1 |   .7410304   .1282705     5.78   0.000     .4896249    .9924359

        _x_2 |  -.8602365    .196971    -4.37   0.000    -1.246293   -.4741804

        _x_3 |  -.6866007   .1978931    -3.47   0.001    -1.074464   -.2987374

        _x_4 |   2.394867   .1030996    23.23   0.000     2.192795    2.596938

        _x_5 |   1.570528   .1428993    10.99   0.000     1.290451    1.850606

        _x_6 |   .0367311   .2079401     0.18   0.860     -.370824    .4442863

        _x_7 |   .3096806   .2080673     1.49   0.137    -.0981238    .7174851

        _x_8 |   3.089169   .1205587    25.62   0.000     2.852878    3.325459

        _x_9 |   4.823029    .258398    18.67   0.000     4.316578    5.329479

       _x_10 |     2.4798   .2316559    10.70   0.000     2.025763    2.933838

       _x_12 |   3.817085   .1769228    21.57   0.000     3.470322    4.163847

       _x_13 |   1.282621   .1746304     7.34   0.000      .940352    1.624891

       _x_11 |   2.810617   .1856191    15.14   0.000      2.44681    3.174424

       _x_16 |   .6896763   .1091149     6.32   0.000      .475815    .9035376

       _x_17 |   -1.22155   .2023955    -6.04   0.000    -1.618238   -.8248617

       _x_14 |  -.9445918    .234645    -4.03   0.000    -1.404488    -.484696

       _x_18 |  -.4024037   .1883423    -2.14   0.033    -.7715477   -.0332596

       _cons |   3.489352   .2579226    13.53   0.000     2.983833    3.994871

------------------------------------------------------------------------------

 

. desrep

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                               count

   Optimization:                                                                       ml

   Number of observations:                                                             25

   Initial log likelihood:                                                     -80138.505

   Log likelihood:                                                                -89.596

   LR chi square:                                                              160097.817

   Model degrees of freedom:                                                           17

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                               Coeff        s.e.

------------------------------------------------------------------------------------------

   count

     husb

1      mexican                                                          0.741**     0.128

2      oth hisp                                                        -0.860**     0.197

3      all others                                                      -0.687**     0.198

4      white                                                            2.395**     0.103

     wife

5      mexican                                                          1.571**     0.143

6      oth hisp                                                         0.037       0.208

7      all others                                                       0.310       0.208

8      white                                                            3.089**     0.121

     race_endog

9      1                                                                4.823**     0.258

10     2                                                                2.480**     0.232

11     4                                                                3.817**     0.177

12     5                                                                1.283**     0.175

13     3                                                                2.811**     0.186

     QS2

14     32                                                               0.690**     0.109

15     51                                                              -1.222**     0.202

16     21                                                              -0.945**     0.235

17     52                                                              -0.402*      0.188

18   _cons                                                              3.489**     0.258

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  1.380085

         Prob > chi2(7)        =    0.9862

 

. *If we relax the threshold for entry into the model from pe(.01) to

. *pe(.05), we get one more term: _x_18 (QS2 = 52), entered at p = 0.0326.
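
The p-values that sw prints are Wald tests unless the lr option is given. As a cross-check, the two forward models can also be compared with a likelihood-ratio test computed by hand from the log likelihoods reported above. This is only a sketch; the scalar names are chosen for illustration.

```stata
* Log likelihoods copied from the two forward stepwise runs above
scalar ll_pe01 = -92.058605
scalar ll_pe05 = -89.596292

* LR statistic for the one extra term (_x_18), 1 degree of freedom
scalar lr_stat = 2*(ll_pe05 - ll_pe01)
display "LR chi2(1) = " lr_stat
display "Prob > chi2 = " chi2tail(1, lr_stat)
```

The statistic equals the difference between the two poisgof deviances (6.304709 - 1.380085) up to rounding.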

. sw poisson count (_x_1-_x_8) _x_9-_x_18, backward pe(.05) pr(.1)

backward not allowed

r(198);

 

. sw poisson count (_x_1-_x_8) _x_9-_x_18, pr(.1)

                      begin with full model

p = 0.9764 >= 0.1000  removing _x_15

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(17)     =  160097.82

                                                  Prob > chi2     =     0.0000

Log likelihood = -89.596292                       Pseudo R2       =     0.9989

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

        _x_1 |   .7410304   .1282705     5.78   0.000     .4896249    .9924359

        _x_2 |  -.8602365    .196971    -4.37   0.000    -1.246293   -.4741804

        _x_3 |  -.6866007   .1978931    -3.47   0.001    -1.074464   -.2987374

        _x_4 |   2.394867   .1030996    23.23   0.000     2.192795    2.596938

        _x_5 |   1.570528   .1428993    10.99   0.000     1.290451    1.850606

        _x_6 |   .0367311   .2079401     0.18   0.860     -.370824    .4442863

        _x_7 |   .3096806   .2080673     1.49   0.137    -.0981238    .7174851

        _x_8 |   3.089169   .1205587    25.62   0.000     2.852878    3.325459

        _x_9 |   4.823029    .258398    18.67   0.000     4.316578    5.329479

       _x_10 |     2.4798   .2316559    10.70   0.000     2.025763    2.933838

       _x_11 |   2.810617   .1856191    15.14   0.000      2.44681    3.174424

       _x_12 |   3.817085   .1769228    21.57   0.000     3.470322    4.163847

       _x_13 |   1.282621   .1746304     7.34   0.000      .940352    1.624891

       _x_14 |  -.9445918    .234645    -4.03   0.000    -1.404488    -.484696

       _x_18 |  -.4024037   .1883423    -2.14   0.033    -.7715477   -.0332596

       _x_16 |   .6896763   .1091149     6.32   0.000      .475815    .9035376

       _x_17 |   -1.22155   .2023955    -6.04   0.000    -1.618238   -.8248617

       _cons |   3.489352   .2579226    13.53   0.000     2.983833    3.994871

------------------------------------------------------------------------------

 

. desrep

------------------------------------------------------------------------------------------

   Poisson regression

------------------------------------------------------------------------------------------

   Dependent variable                                                               count

   Optimization:                                                                       ml

   Number of observations:                                                             25

   Initial log likelihood:                                                     -80138.505

   Log likelihood:                                                                -89.596

   LR chi square:                                                              160097.817

   Model degrees of freedom:                                                           17

   Pseudo R-squared:                                                                0.999

   Prob:                                                                            0.000

------------------------------------------------------------------------------------------

nr Effect                                                               Coeff        s.e.

------------------------------------------------------------------------------------------

   count

     husb

1      mexican                                                          0.741**     0.128

2      oth hisp                                                        -0.860**     0.197

3      all others                                                      -0.687**     0.198

4      white                                                            2.395**     0.103

     wife

5      mexican                                                          1.571**     0.143

6      oth hisp                                                         0.037       0.208

7      all others                                                       0.310       0.208

8      white                                                            3.089**     0.121

     race_endog

9      1                                                                4.823**     0.258

10     2                                                                2.480**     0.232

11     3                                                                2.811**     0.186

12     4                                                                3.817**     0.177

13     5                                                                1.283**     0.175

     QS2

14     21                                                              -0.945**     0.235

15     52                                                              -0.402*      0.188

16     32                                                               0.690**     0.109

17     51                                                              -1.222**     0.202

18   _cons                                                              3.489**     0.258

------------------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  1.380085

         Prob > chi2(7)        =    0.9862

 

. *In this case the backward and forward stepwise processes led to the

. *same model (the forward run with pe(.05)).

. *Backward stepwise means starting with the full model and throwing away

. *insignificant terms; sw does this when pr() is given without forward.

. *Forward stepwise means starting with some base model and adding terms

. *that are significant.

. exit, clear