-----------------------------------------------------------------------------------------------------

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_382_stuff\logs\loglin w 5by5 intermar.log

  log type:  text

 opened on:  29 Jan 2019, 10:18:10

 

. table meth_num feth_num, contents (sum count) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |      42521         291         412         393        2064       45681

   Mexican American |         94       18088         612         433        6067       25294

     Hispanic Other |        310         633        5901         258        4507       11609

 Non Hispanic Other |        101         317         214        3509        3959        8100

 White non Hispanic |        615        5338        4403        5505      543276      559137

                    |

              Total |      43641       24667       11542       10098      559873      649821

--------------------------------------------------------------------------------------------

 

* The data are national US marriage cross-classifications pooled over three censuses: 1970, 1980, and 1990. The sample is very large (N = 649,821), so the power to distinguish between competing hypotheses is very high.

 

1) The very unappealing constant-only model.

 

. poisson count

 

Iteration 0:   log likelihood = -1576481.7 

Iteration 1:   log likelihood = -1576481.7 

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(0)      =      -0.00

                                                  Prob > chi2     =          .

Log likelihood = -1576481.7                       Pseudo R2       =    -0.0000

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

       _cons |   10.16558   .0012405  8194.62   0.000     10.16315    10.16801

------------------------------------------------------------------------------

 

. display ln(649821/25)

10.165576

 

. predict constant_only_class

(option n assumed; predicted number of events)

 

. table meth_num feth_num, contents (sum count sum constant_only_class ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |      42521         291         412         393        2064       45681

                    |   25992.84    25992.84    25992.84    25992.84    25992.84    129964.2

                    |

   Mexican American |         94       18088         612         433        6067       25294

                    |   25992.84    25992.84    25992.84    25992.84    25992.84    129964.2

                    |

     Hispanic Other |        310         633        5901         258        4507       11609

                    |   25992.84    25992.84    25992.84    25992.84    25992.84    129964.2

                    |

 Non Hispanic Other |        101         317         214        3509        3959        8100

                    |   25992.84    25992.84    25992.84    25992.84    25992.84    129964.2

                    |

 White non Hispanic |        615        5338        4403        5505      543276      559137

                    |   25992.84    25992.84    25992.84    25992.84    25992.84    129964.2

                    |

              Total |      43641       24667       11542       10098      559873      649821

                    |   129964.2    129964.2    129964.2    129964.2    129964.2      649821

--------------------------------------------------------------------------------------------

 

* The constant-only model fits the data in only one place: the grand total of 649,821. Every cell gets the same predicted count, 649,821/25 = 25,992.84.

 

. poisgof

 

         Deviance goodness-of-fit =   3152733

         Prob > chi2(24)          =    0.0000

 

         Pearson goodness-of-fit  =  1.08e+07

         Prob > chi2(24)          =    0.0000

 

* The constant-only model fits terribly, as shown by the fitted values printed under the actual counts above and by the goodness-of-fit chi-square statistics. The null hypothesis here is that the constant-only model fits as well as the saturated model (i.e. the actual data). This null hypothesis is resoundingly rejected. Note also that the deviance (likelihood ratio) chi-square and the Pearson chi-square give the same substantive answer (rejection of the null hypothesis), but the statistics themselves differ by a factor of more than 3 (about 3.15 million versus 10.8 million). I would guess that the poor fit of the model makes one or both of the statistics perform poorly.
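* A minimal sketch of where the two poisgof statistics above come from, assuming the variables count and constant_only_class from this session. The names g2_part and x2_part are made up for illustration. Summing the cell contributions should reproduce the deviance and Pearson statistics reported by poisgof, up to rounding.

```stata
* Cell contributions to the Poisson deviance: 2*[y*ln(y/mu) - (y - mu)].
* Every cell in this table has count > 0, so ln() is defined throughout.
generate double g2_part = 2 * (count * ln(count / constant_only_class) - (count - constant_only_class))

* Cell contributions to the Pearson chi-square: (y - mu)^2 / mu.
generate double x2_part = (count - constant_only_class)^2 / constant_only_class

* Sum each set of contributions over the 25 cells.
quietly summarize g2_part
display "Deviance (G2) = " %12.0f r(sum)

quietly summarize x2_part
display "Pearson (X2)  = " %12.0f r(sum)
```

* The Pearson statistic squares each raw gap and divides by a small fitted value, so one badly fit cell (here the white/white cell) can dominate it.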

 

. gen ID_const_class=(50/649821)*(abs(count- constant_only_class ))

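* The formula for the cell-by-cell ID (index of dissimilarity) score is (50/N)*abs(actual-predicted). Summing the scores over all cells gives the ID as a percentage: 100 * (1/2) * sum(abs(actual-predicted)) / N.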
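* A hedged sketch of the same ID calculation that returns one number instead of a table, assuming count and constant_only_class as defined above. The names absdiff_const, n_total, and id_const are hypothetical and chosen only for this example.

```stata
* Absolute gap between actual and fitted counts, cell by cell.
generate double absdiff_const = abs(count - constant_only_class)

* Grand total N, taken from the data rather than typed in by hand.
quietly summarize count
scalar n_total = r(sum)

* ID in percent: 100 * (1/2) * sum|actual - fitted| / N.
quietly summarize absdiff_const
scalar id_const = 50 * r(sum) / scalar(n_total)
display "ID (constant-only model) = " %6.2f scalar(id_const) "%"
```

* Swapping in another fitted-count variable for constant_only_class gives the ID for that model.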

 

. table meth_num feth_num, contents (sum ID_const_class ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |   1.271747    1.977609    1.968299    1.969761    1.841187    9.028603

   Mexican American |   1.992767    .6082321     1.95291    1.966683    1.533179    8.053772

     Hispanic Other |   1.976147    1.951294    1.545952    1.980148    1.653212    9.106754

 Non Hispanic Other |   1.992229    1.975609    1.983534    1.730003    1.695378    9.376751

 White non Hispanic |   1.952679    1.589272    1.661214    1.576422    39.80197    46.58156

                    |

              Total |    9.18557    8.102016    9.111909    9.223017    46.52493    82.14744

--------------------------------------------------------------------------------------------

* Summing the cell scores gives an ID of 82.1, meaning that about 82% of the cases are misclassified, i.e. they would have to be moved to a different cell for the predicted counts to match the actual counts.

 

 

. drop ID_indep ID_constant ID_dichot_endog ID_quasi_indep

 

2) The independence model:

 

. poisson count i.meth_num i.feth_num

 

Iteration 0:   log likelihood = -300152.18 

Iteration 1:   log likelihood = -228156.87 

Iteration 2:   log likelihood = -225010.67 

Iteration 3:   log likelihood = -225002.47 

Iteration 4:   log likelihood = -225002.47 

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(8)      = 2702958.36

                                                  Prob > chi2     =     0.0000

Log likelihood = -225002.47                       Pseudo R2       =     0.8573

 

-------------------------------------------------------------------------------------

              count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

--------------------+----------------------------------------------------------------

           meth_num |

  Mexican American  |  -.5911152   .0078375   -75.42   0.000    -.6064764   -.5757541

    Hispanic Other  |  -1.369902   .0103938  -131.80   0.000    -1.390273    -1.34953

Non Hispanic Other  |  -1.729818    .012056  -143.48   0.000    -1.753448   -1.706189

White non Hispanic  |   2.504712   .0048661   514.72   0.000     2.495175     2.51425

                    |

           feth_num |

  Mexican American  |  -.5705308   .0079658   -71.62   0.000    -.5861435    -.554918

    Hispanic Other  |  -1.330005   .0104668  -127.07   0.000    -1.350519    -1.30949

Non Hispanic Other  |   -1.46366   .0110428  -132.54   0.000    -1.485303   -1.442016

White non Hispanic  |   2.551713   .0049699   513.43   0.000     2.541972    2.561454

                    |

              _cons |   8.028738   .0065777  1220.60   0.000     8.015846     8.04163

-------------------------------------------------------------------------------------

 

. poisgof

 

         Deviance goodness-of-fit =  449774.9

         Prob > chi2(16)          =    0.0000

 

         Pearson goodness-of-fit  =   1176372

         Prob > chi2(16)          =    0.0000

 

. predict indep_model_class

(option n assumed; predicted number of events)

 

. table meth_num feth_num, contents (sum count sum indep_model_class ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |      42521         291         412         393        2064       45681

                    |   3067.867    1734.036    811.3774    709.8674    39357.85       45681

                    |

   Mexican American |         94       18088         612         433        6067       25294

                    |   1698.707    960.1523    449.2673    393.0603    21792.81       25294

                    |

     Hispanic Other |        310         633        5901         258        4507       11609

                    |   779.6429     440.674    206.1969       180.4    10002.09       11609

                    |

 Non Hispanic Other |        101         317         214        3509        3959        8100

                    |   543.9838    307.4734    143.8707    125.8713    6978.801        8100

                    |

 White non Hispanic |        615        5338        4403        5505      543276      559137

                    |    37550.8    21224.66    9931.288    8688.801    481741.4      559137

                    |

              Total |      43641       24667       11542       10098      559873      649821

                    |      43641       24667       11542       10098      559873      649821

--------------------------------------------------------------------------------------------

* The independence model fits the marginals, that is, the row and column totals, and therefore also the grand total.
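* Because the independence model reproduces the marginals, its fitted counts also have a closed form: row total * column total / N. A minimal sketch that checks this against the Poisson fit, assuming count and indep_model_class from this session; row_total, col_total, and indep_by_hand are hypothetical names.

```stata
* Row (husband) and column (wife) totals of the observed counts.
bysort meth_num: egen double row_total = total(count)
bysort feth_num: egen double col_total = total(count)

* Grand total, then the closed-form independence fit: row * column / N.
quietly summarize count
generate double indep_by_hand = row_total * col_total / r(sum)

* Put the cells back in table order and compare with the Poisson fit.
sort meth_num feth_num
list meth_num feth_num indep_model_class indep_by_hand, sepby(meth_num) noobs
```

* For example, the black/black cell is 45681 * 43641 / 649821, which is about 3067.87, the value shown in the table above.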

 

 

 

. gen ID_indep_class=(50/649821)*(abs(count- indep_model_class ))

 

. table meth_num feth_num, contents (sum ID_indep_class ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |   3.035692    .1110334    .0307298    .0243811    2.869548    6.071385

   Mexican American |    .123473     1.31789    .0125213    .0030731    1.210011    2.666968

     Hispanic Other |   .0361363    .0147984    .4381824    .0059709    .4228154    .9179034

 Non Hispanic Other |   .0340851     .000733     .005396    .2603123    .2323564    .5328828

 White non Hispanic |   2.841998    1.222388    .4253701    .2449752    4.734732    9.469463

                    |

              Total |   6.071385    2.666842    .9121997    .5387127    9.469463     19.6586

--------------------------------------------------------------------------------------------

 

. gen byte endogamy_diagonal=0

 

. replace endogamy_diagonal=1 if meth_num== feth_num

(5 real changes made)

 

. table meth_num feth_num, contents (mean endogamy_diagonal ) cellwidth(10)

 

note: cellwidth too small, variable name truncated;

      to increase cellwidth, specify cellwidth(#)

 

--------------------------------------------------------------------------------

husband's           |                   wife's race/ethnicity                  

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non

--------------------+-----------------------------------------------------------

Black, non Hispanic |          1           0           0           0           0

   Mexican American |          0           1           0           0           0

     Hispanic Other |          0           0           1           0           0

 Non Hispanic Other |          0           0           0           1           0

 White non Hispanic |          0           0           0           0           1

--------------------------------------------------------------------------------

 

3) Adding one term for the endogamy diagonal

 

. poisson count i.meth_num i.feth_num endogamy_diagonal

 

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(9)      = 3129502.75

                                                  Prob > chi2     =     0.0000

Log likelihood =  -11730.28                       Pseudo R2       =     0.9926

 

-------------------------------------------------------------------------------------

              count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

--------------------+----------------------------------------------------------------

           meth_num |

  Mexican American  |   -.391273    .011867   -32.97   0.000    -.4145319    -.368014

    Hispanic Other  |  -.8877092   .0141065   -62.93   0.000    -.9153574    -.860061

Non Hispanic Other  |  -1.321856   .0156512   -84.46   0.000    -1.352532    -1.29118

White non Hispanic  |   1.233632    .008495   145.22   0.000     1.216982    1.250282

                    |

           feth_num |

  Mexican American  |  -.2689479   .0120762   -22.27   0.000    -.2926167    -.245279

    Hispanic Other  |  -.6892557   .0142621   -48.33   0.000    -.7172089   -.6613025

Non Hispanic Other  |  -.5729032   .0146367   -39.14   0.000    -.6015907   -.5442157

White non Hispanic  |   1.467609   .0086021   170.61   0.000     1.450749    1.484469

                    |

  endogamy_diagonal |   3.208822   .0058359   549.84   0.000     3.197384     3.22026

              _cons |   7.298074   .0068811  1060.60   0.000     7.284587    7.311561

-------------------------------------------------------------------------------------

 

. display exp(3.2088)

24.749369

 

* How to interpret the coefficients: for the endogamy diagonal, having the same race/ethnicity as one's spouse raises the log count by a highly significant 3.2, net of the husband's and wife's marginal distributions. Exponentiating, being on the endogamy diagonal multiplies the expected count by a factor of 24.7, or in Stata's language by an incidence-rate ratio (IRR) of 24.7.
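* A short sketch of how to get the exponentiated coefficients without typing them in by hand. It assumes the model with the single endogamy_diagonal term is still the most recent estimation in memory.

```stata
* Replay the last model (model 3) with exponentiated coefficients.
poisson, irr

* The same factor for the diagonal term alone, from the stored estimate.
display exp(_b[endogamy_diagonal])
```

* Using _b[] keeps full precision, so the result can differ in the last digits from exponentiating the rounded 3.2088.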

 

 

. poisgof

 

         Deviance goodness-of-fit =  23230.56

         Prob > chi2(15)          =    0.0000

 

         Pearson goodness-of-fit  =  20603.22

         Prob > chi2(15)          =    0.0000

 

. gen byte endogamy_diagonal_cat=0

 

. replace endogamy_diagonal_cat= meth_num if meth_num== feth_num

(5 real changes made)

 

. predict endogamy_dichotomous

(option n assumed; predicted number of events)

 

 

. gen ID_endogamy_dichotomous=(50/649821)*(abs(count- endogamy_dichotomous ))

 

. table meth_num feth_num, contents (sum ID_endogamy_dichotomous ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |   .4581418    .0644826    .0253613    .0338643    .3344336    .9162835

   Mexican American |   .0696381    .0621266    .0085046    .0100295    .1332898    .2835886

     Hispanic Other |   .0229383    .0129488    .1272461    .0065332    .1437687    .3134351

 Non Hispanic Other |   .0225406    .0012274     .001251    .1530409    .1731031     .351163

 White non Hispanic |   .3430248    .1124331    .1428519    .2034678     .115729    .9175065

                    |

              Total |   .9162836    .2532186    .3052149    .4069357    .9003242    2.781977

--------------------------------------------------------------------------------------------

* The fit by ID is much better now: the model misclassifies only 2.8% of all cases.

 

4) Quasi-Independence, or independence plus a separate term for each cell on the endogamy diagonal.

 

. poisson count i.meth_num i.feth_num i.endogamy_diagonal_cat

 

 

Poisson regression                                Number of obs   =         25

                                                  LR chi2(13)     = 3151666.22

                                                  Prob > chi2     =     0.0000

Log likelihood = -648.54411                       Pseudo R2       =     0.9996

 

---------------------------------------------------------------------------------------

                count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

----------------------+----------------------------------------------------------------

             meth_num |

    Mexican American  |    .899968   .0213908    42.07   0.000     .8580427    .9418932

      Hispanic Other  |   .6519274   .0222121    29.35   0.000     .6083924    .6954623

  Non Hispanic Other  |   .4459202   .0231596    19.25   0.000     .4005282    .4913121

  White non Hispanic  |   2.971213   .0235674   126.07   0.000     2.925022    3.017404

                      |

             feth_num |

    Mexican American  |   1.829598    .032363    56.53   0.000     1.766168    1.893028

      Hispanic Other  |   1.653511   .0327399    50.50   0.000     1.589342     1.71768

  Non Hispanic Other  |   1.794394   .0323429    55.48   0.000     1.731004    1.857785

  White non Hispanic  |   3.995454   .0336881   118.60   0.000     3.929427    4.061482

                      |

endogamy_diagonal_cat |

                   1  |   6.873631   .0370393   185.58   0.000     6.801036    6.946227

                   2  |   3.289316   .0229991   143.02   0.000     3.244239    3.334393

                   3  |   2.593316   .0262895    98.64   0.000      2.54179    2.644843

                   4  |    2.13865   .0286843    74.56   0.000      2.08243     2.19487

                   5  |   2.454583   .0192917   127.24   0.000     2.416772    2.492394

                      |

                _cons |   3.784122   .0367204   103.05   0.000     3.712151    3.856093

---------------------------------------------------------------------------------------

 

* Note how different the endogamy diagonal terms are. Non-Hispanic blacks (category 1) have by far the largest term and are the most endogamous, that is, the most likely to be married to someone from the same group. Below we test the difference between two of these terms (category 2, Mexican American, versus category 5, non-Hispanic white), and the difference is highly significant, as it should be. Adding 4 terms to model 3 reduced the deviance by about 22,000 (from 23,230.56 to 1,067.09) on 4 df.
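* The improvement in fit from model 3 to model 4 can also be tested formally with a likelihood ratio test, since model 3 is model 4 with the five diagonal terms constrained to be equal. A hedged sketch, assuming the variables from this session; the stored-estimate names m3_diag and m4_quasi are hypothetical.

```stata
* Refit model 3 (one common diagonal term) and store it.
quietly poisson count i.meth_num i.feth_num endogamy_diagonal
estimates store m3_diag

* Refit model 4 (a separate term per diagonal cell) last, so that it
* remains the active estimation for any commands that follow.
quietly poisson count i.meth_num i.feth_num i.endogamy_diagonal_cat
estimates store m4_quasi

* Likelihood ratio test of the restricted model against the fuller one.
lrtest m3_diag m4_quasi
```

* From the two log likelihoods in this log, the statistic should be 2*(11730.28 - 648.54411), about 22,163 on 4 df.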

 

. poisgof

 

         Deviance goodness-of-fit =  1067.088

         Prob > chi2(11)          =    0.0000

 

         Pearson goodness-of-fit  =  1294.682

         Prob > chi2(11)          =    0.0000

 

. table meth_num feth_num, contents (mean endogamy_diagonal_cat ) cellwidth(10)

 

note: cellwidth too small, variable name truncated;

      to increase cellwidth, specify cellwidth(#)

 

--------------------------------------------------------------------------------

husband's           |                   wife's race/ethnicity                  

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non

--------------------+-----------------------------------------------------------

Black, non Hispanic |          1           0           0           0           0

   Mexican American |          0           2           0           0           0

     Hispanic Other |          0           0           3           0           0

 Non Hispanic Other |          0           0           0           4           0

 White non Hispanic |          0           0           0           0           5

--------------------------------------------------------------------------------

 

. codebook meth_num

 

-----------------------------------------------------------------------------------------------------

meth_num                                                                     husband's race/ethnicity

-----------------------------------------------------------------------------------------------------

 

                  type:  numeric (byte)

                 label:  ethnicity

 

                 range:  [1,5]                        units:  1

         unique values:  5                        missing .:  0/25

 

            tabulation:  Freq.   Numeric  Label

                             5         1  Black, non Hispanic

                             5         2  Mexican American

                             5         3  Hispanic Other

                             5         4  Non Hispanic Other

                             5         5  White non Hispanic

 

 

. test 2.endogamy_diagonal_cat-5.endogamy_diagonal_cat=0

 

 ( 1)  [count]2.endogamy_diagonal_cat - [count]5.endogamy_diagonal_cat = 0

 

           chi2(  1) =  492.45

         Prob > chi2 =    0.0000

 

. lincom 2.endogamy_diagonal_cat-5.endogamy_diagonal_cat

 

 ( 1)  [count]2.endogamy_diagonal_cat - [count]5.endogamy_diagonal_cat = 0

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

         (1) |   .8347328   .0376155    22.19   0.000     .7610076    .9084579

------------------------------------------------------------------------------

 

. predict quasi_indep_model

(option n assumed; predicted number of events)

 

. gen ID_quasi_indep=(50/649821)*(abs(count- quasi_indep_model ))

 

. table meth_num feth_num, contents (sum ID_quasi_indep ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |          0    .0012956    .0140117    .0098736    .0251809    .0503618

   Mexican American |   .0010935           0    .0035827    .0167726    .0142835    .0357322

     Hispanic Other |   .0173555     .008219           0    .0192346    .0063399     .051149

 Non Hispanic Other |   .0024838    .0085578    .0111633           0    .0172374    .0394423

 White non Hispanic |   .0187457    .0009568    .0064311    .0261336           0    .0522672

                    |

              Total |   .0396785    .0190292    .0351888    .0720144    .0630417    .2289526

--------------------------------------------------------------------------------------------

* Now we are down to an ID of 0.2%. This model fits very well, but because the sample size is so large, the deviance (likelihood ratio) goodness-of-fit test still rejects it (1,067 on 11 df). We would need more terms to fit the off-diagonal cells.
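* One way to see which off-diagonal cells still need attention is to table Pearson residuals, (actual - fitted)/sqrt(fitted). This is only a sketch, assuming count and quasi_indep_model from this session; presid is a hypothetical variable name.

```stata
* Pearson residual per cell; the diagonal cells are fit exactly by
* quasi-independence, so their residuals are (numerically) zero.
generate double presid = (count - quasi_indep_model) / sqrt(quasi_indep_model)

* Lay the residuals out in the same 5-by-5 layout as the tables above.
table meth_num feth_num, contents(sum presid) cellwidth(10)
```

* Large positive residuals mark pairings that occur more often than quasi-independence predicts, for example Hispanic Other husbands with black wives (310 actual versus about 84 fitted).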

 

 

. table meth_num feth_num, contents (sum count sum quasi_indep_model ) row col cellwidth(10)

 

--------------------------------------------------------------------------------------------

husband's           |                         wife's race/ethnicity                        

race/ethnicity      | Black, non  Mexican Am  Hispanic O  Non Hispan  White non        Total

--------------------+-----------------------------------------------------------------------

Black, non Hispanic |      42521         291         412         393        2064       45681

                    |      42521    274.1622    229.8974    264.6786    2391.262       45681

                    |

   Mexican American |         94       18088         612         433        6067       25294

                    |   108.2118       18088    565.4383    650.9836    5881.366       25294

                    |

     Hispanic Other |        310         633        5901         258        4507       11609

                    |   84.44069    526.1821        5901    507.9809    4589.396       11609

                    |

 Non Hispanic Other |        101         317         214        3509        3959        8100

                    |   68.72013    428.2213    359.0829        3509    3734.976        8100

                    |

 White non Hispanic |        615        5338        4403        5505      543276      559137

                    |   858.6274    5350.435    4486.582    5165.357      543276      559137

                    |

              Total |      43641       24667       11542       10098      559873      649821

                    |      43641       24667       11542       10098      559873      649821

--------------------------------------------------------------------------------------------

 

. save "C:\Users\mexmi\Downloads\five cat intermar data US 3 decades.dta", replace

file C:\Users\mexmi\Downloads\five cat intermar data US 3 decades.dta saved

 

. log close

      name:  <unnamed>

       log:  C:\Users\mexmi\Documents\newer web pages\soc_382_stuff\logs\loglin w 5by5 intermar.log

  log type:  text

 closed on:  29 Jan 2019, 16:40:04

-----------------------------------------------------------------------------------------------------