--------------------------------------------------------------------------------------------

       log:  C:\AAA Miker Files\newer web pages\soc_388_notes\Soc_388_2007\second_class_notes.log

  log type:  text

 opened on:  27 Sep 2007, 11:22:49

 

. edit

(1 var, 1 obs pasted into editor)

(1 var, 1 obs pasted into editor)

(1 var, 1 obs pasted into editor)

 

. edit

(3 vars, 4 obs pasted into editor)

- preserve

 

. edit

- preserve

 

. save "C:\AAA Miker Files\newer web pages\soc_388_notes\Soc_388_2007\frogs.dta"

file C:\AAA Miker Files\newer web pages\soc_388_notes\Soc_388_2007\frogs.dta saved

 

. exit, clear

-----------------------------------------------------------------------------------------------

       log:  C:\AAA Miker Files\newer web pages\soc_388_notes\Soc_388_2007\second_class_notes.log

  log type:  text

 opened on:  27 Sep 2007, 12:16:15

 

*Note (all my comments will start with an asterisk): I find it is always better to make Stata logs in the .log format, which is plain text, rather than the .smcl format, which is marked up and can only be read by a Stata-aware viewer.

 

. *Here's a lesson for students: I quit Stata, restarted, and forgot to enable the log again, so here is my redo. The notes will not be word-for-word what I had in class because I have had to retype everything.

. use "C:\AAA Miker Files\newer web pages\soc_388_notes\Soc_388_2007\frogs.dta", clear

 

. *open the data set by copying it from Excel into the Stata data editor, or by opening the data file directly

. *first model, constant only

. poisson count

 

Iteration 0:   log likelihood = -14.328367 

Iteration 1:   log likelihood = -14.328367  (backed up)

 

Poisson regression                                Number of obs   =          4

                                                  LR chi2(0)      =      -0.00

                                                  Prob > chi2     =          .

Log likelihood = -14.328367                       Pseudo R2       =    -0.0000

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

       _cons |   2.931194   .1154701    25.38   0.000     2.704877    3.157511

------------------------------------------------------------------------------

 

. *one term in the model

. poisgof

 

         Goodness-of-fit chi2  =  9.822078

         Prob > chi2(3)        =    0.0201

 

. *4 cells minus 1 fitted term leaves 3 degrees of freedom for goodness of fit, a subject that I will be explaining more next class
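
*A sketch for checking the goodness-of-fit number by hand (added after class, not part of the session above). It assumes poisgof is reporting the deviance, 2 * sum(observed * ln(observed/expected)), with every expected count equal to 75/4 = 18.75; the result should land close to the 9.822 and 0.0201 printed above.

```stata
* deviance of the constant-only model, one term per cell of the 2x2 table
display 2*(23*ln(23/18.75) + 27*ln(27/18.75) + 10*ln(10/18.75) + 15*ln(15/18.75))

* upper-tail chi-square probability on 3 degrees of freedom
display chi2tail(3, 9.822078)
```
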

. *where does this model fit the data?

. predict constant_only

(option n assumed; predicted number of events)

 

. table color live [iweight=count], row col

 

-------------------------------

          |        live       

    Color | Lilly  Water  Total

----------+--------------------

     Blue |    23     27     50

    Green |    10     15     25

          |

    Total |    33     42     75

-------------------------------

 

. table live color [iweight=count], row col

 

-------------------------------

          |        Color      

     live |  Blue  Green  Total

----------+--------------------

    Lilly |    23     10     33

    Water |    27     15     42

          |

    Total |    50     25     75

-------------------------------

 

. *That's our actual dataset

. *now the predicted values

. table live color [iweight= constant_only], row col

 

-------------------------------

          |        Color      

     live |  Blue  Green  Total

----------+--------------------

    Lilly | 18.75  18.75   37.5

    Water | 18.75  18.75   37.5

          |

    Total |  37.5   37.5     75

-------------------------------

 

. *one term: the model gets the total number of frogs right (75) and assumes every cell has the same count (75/4 = 18.75).
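
*A quick check of where 18.75 comes from (a sketch, not run in class): the Poisson model is fit on the log scale, so exponentiating the constant from the first model should give back the common predicted count.

```stata
* exponentiate the constant reported by the constant-only model
display exp(2.931194)

* the same number computed directly: 75 frogs spread evenly over 4 cells
display 75/4
```
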

. *now on to the second model, which is the independence model

 

. set linesize 79

 

*Setting linesize is useful when using desmat because otherwise desmat will fill the results window.

 

. desmat: poisson count live color

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   4

   Initial log likelihood:                                             -14.328

   Log likelihood:                                                      -9.540

   LR chi square:                                                        9.578

   Model degrees of freedom:                                                 2

   Pseudo R-squared:                                                     0.334

   Prob:                                                                 0.008

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     live

1      Water                                                 0.241       0.233

     color

2      Green                                                -0.693**     0.245

3    _cons                                                   3.091**     0.192

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. poisgof

 

         Goodness-of-fit chi2  =  .2445188

         Prob > chi2(1)        =    0.6210

 

. *this model uses 3 terms, has 1 df left over

. predict independence_model

(option n assumed; predicted number of events)

 

. table live color [iweight= count], row col

 

-------------------------------

          |        Color      

     live |  Blue  Green  Total

----------+--------------------

    Lilly |    23     10     33

    Water |    27     15     42

          |

    Total |    50     25     75

-------------------------------

 

. table live color [iweight= independence_model], row col

 

-------------------------------

          |        Color      

     live |  Blue  Green  Total

----------+--------------------

    Lilly |    22     11     33

    Water |    28     14     42

          |

    Total |    50     25     75

-------------------------------

 

. *these are the predicted values of our independence model, just as we calculated them by hand in Excel. Note that the model fits the marginal distributions of color and live exactly.

. *note also that the independence model is fundamentally a multiplicative model. Loglinear models always have a multiplicative interpretation; see my notes.
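
*A sketch of the multiplicative reading (added for illustration, not run in class). Under independence each expected count is row total * column total / N, and the same numbers come back from exponentiating sums of the coefficients printed by desmat. Because those coefficients are rounded to 3 decimals, the second set of results should only be approximately 22, 28, 11 and 14.

```stata
* expected counts from the margins: row total * column total / 75
display 33*50/75
display 42*50/75
display 33*25/75
display 42*25/75

* the same cells from the coefficients: Lilly-Blue is the baseline (_cons only)
display exp(3.091)
* Water-Blue adds the Water effect
display exp(3.091 + 0.241)
* Lilly-Green adds the Green effect
display exp(3.091 - 0.693)
* Water-Green adds both effects, i.e. multiplies both factors
display exp(3.091 + 0.241 - 0.693)
```
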

. *final model:

. desmat: poisson count color*live

-------------------------------------------------------------------------------

   Poisson regression

-------------------------------------------------------------------------------

   Dependent variable                                                    count

   Optimization:                                                            ml

   Number of observations:                                                   4

   Initial log likelihood:                                             -14.328

   Log likelihood:                                                      -9.417

   LR chi square:                                                        9.822

   Model degrees of freedom:                                                 3

   Pseudo R-squared:                                                     0.343

   Prob:                                                                 0.020

-------------------------------------------------------------------------------

nr Effect                                                    Coeff        s.e.

-------------------------------------------------------------------------------

   count

     color

1      Green                                                -0.833*      0.379

     live

2      Water                                                 0.160       0.284

     color.live

3      Green.Water                                           0.245       0.497

4    _cons                                                   3.135**     0.209

-------------------------------------------------------------------------------

*  p < .05

** p < .01

 

. *Key point: the interaction term is the log odds ratio (.245), with the same standard error (.497) that we calculated by hand in Excel.
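
*A sketch of that hand calculation in Stata rather than Excel (added for illustration). It uses the four observed counts and the usual large-sample standard error of a log odds ratio, the square root of the sum of the reciprocal cell counts; the results should be close to .245 and .497.

```stata
* log odds ratio from the observed 2x2 table
display ln((23*15)/(27*10))

* its standard error: sqrt of the summed reciprocals of the four cell counts
display sqrt(1/23 + 1/27 + 1/10 + 1/15)
```
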

. *note that Stata has a built-in way to generate dummy variables, xi, which for simple models works just as well as desmat

. xi: poisson count i.color*i.live

i.color           _Icolor_1-2         (_Icolor_1 for color==Blue omitted)

i.live            _Ilive_1-2          (_Ilive_1 for live==Lilly omitted)

i.color*i.live    _IcolXliv_#_#       (coded as above)

 

Iteration 0:   log likelihood =  -9.417463 

Iteration 1:   log likelihood = -9.4173319 

Iteration 2:   log likelihood = -9.4173319 

 

Poisson regression                                Number of obs   =          4

                                                  LR chi2(3)      =       9.82

                                                  Prob > chi2     =     0.0201

Log likelihood = -9.4173319                       Pseudo R2       =     0.3427

 

------------------------------------------------------------------------------

       count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

-------------+----------------------------------------------------------------

   _Icolor_2 |  -.8329091   .3787852    -2.20   0.028    -1.575315   -.0905037

    _Ilive_2 |   .1603427   .2837522     0.57   0.572    -.3958014    .7164867

_IcolXliv_~2 |   .2451225    .497174     0.49   0.622    -.7293206    1.219566

       _cons |   3.135494   .2085144    15.04   0.000     2.726813    3.544175

------------------------------------------------------------------------------

 

. poisgof

 

         Goodness-of-fit chi2  =  7.95e-06

         Prob > chi2(0)        =         .

 

. *same interaction term, same model.

. *what do the dummy variables look like? take a look at the data browser...
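
*An alternative to the data browser that also gets recorded in the log (a sketch, not run in class). It assumes the xi command above has just been run, so the _I* dummy variables exist in memory.

```stata
* show the four cells alongside the dummies that xi generated
list count color live _I*

* show the names and labels of the generated variables
describe _I*
```
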

. *The log will be automatically updated and saved as you work (if you are smart enough to remember to open it), whereas the dataset will only be saved if you choose to save it.

. *here I have made additions to the frog dataset that I don't particularly care to save.

. exit, clear