-------------------------------------------------------------------------------
log: C:\AAA Miker Files\newer web pages\soc_388_notes\soc_388_2003\class 14.log
log type: text
opened on:
. *We're going to look at some exact tests for odds ratios and independence
.
. csi 23 10 27 16, or
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |        23          10  |         33
        Noncases |        27          16  |         43
-----------------+------------------------+-----------
           Total |        50          26  |         76
                 |                        |
            Risk |       .46    .3846154  |   .4342105
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .0753846       |   -.1571114    .3078807
      Risk ratio |            1.196       |    .6753688    2.117978
 Attr. frac. ex. |         .1638796       |   -.4806724    .5278515
 Attr. frac. pop |         .1142191       |
      Odds ratio |         1.362963       |    .5249006    3.529677 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     0.40  Pr>chi2 = 0.5293
. csi 23 10 27 15, or
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |        23          10  |         33
        Noncases |        27          15  |         42
-----------------+------------------------+-----------
           Total |        50          25  |         75
                 |                        |
            Risk |       .46          .4  |        .44
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |              .06       |   -.1765637    .2965637
      Risk ratio |             1.15       |     .652775    2.025966
 Attr. frac. ex. |         .1304348       |   -.5319213    .5064083
 Attr. frac. pop |         .0909091       |
      Odds ratio |         1.277778       |    .4882768    3.335527 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     0.24  Pr>chi2 = 0.6217
. *That's the frog data.
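
The point estimates and the chi-square that csi prints for the frog table can be checked by hand from the four cell counts. The lines below are a minimal sketch, not part of the original session; the scalar names (n11, n12, n21, n22, or_hat, and so on) are made up for illustration.

```stata
* Cell counts from the frog table (rows: cases, noncases; columns: exposed, unexposed)
scalar n11 = 23
scalar n12 = 10
scalar n21 = 27
scalar n22 = 15
scalar ntot = n11 + n12 + n21 + n22

* Odds ratio is the cross-product ratio; risk ratio compares the two column risks
scalar or_hat = (n11*n22)/(n12*n21)
scalar rr_hat = (n11/(n11 + n21))/(n12/(n12 + n22))

* Pearson chi-square for a 2x2 table, with its p-value on 1 degree of freedom
scalar chi2_hat = ntot*(n11*n22 - n12*n21)^2/((n11 + n12)*(n21 + n22)*(n11 + n21)*(n12 + n22))
scalar p_hat = chi2tail(1, chi2_hat)

display "OR = " or_hat "   RR = " rr_hat "   chi2(1) = " chi2_hat "   p = " p_hat
```

These should agree with the odds ratio, risk ratio, chi2(1), and Pr>chi2 lines in the csi output above, up to rounding.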
. csi 23 10 27 15, or exact
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |        23          10  |         33
        Noncases |        27          15  |         42
-----------------+------------------------+-----------
           Total |        50          25  |         75
                 |                        |
            Risk |       .46          .4  |        .44
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |              .06       |   -.1765637    .2965637
      Risk ratio |             1.15       |     .652775    2.025966
 Attr. frac. ex. |         .1304348       |   -.5319213    .5064083
 Attr. frac. pop |         .0909091       |
      Odds ratio |         1.277778       |    .4882768    3.335527 (Cornfield)
                 +-------------------------------------------------
                                  1-sided Fisher's exact P = 0.4040
                                  2-sided Fisher's exact P = 0.8055
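
Fisher's exact test treats the margins as fixed, so the number of exposed cases follows a hypergeometric distribution. The sketch below (not part of the original session) rebuilds both P values for the frog table with Stata's hypergeometric() and hypergeometricp() functions. It assumes the 1-sided test is the upper tail and that the 2-sided test sums every table no more probable than the observed one; the scalar names are hypothetical.

```stata
* Population of 75 frogs, 33 of them cases; 50 exposed; 23 exposed cases observed
local N   = 75
local K   = 33
local n   = 50
local obs = 23

* 1-sided P: probability of 23 or more exposed cases
scalar p_one = 1 - hypergeometric(`N', `K', `n', `obs' - 1)

* 2-sided P: add up all outcomes whose probability does not exceed the observed one
* (with these margins the exposed-case count can range from 8 to 33)
scalar p_obs = hypergeometricp(`N', `K', `n', `obs')
scalar p_two = 0
forvalues k = 8/33 {
    scalar p_k = hypergeometricp(`N', `K', `n', `k')
    if (p_k <= p_obs*(1 + 1e-7)) {
        scalar p_two = p_two + p_k
    }
}

display "1-sided P = " %6.4f p_one "   2-sided P = " %6.4f p_two
```

If those assumptions match what csi does, the two numbers should match the Fisher's exact lines in the log above.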
. csi 3 4 5 6, or
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |         3           4  |          7
        Noncases |         5           6  |         11
-----------------+------------------------+-----------
           Total |         8          10  |         18
                 |                        |
            Risk |      .375          .4  |   .3888889
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |            -.025       |   -.4774796    .4274796
      Risk ratio |            .9375       |     .290024     3.03046
 Prev. frac. ex. |            .0625       |    -2.03046     .709976
 Prev. frac. pop |         .0277778       |
      Odds ratio |               .9       |     .146097    5.647991 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     0.01  Pr>chi2 = 0.9139
. csi 3 4 5 6, or exact
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |         3           4  |          7
        Noncases |         5           6  |         11
-----------------+------------------------+-----------
           Total |         8          10  |         18
                 |                        |
            Risk |      .375          .4  |   .3888889
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |            -.025       |   -.4774796    .4274796
      Risk ratio |            .9375       |     .290024     3.03046
 Prev. frac. ex. |            .0625       |    -2.03046     .709976
 Prev. frac. pop |         .0277778       |
      Odds ratio |               .9       |     .146097    5.647991 (Cornfield)
                 +-------------------------------------------------
                                  1-sided Fisher's exact P = 0.6478
                                  2-sided Fisher's exact P = 1.0000
. csi 1000 4 5 6, or
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |      1000           4  |       1004
        Noncases |         5           6  |         11
-----------------+------------------------+-----------
           Total |      1005          10  |       1015
                 |                        |
            Risk |  .9950249          .4  |   .9891626
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .5950249       |    .2913574    .8986923
      Risk ratio |         2.487562       |    1.164393    5.314328
 Attr. frac. ex. |             .598       |    .1411833    .8118295
 Attr. frac. pop |         .5956175       |
      Odds ratio |              300       |    68.40492     1337.21 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =   327.02  Pr>chi2 = 0.0000
. csi 1000 4 5 6, or exact
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |      1000           4  |       1004
        Noncases |         5           6  |         11
-----------------+------------------------+-----------
           Total |      1005          10  |       1015
                 |                        |
            Risk |  .9950249          .4  |   .9891626
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .5950249       |    .2913574    .8986923
      Risk ratio |         2.487562       |    1.164393    5.314328
 Attr. frac. ex. |             .598       |    .1411833    .8118295
 Attr. frac. pop |         .5956175       |
      Odds ratio |              300       |    68.40492     1337.21 (Cornfield)
                 +-------------------------------------------------
                                  1-sided Fisher's exact P = 0.0000
                                  2-sided Fisher's exact P = 0.0000
. csi 42012 7146 17216 2361, or
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |     42012        7146  |      49158
        Noncases |     17216        2361  |      19577
-----------------+------------------------+-----------
           Total |     59228        9507  |      68735
                 |                        |
            Risk |  .7093267    .7516567  |   .7151815
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |          -.04233       |   -.0517533   -.0329067
      Risk ratio |         .9436844       |    .9318199       .9557
 Prev. frac. ex. |         .0563156       |       .0443    .0681801
 Prev. frac. pop |         .0485264       |
      Odds ratio |         .8062581       |    .7670994    .8474158 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =    72.06  Pr>chi2 = 0.0000
. *OK, that's enough for exact tests for the moment.
. use "C:\AAA Miker Files\newer web pages\soc_388_notes\ed intermar.dta", clear
. *This is the educational intermarriage data we have seen before.
. tabulate hed wed [fweight=count]
           |                     wed
       hed |         1          2          3          4 |     Total
-----------+--------------------------------------------+----------
         1 |    32,016     33,374      8,407        988 |    74,785
         2 |    28,370    137,876     43,783      8,446 |   218,475
         3 |     7,051     48,766     61,633     18,195 |   135,645
         4 |       984     13,794     28,635     51,224 |    94,637
-----------+--------------------------------------------+----------
     Total |    68,421    233,810    142,458     78,853 |   523,542
. desmat: poisson count hed wed
--------------------------------------------------------------------------------
   Poisson regression
--------------------------------------------------------------------------------
   Dependent variable                                                      count
   Optimization:                                                              ml
   Number of observations:                                                    16
   Initial log likelihood:                                           -221501.223
   Log likelihood:                                                   -113882.425
   LR chi square:                                                     215237.595
   Model degrees of freedom:                                                   6
   Pseudo R-squared:                                                       0.486
   Prob:                                                                   0.000
--------------------------------------------------------------------------------
 nr  Effect                                                    Coeff        s.e.
--------------------------------------------------------------------------------
     count
       hed
  1      2                                                   1.072**       0.004
  2      3                                                   0.595**       0.005
  3      4                                                   0.235**       0.005
       wed
  4      2                                                   1.229**       0.004
  5      3                                                   0.733**       0.005
  6      4                                                   0.142**       0.005
  7    _cons                                                 9.187**       0.005
--------------------------------------------------------------------------------
 *  p < .05
 ** p < .01
. poisgof, pearson
         Goodness-of-fit chi2  =    257372
         Prob > chi2(9)        =    0.0000
. *This is the independence model
. predict indep_count
(option n assumed; predicted number of events)
. table hed wed, contents (sum count sum indep_count) row col
------------------------------------------------------------
          |                       wed
      hed |        1         2         3         4     Total
----------+-------------------------------------------------
        1 |    32016     33374      8407       988     74785
          | 9773.551  33398.43  20349.32   11263.7     74785
          |
        2 |    28370    137876     43783      8446    218475
          |  28552.2  97569.33  59447.98   32905.5    218475
          |
        3 |     7051     48766     61633     18195    135645
          | 17727.26  60578.06  36909.58   20430.1    135645
          |
        4 |      984     13794     28635     51224     94637
          | 12367.98  42264.19  25751.13   14253.7     94637
          |
    Total |    68421    233810    142458     78853    523542
          |    68421    233810    142458     78853    523542
------------------------------------------------------------
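
Under independence, each expected count is just (row total x column total) / grand total, so the second line in each cell of the table above can be checked without fitting anything. A minimal sketch for two of the cells, using the margins printed in the table; the scalar names e11 and e44 are made up for illustration.

```stata
* Cell (hed = 1, wed = 1): row total 74,785, column total 68,421, N = 523,542
scalar e11 = 74785*68421/523542

* Cell (hed = 4, wed = 4): row total 94,637, column total 78,853, N = 523,542
scalar e44 = 94637*78853/523542

display "expected(1,1) = " %9.3f e11 "   expected(4,4) = " %9.1f e44
```

These should match the indep_count values of 9773.551 and 14253.7 in the corner cells.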
. *residuals are simply observed minus expected
. gen indep_residuals=count- indep_count
. table hed wed, contents (sum indep_residuals) row col
-----------------------------------------------------------------
          |                          wed
      hed |         1          2          3          4      Total
----------+------------------------------------------------------
        1 |  22242.45  -24.42969  -11942.32   -10275.7          0
        2 | -182.2031   40306.67  -15664.98   -24459.5  -.0039063
        3 | -10676.26  -11812.06   24723.42    -2235.1  -.0019531
        4 | -11383.98  -28470.19   2883.871    36970.3  -.0019531
          |
    Total |  .0019531  -.0039063  -.0039063  -.0019531  -.0078125
-----------------------------------------------------------------
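. *For all practical purposes, the residuals sum to zero: the row sums, the column sums, and the grand total. The tiny nonzero values are rounding error.
. *And it has to be that way, because the independence model reproduces the observed row and column totals exactly.
. *One way of thinking about the expected standard deviation of the count in each cell is the square root of the expected count (for a Poisson count, the variance equals the mean).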
. *gen sqrt_indep_count= indep_count^.5
. gen sqrt_indep_count= indep_count^.5
. gen std_resid= indep_residuals/ sqrt_indep_count
. table hed wed, contents (sum std_resid) row col
-----------------------------------------------------------------
          |                          wed
      hed |         1          2          3          4      Total
----------+------------------------------------------------------
        1 |  224.9865  -.1336765    -83.717  -96.82131   44.31449
        2 | -1.078291   129.0388  -64.24824  -134.8383  -71.12604
        3 | -80.18597   -47.9919   128.6883   -15.6373   -15.1269
        4 | -102.3634  -138.4854   17.97123   309.6629   86.78525
          |
    Total |   41.3588  -57.57221  -1.305752   62.36596    44.8468
-----------------------------------------------------------------
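
The standardized residual in any one cell is (observed - expected) / sqrt(expected). Here is a small worked check for the cell hed = 1, wed = 1, using the observed count and margins from the tables above; the scalar names are hypothetical.

```stata
* Observed count and independence-model expected count for cell (1,1)
scalar obs11 = 32016
scalar exp11 = 74785*68421/523542

* Raw residual, then divide by the square root of the expected count
scalar resid11 = obs11 - exp11
scalar z11     = resid11/sqrt(exp11)

display "residual = " %10.2f resid11 "   standardized = " %9.4f z11
```

The result should agree with the 22242.45 and 224.9865 shown for that cell.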
. *Standardized residuals (and there are several more sophisticated ways of standardizing) tell you which cells' observed counts are farthest from the counts the model expects.
. gen std_resid_squared= std_resid^2
. table hed wed, contents (sum std_resid_squared) row col
------------------------------------------------------------
          |                       wed
      hed |        1         2         3         4     Total
----------+-------------------------------------------------
        1 | 50618.92  .0178694  7008.537  9374.366  67001.84
        2 | 1.162712  16651.01  4127.836  18181.37  38961.37
        3 | 6429.789  2303.222  16560.67   244.525  25538.21
        4 | 10478.27  19178.21   322.965  95891.09  125870.5
          |
    Total | 67528.14  38132.46  28020.01  123691.4    257372
------------------------------------------------------------
. *Summing the squared standardized (Pearson) residuals gives the Pearson chi-square statistic. Each cell's squared residual tells you how much that cell contributes to the lack of fit.
. save "C:\AAA Miker Files\newer web pages\soc_388_notes\ed intermar.dta", replace
file C:\AAA Miker Files\newer web pages\soc_388_notes\ed intermar.dta saved
. *Note that the sum of the squared Pearson residuals, 257,372, is the Pearson goodness-of-fit statistic for this model.
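
The whole independence-model exercise can be rerun from scratch without the saved dataset or the user-written desmat prefix. The sketch below is not from the original session: it types in the 16 cell counts from the tabulate output, fits the same model with factor-variable notation (i.hed i.wed, which assumes Stata 11 or later), and rebuilds the Pearson statistic from the residuals. The variable names mu, pearson, and pearson_sq are made up for illustration.

```stata
clear
input byte hed byte wed long count
1 1  32016
1 2  33374
1 3   8407
1 4    988
2 1  28370
2 2 137876
2 3  43783
2 4   8446
3 1   7051
3 2  48766
3 3  61633
3 4  18195
4 1    984
4 2  13794
4 3  28635
4 4  51224
end

* Independence model: main effects of husband's and wife's education only
poisson count i.hed i.wed

* Expected counts, Pearson residuals, and the sum of their squares
predict double mu, n
generate double pearson    = (count - mu)/sqrt(mu)
generate double pearson_sq = pearson^2
quietly summarize pearson_sq
scalar x2 = r(sum)

* 16 cells minus 7 estimated parameters leaves 9 degrees of freedom
display "Pearson chi2(9) = " %10.0f x2 "   p = " %6.4f chi2tail(9, x2)

* Built-in check; estat gof replaced poisgof in later Stata releases
estat gof
```

The displayed statistic should come out at about 257,372, the same value poisgof reported above.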
. exit, clear