Sociology 382
Homework 2
Important ideas: independence, degrees of freedom, goodness of fit, odds ratio
Consider the 2 datasets linked from my website:
A) Occupation by
|                    | Race   |           |
| Occupational Class | White  | Non White |
| Other              | 42,012 | 7,146     |
| White Collar       | 17,216 | 2,361     |
and
B) Intermarriage, LA 1990
|                | Wives    |         |            |            |          |
| Husbands:      | NH Black | Mexican | Other Hisp | All Others | NH White |
| Non Hisp Black | 4074     | 63      | 32         | 42         | 215      |
| Mexican        | 25       | 3947    | 143        | 95         | 1009     |
| Other Hispanic | 16       | 132     | 239        | 18         | 304      |
| All Others     | 19       | 78      | 18         | 1022       | 360      |
| Non Hisp White | 103      | 1156    | 373        | 492        | 28453    |
1) For dataset A, calculate the log odds ratio and the standard error of the log odds ratio, using Excel. Is the log odds ratio significantly different from zero? What does that mean about the association between race and occupational class in
2) For dataset A, how is the log odds ratio for non-White representation in the White collar sector related to the log odds ratio for White representation in the White collar sector?
3) For BOTH dataset A and B, use Excel to generate the '
4) When non-statisticians talk about over-representation and under-representation, they frequently talk in terms of observed and expected percentages. Use the 'Independence Model' (see Question 3) to generate the expected percentages of non-Whites and Whites in White Collar jobs. Then divide the observed percentage by the expected percentage to get a crude measure of over- or under-representation. How can you compare the measure for Whites and non-Whites? Can you think of any reasons why this method is less satisfactory than the odds ratio method?
5) Use Stata to generate the "
6) Using Excel and dataset A, find the log odds ratio of White representation in White Collar jobs, from the predicted values of the
7) Use Stata to generate the 'saturated' model for dataset A, which is simply the "
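
As a cross-check on the spreadsheet arithmetic for dataset A (Questions 1, 2 and 4), the quantities can be sketched in a few lines of Python. This is only an illustration, not the required Excel or Stata workflow. It assumes the usual large-sample standard error for a log odds ratio (the square root of the sum of the reciprocal cell counts), which the handout does not spell out, and all variable and function names are made up for this example.

```python
import math

# Dataset A, copied from the table above: counts[occupational class][race]
counts = {
    "White Collar": {"White": 17216, "Non White": 2361},
    "Other": {"White": 42012, "Non White": 7146},
}


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and standard error for the 2x2 table [[a, b], [c, d]].

    Assumption: the large-sample standard error sqrt(1/a + 1/b + 1/c + 1/d).
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


a = counts["White Collar"]["White"]
b = counts["White Collar"]["Non White"]
c = counts["Other"]["White"]
d = counts["Other"]["Non White"]

# White representation in White Collar jobs
log_or_white, se = log_odds_ratio(a, b, c, d)
# Swapping the race columns gives non-White representation
log_or_nonwhite, se_nonwhite = log_odds_ratio(b, a, d, c)
# Ratio of the estimate to its standard error; compare with 1.96 for a 5% test
z = log_or_white / se

# Expected counts under independence: row total * column total / grand total
n = a + b + c + d
row_totals = {occ: sum(cells.values()) for occ, cells in counts.items()}
col_totals = {race: counts["White Collar"][race] + counts["Other"][race]
              for race in ("White", "Non White")}
expected = {occ: {race: row_totals[occ] * col_totals[race] / n
                  for race in col_totals}
            for occ in counts}

# Crude representation measure: observed / expected in White Collar jobs.
# Dividing counts gives the same number as dividing the percentages.
ratio = {race: counts["White Collar"][race] / expected["White Collar"][race]
         for race in col_totals}

print(f"log OR (White): {log_or_white:.4f}, SE: {se:.4f}, z: {z:.2f}")
print(f"log OR (Non White): {log_or_nonwhite:.4f}")
print({race: round(value, 3) for race, value in ratio.items()})
```

The two log odds ratios differ only in sign and share one standard error, whereas the two observed/expected ratios are not simple mirror images of each other. Comparing the printed values may be a useful starting point for Questions 2 and 4.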