You may discuss homework problems with other students, but you have to prepare the written assignments yourself. Late homework will be penalized 10% per day.

Please combine all your answers, the computer code and the figures into one file, and submit a copy in your dropbox on coursework.

Due March 19, 11:59PM.

Grading scheme: 10 points per question, total of 20.

# Question 1 (Based on RABE 12.3)

The O-rings in the booster rockets used in space launches play an important part in preventing rockets from exploding. Probabilities of O-ring failures are thought to be related to temperature. The data from 23 flights are given in this file.

For each flight we have an indicator of whether or not any O-rings were damaged and the temperature of the launch.

1. Fit a logistic regression, modeling the probability of having any O-ring failures based on the temperature of the launch. Interpret the coefficients in terms of odds ratios.

2. From the fitted model, find the probability of an O-ring failure when the temperature at launch was 31 degrees. This was the temperature forecast for the day of the launch of the fatal Challenger flight on January 28, 1986.

3. Find an approximate 95% confidence interval for the coefficient of temperature in the logistic regression using both `summary` and `confint`. Are the confidence intervals the same? Why or why not?
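
As a reminder of the mechanics only, here is a minimal sketch on simulated data (not the O-ring data). The data frame `sim`, its columns `x` and `y`, and the true coefficients are made up for illustration. It shows how an odds ratio, a predicted probability, and the two kinds of interval can be extracted from a `glm` fit.

```r
# Simulated data (an assumption for illustration, NOT the O-ring data)
set.seed(1)
n <- 200
sim <- data.frame(x = runif(n, 50, 85))
sim$y <- rbinom(n, 1, plogis(10 - 0.15 * sim$x))

fit <- glm(y ~ x, family = binomial, data = sim)

# exp(slope) is the multiplicative change in the odds per one-unit increase in x
odds_ratio <- exp(coef(fit)["x"])

# Predicted probability at a new predictor value
p_hat <- predict(fit, newdata = data.frame(x = 60), type = "response")

# Wald interval: estimate +/- z * SE, built from the summary table
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]
wald_ci <- est + c(-1, 1) * qnorm(0.975) * se

# confint() on a glm profiles the likelihood rather than using the SE
profile_ci <- confint(fit)["x", ]
```

In this sketch `confint.default(fit)` reproduces the Wald interval, which may help when comparing the two approaches.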

# Question 2 (Based on RABE 12.5)

Table 1.12 of the textbook describes variables in a study of health care in 52 health care facilities in New Mexico in the year 1988. The variables collected are:

| Variable | Description |
|---|---|
| RURAL | Is hospital in a rural or non-rural area? |
| BED | Number of beds in facility. |
| MCDAYS | Annual medical in-patient days (hundreds). |
| TDAYS | Annual total patient days (hundreds). |
| PCREV | Annual total patient care revenue (\$100). |
| NSAL | Annual nursing salaries (\$100). |
| FEXP | Annual facilities expenditures (\$100). |
| NETREV | PCREV - NSAL - FEXP |

1. Using a logistic regression model, test the null hypothesis that the measured covariates have no power to distinguish between rural and non-rural facilities. Use level $\alpha=0.05$.

2. Use a model selection technique based on AIC to choose a model that seems to best describe the outcome RURAL based on the measured covariates.

3. Repeat 2. but using BIC instead. Is the model the same?

4. Report estimates of the parameters for the variables in your final model. How are these to be interpreted?

5. Report confidence intervals for the parameters in 4. Do you think you can trust these intervals?
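
The following sketch, again on simulated data rather than the health care data, illustrates one possible way to carry out a global deviance test and AIC/BIC-based selection with `step`. The data frame `sim` and the variables `x1`, `x2`, `x3`, `y` are hypothetical, and backward selection is just one of several reasonable choices.

```r
# Simulated data (an assumption for illustration): only x1 truly matters
set.seed(2)
n <- 120
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * sim$x1))

null_fit <- glm(y ~ 1, family = binomial, data = sim)
full_fit <- glm(y ~ x1 + x2 + x3, family = binomial, data = sim)

# Likelihood ratio (deviance) test of H0: all slopes are zero
dev_drop <- null_fit$deviance - full_fit$deviance
df_drop  <- null_fit$df.residual - full_fit$df.residual
p_value  <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)

# Backward selection: the default penalty k = 2 gives AIC, k = log(n) gives BIC
aic_fit <- step(full_fit, direction = "backward", trace = 0)
bic_fit <- step(full_fit, direction = "backward", k = log(nrow(sim)), trace = 0)

formula(aic_fit)
formula(bic_fit)
```

Because BIC penalizes each parameter more heavily than AIC once `log(n) > 2`, it tends to select models that are no larger than the AIC choice.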

# Question 3

The data set below contains data on a sample of female horseshoe crabs, recording their weight and width along with categorical variables for their color and the size of their spine. We are interested in understanding how the number of male satellites, `satell`, is predicted by these features.

1. Fit a log-linear Poisson regression model with `satell` as outcome and the remaining variables as predictors.

2. Use `step` to build a model in a forward fashion for `satell`, starting with just an intercept.

3. Report estimates of the parameters for the variables in your final model. How are these to be interpreted?

4. Report confidence intervals for the parameters in 3. Do you think you can trust these intervals?

crabs = read.table('http://www.ics.uci.edu/~staceyah/111-202/data/horseshoe.txt', header=TRUE)
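
For the mechanics of forward selection in a Poisson model, a minimal sketch on simulated counts may help. It does not use the crab data: the data frame `sim` and the columns `size`, `group`, `noise`, and `count` are invented for illustration, and the candidate terms in `scope` would need to be replaced by the actual predictors.

```r
# Simulated counts (an assumption for illustration, NOT the crab data)
set.seed(3)
n <- 150
sim <- data.frame(
  size  = rnorm(n, 26, 2),
  group = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  noise = rnorm(n)
)
sim$count <- rpois(n, exp(-3 + 0.15 * sim$size))

# Forward selection: start from the intercept-only model and let step()
# add terms from the upper scope one at a time while AIC improves
start <- glm(count ~ 1, family = poisson, data = sim)
fwd <- step(start,
            scope = list(lower = ~ 1, upper = ~ size + group + noise),
            direction = "forward", trace = 0)

# exp(coefficient) is a multiplicative effect on the expected count
rate_ratios <- exp(coef(fwd))

# Wald intervals on the rate-ratio scale
wald_ci <- exp(confint.default(fwd))

# Rough check of the Poisson variance assumption:
# Pearson chi-square / residual df should be near 1 if the model fits
dispersion <- sum(residuals(fwd, type = "pearson")^2) / df.residual(fwd)
```

A dispersion estimate well above 1 would suggest that intervals based on the Poisson standard errors are too narrow. Intervals reported after a stepwise search also ignore the selection step itself.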