Assignment 3

You may discuss homework problems with other students, but you must prepare the written solutions yourself.

Please combine all of your answers, computer code, and figures into one PDF file and submit it to Gradescope.

Grading scheme: 10 points for each textbook problem, 20 points for each of the remaining problems.

Due date: February 18, 2022, 11:59PM.

Questions from Agresti

  • 8.8

  • 8.10

  • 8.17

  • 8.23

  • 9.7

  • 9.8

  • 9.20

  • 9.25

  • 10.19

Logistic regression

For this problem use the zip.train and zip.test data sets that can be found in the ElemStatLearn package or here

  1. Extract the 6’s and the 8’s from both zip.train and zip.test.

  2. Fit a (logistic) LASSO path and a ridge path predicting the class of the digits in your 6 vs. 8 data set. Finally, fit an elastic net path with \(\alpha=0.5\).

  3. Using the values of lambda.1se, use zip.test to evaluate which method is best for this prediction problem in terms of classification accuracy.
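One possible workflow for these three steps is sketched below. The variable names (train, fits, etc.) are my own, and the sketch assumes the ElemStatLearn layout in which column 1 of zip.train and zip.test holds the digit label and the remaining columns hold the pixel features:

```r
library(ElemStatLearn)
library(glmnet)
data(zip.train); data(zip.test)

# Step 1: keep only the 6's and 8's
train = zip.train[zip.train[, 1] %in% c(6, 8), ]
test  = zip.test[zip.test[, 1] %in% c(6, 8), ]
Xtr = train[, -1]; Ytr = as.factor(train[, 1])
Xte = test[, -1];  Yte = as.factor(test[, 1])

# Step 2: alpha = 1 is the LASSO, alpha = 0 is ridge, alpha = 0.5 is elastic net
fits = lapply(c(1, 0, 0.5), function(a)
    cv.glmnet(Xtr, Ytr, family = "binomial", alpha = a))

# Step 3: classification accuracy on zip.test at lambda.1se
sapply(fits, function(f) {
    pred = predict(f, Xte, s = "lambda.1se", type = "class")
    mean(pred == Yte)
})
```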

A deeper dive on LASSO for least squares

In class we saw that for the LASSO problem

\[ \text{minimize}_{\beta} \frac{1}{2} \|Y-X\beta\|^2_2 + \lambda \|\beta\|_1 \]

the KKT conditions were

\[ X'(Y-X\hat{\beta}) = \lambda \hat{u} \]

where \(\hat{u} \in \partial (\|\cdot\|_1)(\hat{\beta})\).
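For reference, when the active set is \(E\) with signs \(s_E\), the subgradient \(\hat{u}\) equals \(s_E\) on \(E\) and lies in \([-1,1]\) coordinatewise off \(E\), so the conditions split blockwise as:

```latex
\begin{aligned}
\text{(active)}\quad   & X_E'\bigl(Y - X_E\hat{\beta}_E\bigr) = \lambda s_E,
  \qquad \mathrm{sign}(\hat{\beta}_E) = s_E,\\
\text{(inactive)}\quad & X_{-E}'\bigl(Y - X_E\hat{\beta}_E\bigr) = \lambda \hat{u}_{-E},
  \qquad \|\hat{u}_{-E}\|_\infty \le 1.
\end{aligned}
```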

To make life / notation a little easier we’ve assumed the intercept is 0.

  1. We wrote out these conditions when the selected variables were \(E\) with signs \(s_E\). The conditions split into two blocks, an active block and an inactive block. Explain these conditions in your own words.

  2. Below, we use glmnet (which modifies the squared error by dividing by \(n\)) to solve the lasso on the prostate data from library(ElemStatLearn). Use the data (X,Y) to verify that beta.hat below satisfies the KKT conditions. (You will in the process construct a vector u.hat.) What should the u.hat vector look like?

  3. Keeping the same variables and signs, describe how to construct a response \(Y\) for which the LASSO selects the same variables with the same signs. Produce a response vector \(Y\) whose beta.hat has the same sparsity pattern for s=0.17 but has \(\hat{\beta}_{lcavol}=0.6\), \(\hat{\beta}_{lweight}=0.15\) and \(\hat{\beta}_{svi}=0.20\). Use glmnet to verify that your response yields such a solution. How would you ensure that your response vector \(Y\) had a specified value of \(\hat{u}_{-E}\)?

library(ElemStatLearn)
data(prostate)
library(glmnet)
X = model.matrix(lm(lpsa ~ lcavol + lweight + age + lbph + svi + lcp + gleason + pgg45, data=prostate))
X = scale(X, TRUE, TRUE)[, 2:ncol(X)]
Y = as.numeric(prostate$lpsa - mean(prostate$lpsa))
G = glmnet(X, Y, intercept=FALSE, standardize=FALSE)
beta.hat = coef(G, s=0.17, exact=TRUE, x=X, y=Y)
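A sketch of the check in part 2 follows (names n, lam, u.hat are illustrative). Since glmnet divides the squared error by \(n\), the conditions to verify read \(X'(Y-X\hat{\beta})/n = \lambda \hat{u}\), and they should hold up to glmnet's convergence tolerance:

```r
# Verify the KKT conditions numerically for the fit above
n   = nrow(X)
lam = 0.17
b   = beta.hat[-1]   # drop the (zero) intercept row

# u.hat solves X'(Y - X b)/n = lam * u.hat
u.hat = as.numeric(t(X) %*% (Y - X %*% b)) / (n * lam)
round(u.hat, 3)

# Active coordinates of u.hat should match sign(b);
# inactive coordinates should lie in [-1, 1]
E = which(b != 0)
abs(u.hat[E] - sign(b[E]))   # small, up to solver tolerance
max(abs(u.hat[-E])) <= 1     # inactive block
```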

Bonus (+20pts): a deeper dive on the LASSO for logistic regression

For logistic regression, the squared error loss is replaced with the logistic negative log-likelihood. The problem is

\[ \text{minimize}_{\beta} - \log L(\beta|Y,X) + \lambda \|\beta\|_1 \]

with KKT conditions

\[ X'(Y - \pi(X\hat{\beta})) = \lambda \hat{u} \]

where \(\hat{u} \in \partial (\|\cdot\|_1)(\hat{\beta})\).
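As in the least-squares case, when the active set is \(E\) with signs \(s_E\) these conditions split into an active and an inactive block:

```latex
\begin{aligned}
\text{(active)}\quad   & X_E'\bigl(Y - \pi(X\hat{\beta})\bigr) = \lambda s_E,
  \qquad \mathrm{sign}(\hat{\beta}_E) = s_E,\\
\text{(inactive)}\quad & X_{-E}'\bigl(Y - \pi(X\hat{\beta})\bigr) = \lambda \hat{u}_{-E},
  \qquad \|\hat{u}_{-E}\|_\infty \le 1.
\end{aligned}
```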

We fit this model to the SAheart data:

data(SAheart)
X = model.matrix(glm(chd ~ ., data=SAheart, family=binomial))[,-1]
X = scale(X, TRUE, TRUE)
Y = SAheart$chd
cvG = cv.glmnet(x=X,
                y=Y,
                intercept=FALSE,
                standardize=FALSE,
                family="binomial")
G = glmnet(x=X, y=Y,
           intercept=FALSE,
           standardize=FALSE,
           family="binomial")
beta.hat = coef(G,
                s=0.043,
                exact=TRUE,
                x=X, y=Y,
                intercept=FALSE,
                standardize=FALSE)
E = which(beta.hat != 0)

To make life / notation a little easier we’ve assumed the intercept is 0.

  1. The LASSO has selected the variables E above at the value \(\lambda=0.043\) (roughly lambda.1se). What do we know about the \(E\) coordinates of \(\nabla (-\log L(\hat{\beta}|X,Y))\)? (Hint: consider the active block of the KKT conditions for the LASSO.)

  2. If we had started with a model containing only the variables E, we could have fit it as M=glm(Y ~ X[,E], family=binomial) using, e.g., Newton-Raphson. Define \(\bar{\beta}_E\) to be the estimator formed by taking one Newton-Raphson step starting from \(\hat{\beta}_E\). Assuming \(E\) had been fixed in advance and \(\hat{\beta}_E\) was already close to the MLE of the model M, roughly what distribution would \(\bar{\beta}_E\) have?

  3. The active block of the KKT conditions, when the LASSO selects variables \(E\) with signs \(s_E\), places a constraint on \(\bar{\beta}_E\). Ignoring the inactive conditions, how might you model the distribution of \(\bar{\beta}_E\) conditional on the LASSO having selected variables \(E\) with signs \(s_E\)? Use this conditional distribution to describe a test of \(H_0:\beta_E=0\) in model M.

  4. Carry out your test for the variables selected on the SAheart data.
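A possible starting point for parts 2 and 4 is computing the one-step estimator itself. The sketch below uses my own variable names, and its E indexes columns of X directly (unlike the coefficient-vector indexing in the code chunk above); it takes one Newton-Raphson step of the unpenalized restricted logistic likelihood from the LASSO solution:

```r
# One Newton-Raphson step for the logistic model restricted to the
# selected variables, starting from the LASSO solution
b  = as.numeric(beta.hat)[-1]   # drop the (zero) intercept row
E  = which(b != 0)              # active set, indexing columns of X
XE = X[, E]
bE = b[E]
p  = as.numeric(1 / (1 + exp(-XE %*% bE)))  # fitted probabilities
W  = p * (1 - p)                            # logistic variance weights
score = t(XE) %*% (Y - p)                   # gradient of the log-likelihood
H     = t(XE) %*% (XE * W)                  # observed information X_E' W X_E
b.bar = bE + as.numeric(solve(H, score))    # one-step estimator
# If E were fixed in advance and bE were near the MLE of model M,
# b.bar would be approximately N(beta_E, solve(H)).
```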