Least Squares Auto-Tuning

S. Barratt and S. Boyd

To appear, Engineering Optimization, 2020. First posted April 2019.

In least squares auto-tuning, we automatically find hyper-parameters in least squares problems so as to minimize another (true) objective. The least squares auto-tuning problem is nonconvex, so it cannot be efficiently solved in general. We present a powerful proximal gradient method for least squares auto-tuning, which finds good, if not the best, hyper-parameter values. We describe in detail how the method can be applied to data fitting. Numerical experiments on a classification problem using the MNIST dataset demonstrate the method's effectiveness; in this experiment it roughly halves the test error of standard least squares. The article is accompanied by an open source implementation.
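To make the idea concrete, here is a minimal self-contained sketch (not the authors' implementation). It uses a single ridge regularization weight as the hyper-parameter, a held-out validation loss as the "true" objective, and plain gradient descent with backtracking in place of the paper's proximal gradient method; the synthetic data and all names are illustrative assumptions. The key ingredient is differentiating the least squares solution with respect to the hyper-parameter via implicit differentiation.

```python
import numpy as np

# Hypothetical synthetic data, for illustration only (not the paper's setup).
rng = np.random.default_rng(0)
n, p = 60, 20
theta0 = rng.normal(size=p)
A_tr = rng.normal(size=(n, p)); b_tr = A_tr @ theta0 + 0.5 * rng.normal(size=n)
A_va = rng.normal(size=(n, p)); b_va = A_va @ theta0 + 0.5 * rng.normal(size=n)
I = np.eye(p)

def fit(lam):
    # Ridge least squares solution: theta(lam) = (A'A + lam*I)^{-1} A'b.
    H = A_tr.T @ A_tr + lam * I
    return np.linalg.solve(H, A_tr.T @ b_tr), H

def true_objective(lam):
    # The "true" objective: squared error on held-out validation data.
    theta, _ = fit(lam)
    r = A_va @ theta - b_va
    return r @ r

def gradient(lam):
    # Gradient of the validation loss w.r.t. lam, via implicit
    # differentiation of the normal equations: d theta / d lam = -H^{-1} theta.
    theta, H = fit(lam)
    r = A_va @ theta - b_va
    dtheta = -np.linalg.solve(H, theta)
    return 2.0 * r @ (A_va @ dtheta)

# Tune lam by gradient descent with backtracking, so the loss never increases.
lam = 1.0
loss = true_objective(lam)
for _ in range(50):
    g, t = gradient(lam), 1.0
    while t > 1e-12:
        cand = lam - t * g
        if cand > 0:
            cand_loss = true_objective(cand)
            if cand_loss < loss:
                lam, loss = cand, cand_loss
                break
        t *= 0.5
```

In the paper's more general setting the hyper-parameters can be vector valued and carry a nonsmooth regularizer, which is why a proximal gradient step replaces the plain gradient step used in this sketch.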