Bayesian Optimization and other Bad Ideas for Hyperparameter Optimization

Kevin Jamieson
Postdoctoral fellow, UC Berkeley
Date: Jan. 20th, 2017

Abstract

The performance of machine learning systems depends critically on tuning parameters that are difficult to set by standard optimization techniques. Such “hyperparameters”, including model architecture, regularization, and learning rates, are often tuned in an outer loop by black-box search methods that evaluate performance on a holdout set. We formulate hyperparameter tuning as a pure-exploration problem of deciding how many resources to allocate to particular hyperparameter configurations. I will introduce our Hyperband algorithm for this framework and a theoretical analysis that demonstrates its ability to adapt to unknown convergence rates and to the dependence of the validation loss on the hyperparameters. I will close with several experimental validations of Hyperband, including experiments on training deep networks where Hyperband outperforms state-of-the-art Bayesian optimization methods by an order of magnitude.
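The abstract describes the resource-allocation idea only at a high level. As a rough illustration, here is a minimal sketch of the successive-halving subroutine at the core of Hyperband, with the bracket structure of the full algorithm omitted; the `get_random_config` and `run_then_return_val_loss` callables are hypothetical placeholders, not an interface from the paper or any library.

```python
import math

def successive_halving(get_random_config, run_then_return_val_loss,
                       max_resource=81, eta=3):
    """Sketch of successive halving, the subroutine Hyperband calls within each bracket.

    get_random_config: callable returning a random hyperparameter configuration.
    run_then_return_val_loss: callable(config, resource) -> validation loss after
        training the configuration with the given resource budget (e.g. epochs
        or data-subsample size).
    """
    s = int(math.log(max_resource, eta))       # number of elimination rounds
    n = eta ** s                               # initial number of configurations
    configs = [get_random_config() for _ in range(n)]

    for i in range(s + 1):
        resource = max_resource * eta ** (i - s)   # budget per configuration this round
        losses = [run_then_return_val_loss(c, resource) for c in configs]
        # Keep the best 1/eta fraction of configurations for the next round.
        ranked = [c for _, c in sorted(zip(losses, configs), key=lambda t: t[0])]
        configs = ranked[: max(1, len(configs) // eta)]

    return configs[0]   # best surviving configuration
```

The trade-off this sketch exposes is the one Hyperband addresses: starting with many configurations on small budgets is aggressive if the validation loss converges slowly, while starting with few configurations on large budgets wastes resources if it converges quickly. The full algorithm hedges by running several such brackets with different values of the initial number of configurations.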

Bio

Kevin is a post-doc in the AMP lab at UC Berkeley working with Benjamin Recht. He is interested in the theory and practice of algorithms that sequentially collect data using an adaptive strategy. This includes active learning, multi-armed bandit problems, and stochastic optimization. His work ranges from theory and experiments to open-source machine learning systems. Kevin received his Ph.D. from the ECE department at the University of Wisconsin-Madison, where he was advised by Robert Nowak.