Few-Shot Bayesian Optimization with Deep Kernel Surrogates.

Open Access • Posted Content

- 19 Jan 2021

1

TL;DR: In this article, a deep kernel network is used for a Gaussian process surrogate that is meta-learned in an end-to-end fashion in order to jointly approximate the response functions of a collection of training data sets.

Abstract: Hyperparameter optimization (HPO) is a central pillar in the automation of machine learning solutions and is mainly performed via Bayesian optimization, where a parametric surrogate is learned to approximate the black box response function (e.g. validation error). Unfortunately, evaluating the response function is computationally intensive. As a remedy, earlier work emphasizes the need for transfer learning surrogates which learn to optimize hyperparameters for an algorithm from other tasks. In contrast to previous work, we propose to rethink HPO as a few-shot learning problem in which we train a shared deep surrogate model to quickly adapt (with few response evaluations) to the response function of a new task. We propose the use of a deep kernel network for a Gaussian process surrogate that is meta-learned in an end-to-end fashion in order to jointly approximate the response functions of a collection of training data sets. As a result, the novel few-shot optimization of our deep kernel surrogate leads to new state-of-the-art results at HPO compared to several recent methods on diverse metadata sets.
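The deep kernel surrogate described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: a tiny feature extractor `phi` maps hyperparameter vectors to latent features, an RBF kernel is evaluated on those features, and a standard Gaussian process posterior is computed from a handful of observations. The weights `W` and `B`, the single tanh layer, the RBF kernel, the noise level, and the expected-improvement acquisition are all assumptions chosen for illustration. In the meta-learned setting the feature extractor's weights would be trained across tasks rather than hard-coded.

```python
import math

# Hypothetical fixed weights for a tiny feature extractor phi: R^2 -> R^2.
# In a meta-learned surrogate these would be trained across many tasks;
# they are hard-coded here purely for illustration.
W = [[0.8, -0.5], [0.3, 0.9]]
B = [0.1, -0.2]

def phi(x):
    """One tanh layer mapping a hyperparameter vector to latent features."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W, B)]

def deep_kernel(x1, x2, lengthscale=1.0):
    """RBF kernel evaluated on phi(x) instead of on the raw inputs."""
    z1, z2 = phi(x1), phi(x2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(z1, z2))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X, y, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    n = len(X)
    K = [[deep_kernel(X[i], X[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))      # K^{-1} y
    k_star = [deep_kernel(xi, x_new) for xi in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(deep_kernel(x_new, x_new) - sum(vi * vi for vi in v), 0.0)
    return mean, var

def expected_improvement(mean, var, best):
    """Expected improvement for minimization (e.g. of validation error)."""
    std = math.sqrt(var)
    if std == 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

# A few response evaluations on a new task (made-up numbers).
X = [[0.1, 0.2], [0.5, 0.9], [0.9, 0.4]]
y = [0.30, 0.22, 0.27]
mean, var = gp_posterior(X, y, [0.4, 0.6])
ei = expected_improvement(mean, var, best=min(y))
```

In a Bayesian optimization loop, the candidate with the highest acquisition value would be evaluated next and appended to `X` and `y`. The few-shot aspect comes from starting that loop with a feature extractor already shaped by other tasks, so that only a few evaluations are needed on the new one.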

Citations

•Posted Content

Amortized Auto-Tuning: Cost-Efficient Transfer Optimization for Hyperparameter Recommendation.

Yuxin Xiao, +2 more

- 17 Jun 2021

- arXiv: Learning

TL;DR: In this article, the authors propose a multi-task, multi-fidelity Bayesian optimization framework and identify its best instantiation, amortized auto-tuning (AT2), for hyperparameter recommendation.

3

References

•Proceedings Article

Model-agnostic meta-learning for fast adaptation of deep networks

Chelsea Finn, +2 more

- 06 Aug 2017

TL;DR: The paper proposes a meta-learning algorithm that is model-agnostic: it is compatible with any model trained with gradient descent and applies to a variety of learning problems, including classification, regression, and reinforcement learning.

11.3K

•Journal Article • DOI: 10.2307/1271432

A comparison of three methods for selecting values of input variables in the analysis of output from a computer code

Michael D. McKay, +2 more

- 01 Feb 2000

- Technometrics

TL;DR: In this paper, two sampling plans are examined as alternatives to simple random sampling in Monte Carlo studies and they are shown to be improvements over simple sampling with respect to variance for a class of estimators which includes the sample mean and the empirical distribution function.

10.3K

•Journal Article

Random search for hyper-parameter optimization

James Bergstra, +1 more

- 01 Mar 2012

- Journal of Machine Learning Research

TL;DR: This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid, and shows that random search is a natural baseline against which to judge progress in the development of adaptive (sequential) hyper-parameter optimization algorithms.

9.7K

•Journal Article • DOI: 10.1023/A:1008306431147

Efficient Global Optimization of Expensive Black-Box Functions

Donald R. Jones, +2 more

- 01 Dec 1998

- Journal of Global Optimization

TL;DR: This paper introduces the reader to a response surface methodology that is especially good at modeling the nonlinear, multimodal functions that often occur in engineering and shows how these approximating functions can be used to construct an efficient global optimization algorithm with a credible stopping rule.

7.9K

•Book

Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)

Carl Edward Rasmussen, +1 more

- 01 Dec 2005

TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for supervised-learning problems in both regression and classification.

3.1K

Related Papers (5)

Hyperparameter Optimization with Factorized Multilayer Perceptrons

Adaptive Local Bayesian Optimization Over Multiple Discrete Variables.

Meta-learning for symbolic hyperparameter defaults

HyperSpace: Distributed Bayesian Hyperparameter Optimization

Bayesian optimization with tree-structured dependencies