Open Access • Posted Content
Constrained Bayesian Optimization with Max-Value Entropy Search
TL;DR: This work proposes constrained Max-value Entropy Search (cMES), a novel information-theoretic acquisition function for Gaussian process-based Bayesian optimization (BO) with continuous or binary constraints, and revisits the validity of the factorized approximation adopted for rapid computation of the MES acquisition function.
Abstract: Bayesian optimization (BO) is a model-based approach to sequentially optimize expensive black-box functions, such as the validation error of a deep neural network with respect to its hyperparameters. In many real-world scenarios, the optimization is further subject to a priori unknown constraints. For example, training a deep network configuration may fail with an out-of-memory error when the model is too large. In this work, we focus on a general formulation of Gaussian process-based BO with continuous or binary constraints. We propose constrained Max-value Entropy Search (cMES), a novel information-theoretic acquisition function implementing this formulation. We also revisit the validity of the factorized approximation adopted for rapid computation of the MES acquisition function, showing empirically that this approximation leads to inaccurate results. On an extensive set of real-world constrained hyperparameter optimization problems we show that cMES compares favourably to prior work, while being simpler to implement and faster than other constrained extensions of Entropy Search.
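For orientation, the sketch below evaluates the standard, unconstrained MES acquisition function (Wang and Jegelka, 2017) at a single candidate point. It is not cMES: the constrained variant is defined in the paper and is not reproduced here. The sketch assumes a maximization convention, a Gaussian posterior with mean `mu` and standard deviation `sigma` at the candidate, and a list of sampled maximum values `max_value_samples`; the function and parameter names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def mes_acquisition(mu, sigma, max_value_samples):
    """Unconstrained MES score at one candidate (illustrative, not cMES).

    mu, sigma: posterior mean and standard deviation at the candidate
        (assumed to come from a fitted Gaussian process).
    max_value_samples: sampled values y* of the global maximum.
    Returns the average entropy reduction of the predictive distribution
    when it is truncated at each sampled maximum.
    """
    total = 0.0
    for y_star in max_value_samples:
        gamma = (y_star - mu) / sigma
        cdf = max(normal_cdf(gamma), 1e-300)  # guard against log(0)
        total += gamma * normal_pdf(gamma) / (2.0 * cdf) - math.log(cdf)
    return total / len(max_value_samples)


# Usage: rank two hypothetical candidates given three sampled maxima.
samples = [1.2, 1.5, 1.1]
candidates = {"a": (0.9, 0.3), "b": (0.4, 0.8)}
scores = {name: mes_acquisition(m, s, samples) for name, (m, s) in candidates.items()}
best = max(scores, key=scores.get)
```

In a BO loop the next evaluation point would be the candidate with the highest score. How cMES incorporates continuous or binary constraint models into this quantity is left to the paper.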
Citations
Recent Advances in Bayesian Optimization
TL;DR: This paper attempts to provide a comprehensive and updated survey of recent advances in Bayesian optimization and identify interesting open problems and promising future research directions.
155
Bayesian optimization of nanoporous materials
Aryan Deshwal,Cory M. Simon,Janardhan Rao Doppa +2 more
- 06 Oct 2021
TL;DR: Bayesian optimization uses a surrogate model and an acquisition function to search a library of nanoporous materials (NPMs) for the optimal one in as few experiments as possible.
68
•Posted Content
Amazon SageMaker Autopilot: a white box AutoML solution at scale
Piali Das,Valerio Perrone,Nikita Ivkin,Tanya Bansal,Zohar Karnin,Huibin Shen,Iaroslav Shcherbatyi,Yotam Elor,Wilton Wu,Aida Zolic,Thibaut Lienart,Alex Tang,Amr Ahmed,Jean Baptiste Faddoul,Rodolphe Jenatton,Fela Winkelmolen,Philip Gautier,Leo Parker Dirac,Andre Perunicic,Miroslav Miladinovic,Giovanni Zappella,Cédric Archambeau,Matthias Seeger,Bhaskar Dutt,Laurence Rouesnel +24 more
TL;DR: The different components in the eco-system of Autopilot are described, emphasizing the infrastructure choices that allow scalability, high quality models, editable ML pipelines, consumption of artifacts of offline meta-learning, and a convenient integration with the entire SageMaker system allowing these trained models to be used in a production setting.
Amazon SageMaker Automatic Model Tuning: Scalable Gradient-Free Optimization
Valerio Perrone,Huibin Shen,Aida Zolic,Iaroslav Shcherbatyi,Amr Ahmed,Tanya Bansal,Michele Donini,Fela Winkelmolen,Rodolphe Jenatton,Jean Baptiste Faddoul,Barbara Pogorzelska,Miroslav Miladinovic,Krishnaram Kenthapadi,Matthias Seeger,Cédric Archambeau +14 more
- 14 Aug 2021
TL;DR: Amazon SageMaker Automatic Model Tuning (AMT) uses either random search or Bayesian optimization to choose the hyperparameter values that yield the best model, as measured by a user-chosen metric. It can be used with built-in algorithms, custom algorithms, and Amazon SageMaker pre-built containers for machine learning frameworks.
•Posted Content
Max-value Entropy Search for Multi-objective Bayesian Optimization with Constraints
TL;DR: A Bayesian optimization method for constrained multi-objective problems in which the objectives and the constraints are expensive to evaluate; its execution time is smaller than that of other information-based methods.
19