Open Access Proceedings Article
NAS-Bench-101: Towards Reproducible Neural Architecture Search
Chris Ying, Aaron Klein, Eric Christiansen, Esteban Real, Kevin Murphy, Frank Hutter
- 24 May 2019
- pp 7105-7114
TL;DR: This work introduces NAS-Bench-101, the first public architecture dataset for NAS research, which allows researchers to evaluate the quality of a diverse range of models in milliseconds by querying the pre-computed dataset.
Abstract: Recent advances in neural architecture search (NAS) demand tremendous computational resources, which makes it difficult to reproduce experiments and imposes a barrier-to-entry to researchers without access to large-scale computation. We aim to ameliorate these problems by introducing NAS-Bench-101, the first public architecture dataset for NAS research. To build NAS-Bench-101, we carefully constructed a compact, yet expressive, search space, exploiting graph isomorphisms to identify 423k unique convolutional architectures. We trained and evaluated all of these architectures multiple times on CIFAR-10 and compiled the results into a large dataset of over 5 million trained models. This allows researchers to evaluate the quality of a diverse range of models in milliseconds by querying the pre-computed dataset. We demonstrate its utility by analyzing the dataset as a whole and by benchmarking a range of architecture optimization algorithms.
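The two ideas in the abstract, collapsing isomorphic architectures into one entry and answering evaluations by table lookup, can be illustrated with a minimal sketch. This is not the authors' implementation or the NAS-Bench-101 API: the function names `canonical_form`, `build_table`, and `query` are hypothetical, the canonical form is found by brute force over vertex orderings (practical only for very small cells), and the accuracy values are made up for illustration.

```python
from itertools import permutations


def canonical_form(matrix, ops):
    """Return a hashable canonical form of a small labeled DAG cell.

    matrix: square 0/1 adjacency matrix (list of lists); vertex 0 is the
            input and the last vertex is the output.
    ops:    one operation label per vertex.
    Brute force: try every ordering of the interior vertices and keep the
    lexicographically smallest (labels, edges) pair.
    """
    n = len(ops)
    best = None
    for perm in permutations(range(1, n - 1)):
        order = (0, *perm, n - 1)  # input and output stay fixed
        new_ops = tuple(ops[v] for v in order)
        new_edges = tuple(
            tuple(matrix[order[i]][order[j]] for j in range(n))
            for i in range(n)
        )
        candidate = (new_ops, new_edges)
        if best is None or candidate < best:
            best = candidate
    return best


def build_table(records):
    """Group (matrix, ops, accuracy) records by canonical form."""
    table = {}
    for matrix, ops, accuracy in records:
        table.setdefault(canonical_form(matrix, ops), []).append(accuracy)
    return table


def query(table, matrix, ops):
    """Look up the stored accuracies for a cell; no training is run."""
    return table[canonical_form(matrix, ops)]


# Two parallel branches; cells A and B differ only in branch order.
branch = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
ops_a = ["input", "conv3x3", "maxpool3x3", "output"]
ops_b = ["input", "maxpool3x3", "conv3x3", "output"]

# A sequential chain with the same operations is a different architecture.
chain = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]

# Made-up accuracies, for illustration only.
table = build_table([
    (branch, ops_a, 0.91),
    (branch, ops_b, 0.92),
    (chain, ops_a, 0.89),
])
```

Here `query(table, branch, ops_b)` returns the results recorded for both orderings of the branch cell, since they share one canonical form, while the chain cell keeps its own entry.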
Citations
•Posted Content
Exploring the Loss Landscape in Neural Architecture Search
TL;DR: In this article, the authors show that the simple hill-climbing algorithm is a powerful baseline for NAS, and when the noise in popular NAS benchmark datasets is reduced to a minimum, hill climbing outperforms many popular state-of-the-art algorithms.
•Posted Content
iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients
TL;DR: This paper proposes a stochastic hypergradient approximation for differentiable NAS that depends only on the obtained solution to the inner-loop optimization and is agnostic to the optimization path, and shows theoretically that architecture optimization with the proposed method, named iDARTS, is expected to converge to a stationary point.
•Posted Content
Towards Green Automated Machine Learning: Status Quo and Future Directions
Tanja Tornede, Alexander Tornede, Jonas Manuel Hanselle, Marcel Dominik Wever, Felix Mohr, Eyke Hüllermeier
TL;DR: In this paper, the authors identify four categories of actions the community may take towards more sustainable research on AutoML, namely approach design, benchmarking, research incentives, and transparency.
EdgeTune: Inference-Aware Multi-Parameter Tuning
Isabelly Rocha, P. Felber, Valerio Schiavoni, Lydia Y. Chen
- 07 Nov 2022
TL;DR: This paper proposes a novel one-fold tuning algorithm that employs the principle of multi-fidelity and simultaneously explores multiple tuning budgets, which prior art can handle only as the suboptimal case of a single type of budget.
•Posted Content
GPNAS: A Neural Network Architecture Search Framework Based on Graphical Predictor
Dige Ai, Hong Zhang
TL;DR: In this article, the authors propose a framework that decouples the network structure from the operator search space and uses two BOHBs to search alternately, which not only improves search efficiency but also addresses the curse of dimensionality.
References
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
ImageNet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
- 20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
•Proceedings Article
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le
- 08 Dec 2014
TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
•Posted Content
Decoupled Weight Decay Regularization
Ilya Loshchilov, Frank Hutter
TL;DR: This work proposes a simple modification to recover the original formulation of weight decay regularization by decoupling the weight decay from the optimization steps taken w.r.t. the loss function, and provides empirical evidence that this modification substantially improves Adam's generalization performance.
•Journal Article
Random search for hyper-parameter optimization
James Bergstra, Yoshua Bengio
TL;DR: This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid, and shows that random search is a natural baseline against which to judge progress in the development of adaptive (sequential) hyper-parameter optimization algorithms.