Open Access • Posted Content
Optuna: A Next-generation Hyperparameter Optimization Framework
TL;DR: New design criteria for next-generation hyperparameter optimization software are introduced, including a define-by-run API that allows users to construct the parameter search space dynamically, and an easy-to-set-up, versatile architecture that can be deployed for various purposes.
Abstract: The purpose of this study is to introduce new design criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) an efficient implementation of both searching and pruning strategies, and (3) an easy-to-set-up, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to lightweight experiments conducted via an interactive interface. To support these criteria, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed with the define-by-run principle, Optuna is the first of its kind. We present the design techniques that became necessary in developing software that meets the above criteria, and demonstrate the power of our new design through experimental results and real-world applications. Our software is available under the MIT license (this https URL).
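The define-by-run idea in criterion (1) can be illustrated with a small, standard-library-only sketch. It is not Optuna's implementation: the `Trial` class, its `suggest_int` and `suggest_float` methods, the `random_search` driver, and the toy objective are all hypothetical names chosen for illustration. The point is only that the search space is not declared up front; it emerges from whichever suggestion calls the objective happens to make, so the number of per-layer parameters can depend on an earlier sampled value.

```python
import random


class Trial:
    """Hypothetical stand-in for a trial object (not Optuna's class)."""

    def __init__(self, rng):
        self._rng = rng
        self.params = {}  # filled in as the objective asks for values

    def suggest_int(self, name, low, high):
        # Sample uniformly from the inclusive integer range [low, high].
        value = self._rng.randint(low, high)
        self.params[name] = value
        return value

    def suggest_float(self, name, low, high):
        # Sample uniformly from the real interval [low, high].
        value = self._rng.uniform(low, high)
        self.params[name] = value
        return value


def objective(trial):
    # The search space is built while this function runs: the number of
    # "n_units_l{i}" parameters depends on the sampled "n_layers".
    n_layers = trial.suggest_int("n_layers", 1, 3)
    units = [trial.suggest_int(f"n_units_l{i}", 4, 128) for i in range(n_layers)]
    lr = trial.suggest_float("lr", 1e-5, 1e-1)
    # Toy score standing in for a validation error (assumption).
    return abs(sum(units) - 100) / 100 + abs(lr - 0.01)


def random_search(objective, n_trials, seed=0):
    """Minimize `objective` by plain random search over fresh trials."""
    rng = random.Random(seed)
    best_value, best_params = float("inf"), None
    for _ in range(n_trials):
        trial = Trial(rng)
        value = objective(trial)
        if value < best_value:
            best_value, best_params = value, dict(trial.params)
    return best_value, best_params


if __name__ == "__main__":
    value, params = random_search(objective, n_trials=20)
    print(value, params)
```

Because each trial records only the parameters it actually sampled, two trials in the same search may carry different sets of parameter names, which is the behavior a statically declared search space cannot express directly.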
Figures

Figure 8: Optuna dashboard. This example shows the online transition of objective values, the parallel coordinates plot of sampled parameters, the learning curves, and the tabular descriptions of investigated trials. 
Figure 7: Distributed optimization in Optuna. Figure (a) is the optimization script executed by one worker. Figure (b) is an example shell for the optimization with multiple workers in a distributed environment. 
Table 1: Software frameworks for deep learning and hyperparameter optimization, sorted by their API styles: define-and-run and define-by-run. 
Table 2: Comparison of previous hyperparameter optimization frameworks and Optuna. A checkmark under "lightweight" indicates that the framework is easy to set up and can readily be used for lightweight purposes.
Figure 12: Distributed hyperparameter optimization process for the minimization of average test errors of a simplified AlexNet on the SVHN dataset. The optimization was done with ASHA pruning.
Figure 11: The transition of average test errors of a simplified AlexNet on the SVHN dataset. Figure (a) illustrates the effect of pruning mechanisms on TPE and random search. Figure (b) illustrates the effect of the number of workers on the performance. Figure (c) plots the test errors against the number of trials for different numbers of workers. Note that the number of workers has no effect on the relation between the number of executed trials and the test error. The result also shows the superiority of ASHA pruning over median pruning.
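The captions above compare pruning mechanisms without spelling out what a pruning rule decides. As a rough illustration only, the sketch below shows one simplified reading of a median-based stopping rule: a running trial is stopped at a given step if its intermediate value is worse than the median of the values earlier trials reported at that same step. The function name `should_prune`, the list-of-curves data layout, and the lower-is-better convention are assumptions made for this example; this is not the exact rule used in the paper's experiments, and ASHA is not shown.

```python
from statistics import median


def should_prune(current_value, step, earlier_curves):
    """Return True if a trial looks unpromising at `step` (lower is better).

    `earlier_curves` is a list of learning curves from earlier trials, each a
    list of intermediate values indexed by step (hypothetical layout).
    """
    # Only compare against trials that actually reached this step.
    peers = [curve[step] for curve in earlier_curves if len(curve) > step]
    if not peers:
        return False  # nothing to compare against yet, so keep training
    return current_value > median(peers)


if __name__ == "__main__":
    curves = [[0.9, 0.5, 0.3], [0.8, 0.6, 0.4], [0.95, 0.7]]
    print(should_prune(0.65, 1, curves))
    print(should_prune(0.55, 1, curves))
```

Stopping unpromising trials early frees workers for new trials, which is why a pruning rule can change how quickly the test error falls for a fixed compute budget.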