An LP-based hyperparameter optimization model for language modeling
TL;DR: This paper proposes a fractional nonlinear programming model to find the optimal perplexity value for a language model. To the best of the authors' knowledge, it is the first attempt to use optimization techniques to find perplexity values in the language modeling literature.
Abstract: To find hyperparameters for a machine learning model, algorithms such as grid search or random search are used over the space of possible values of the model's hyperparameters. These search algorithms select the solution that minimizes a specific cost function. In language models, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt to use optimization techniques to find perplexity values in the language modeling literature. We apply our model to find hyperparameters of a language model and compare it to the grid search algorithm. Furthermore, we illustrate that it results in lower perplexity values. We perform this experiment on a real-world dataset from SwiftKey to validate our proposed approach.
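The grid search baseline that the abstract compares against can be illustrated with a minimal sketch. It is not the authors' model or experiment: the single hyperparameter `lam` (an interpolation weight between two hypothetical language models), the helper names, and the per-token probabilities are all assumptions made up for illustration. The sketch only shows perplexity used as the cost function that the search minimizes.

```python
import math

def perplexity(probs):
    """Perplexity: exp of the average negative log-probability per token."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

def interpolate(p_a, p_b, lam):
    """Mix two models' per-token probabilities with weight lam on model A."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_a, p_b)]

def grid_search(p_a, p_b, steps=10):
    """Try lam = 0, 1/steps, ..., 1 and keep the value with lowest perplexity."""
    best_lam, best_ppl = None, float("inf")
    for i in range(steps + 1):
        lam = i / steps
        ppl = perplexity(interpolate(p_a, p_b, lam))
        if ppl < best_ppl:
            best_lam, best_ppl = lam, ppl
    return best_lam, best_ppl

# Hypothetical probabilities that two models assign to the same four
# held-out tokens (illustrative values, not data from the paper).
P_A = [0.20, 0.05, 0.40, 0.10]
P_B = [0.10, 0.30, 0.05, 0.20]

best_lam, best_ppl = grid_search(P_A, P_B)
print(f"best lam = {best_lam:.1f}, perplexity = {best_ppl:.3f}")
```

Grid search only evaluates the candidate values on the grid, so its result depends on the grid resolution (`steps` here). That limitation is the kind of gap an optimization-based formulation, such as the one the abstract proposes, aims to close.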
Citations
Improving Academic Conferences – Criticism and Suggestions Utilizing Natural Language Processing
Eyal Eckhaus,Nitza Davidovitch +1 more
TL;DR: In this paper, the authors focus on the evaluation of academic conferences and ways of improving them and examine the effect of age, seniority, and the number of times the respondents had initiated or served as a partner in initiating a conference.
TBR-NER: Research on COVID-19 Text Information Extraction Based on Joint Learning of Topic Recognition and Named Entity Recognition
TL;DR: A joint learning text information extraction method based on topic recognition and named entity recognition to predict the labeled risk areas and epidemic trajectory information in text information and shows that the TBR-NER model has specific sociality and applicability and can help in epidemic prediction, prevention, and control.
A Deep Learning Framework for Coronavirus Disease (COVID-19) Detection in X-Ray Images
Tayyip Ozcan
- 05 May 2020
TL;DR: A grid search and pre-trained model aided convolutional neural network (CNN) model is proposed to detect COVID-19 in X-Ray images and according to the experimental studies, the best results were obtained with the GS and ResNet50 aided model.
References
•Journal Article
Random search for hyper-parameter optimization
James Bergstra,Yoshua Bengio +1 more
TL;DR: This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid, and shows that random search is a natural baseline against which to judge progress in the development of adaptive (sequential) hyper-parameter optimization algorithms.
A neural probabilistic language model
TL;DR: The authors propose to learn a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, which can be expressed in terms of these representations.
•Proceedings Article
Recurrent neural network based language model
Tomas Mikolov,Martin Karafiat,Lukas Burget,Jan Cernocký,Sanjeev Khudanpur +4 more
- 01 Jan 2010
TL;DR: Results indicate that it is possible to obtain around 50% reduction of perplexity by using mixture of several RNN LMs, compared to a state of the art backoff language model.
•Book
Nonlinear Programming: Theory and Algorithms
Mokhtar S. Bazaraa
- 03 Mar 1993
TL;DR: The book is a solid reference for professionals as well as a useful text for students in the fields of operations research, management science, industrial engineering, applied mathematics, and also in engineering disciplines that deal with analytical optimization techniques.
•Proceedings Article
Algorithms for Hyper-Parameter Optimization
James Bergstra,Rémi Bardenet,Yoshua Bengio,Balázs Kégl +3 more
- 12 Dec 2011
TL;DR: This work contributes novel techniques for making response surface models P(y|x) in which many elements of hyper-parameter assignment (x) are known to be irrelevant given particular values of other elements.