A modified Adam algorithm for deep neural network optimization
TL;DR: This article proposes HN_Adam, a modified version of the Adam algorithm that improves the generalization performance of deep neural networks by adjusting the step size of the parameter updates over the training epochs.
Abstract: Deep Neural Networks (DNNs) are widely regarded as the most effective learning tool for dealing with large datasets, and they have been successfully used in thousands of applications in a variety of fields. They are trained on these large datasets to learn the relationships between variables. The adaptive moment estimation (Adam) algorithm, a highly efficient adaptive optimization algorithm, is widely used to train DNN models in various fields. However, its generalization performance needs improvement, especially when training on large-scale datasets. Therefore, in this paper, we propose HN_Adam, a modified version of the Adam algorithm, to improve its accuracy and convergence speed. HN_Adam automatically adjusts the step size of the parameter updates over the training epochs. This automatic adjustment is based on the norm value of the parameter update formula, according to the gradient values obtained during the training epochs. Furthermore, a hybrid mechanism is created by combining the standard Adam algorithm and the AMSGrad algorithm. As a result of these changes, the HN_Adam algorithm has good generalization performance, like the stochastic gradient descent (SGD) algorithm, and achieves fast convergence, like other adaptive algorithms. To test its performance, the proposed HN_Adam algorithm is used to train a deep convolutional neural network (CNN) model that classifies images from two standard datasets: MNIST and CIFAR-10. The results are compared to the basic Adam algorithm and the SGD algorithm, in addition to five other recent adaptive SGD algorithms. In most comparisons, the HN_Adam algorithm outperforms the compared algorithms in terms of accuracy and convergence speed. AdaBelief is the most competitive of the compared algorithms.
In terms of testing accuracy and convergence speed (represented by the consumed training time), the HN_Adam algorithm improves on the AdaBelief algorithm by 1.0% and 0.29%, respectively, for the MNIST dataset, and by 0.93% and 1.68%, respectively, for the CIFAR-10 dataset.
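For orientation, the two building blocks that the abstract says HN_Adam combines, standard Adam and AMSGrad, can be sketched as follows. This is a minimal illustration of those two well-known update rules for a scalar parameter, not the HN_Adam rule itself: the abstract does not give the norm-based step-size adjustment or the exact hybrid mechanism, so both are left out. The names `adam_step`, `new_state`, and `minimize` are hypothetical helpers introduced only for this sketch, and the hyperparameter defaults are the commonly used Adam defaults rather than values taken from the paper.

```python
import math


def new_state():
    """Fresh optimizer state for one scalar parameter."""
    return {"t": 0, "m": 0.0, "v": 0.0, "v_max": 0.0}


def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, amsgrad=False):
    """One update of standard Adam, or of AMSGrad when amsgrad=True.

    Not HN_Adam: the norm-based step-size adjustment described in the
    abstract is not specified there and is deliberately omitted.
    """
    state["t"] += 1
    t = state["t"]

    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad

    # Bias-corrected first moment.
    m_hat = state["m"] / (1 - beta1 ** t)

    if amsgrad:
        # AMSGrad: use the running maximum of the second moment, so the
        # raw second-moment estimate in the denominator never decreases.
        # Bias correction is applied here as many libraries do; the
        # original AMSGrad formulation omits it.
        state["v_max"] = max(state["v_max"], state["v"])
        v_hat = state["v_max"] / (1 - beta2 ** t)
    else:
        v_hat = state["v"] / (1 - beta2 ** t)

    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


def minimize(grad_fn, x0, steps, **kwargs):
    """Run `steps` updates from x0, where grad_fn returns the gradient at x."""
    x, state = x0, new_state()
    for _ in range(steps):
        x = adam_step(x, grad_fn(x), state, **kwargs)
    return x


if __name__ == "__main__":
    # Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
    def grad(x):
        return 2.0 * (x - 3.0)

    print("Adam:   ", minimize(grad, 0.0, 500, lr=0.1))
    print("AMSGrad:", minimize(grad, 0.0, 500, lr=0.1, amsgrad=True))
```

Switching `amsgrad` on or off changes only how the second-moment term in the denominator is formed, which is the part a hybrid of the two methods would have to choose between.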
Citations
Large Language Models in Education: Vision and Opportunities
Wensheng Gan, Zhenlian Qi, Jiayang Wu, Chun-Wei Lin
TL;DR: This article aims to investigate and summarize the application of LLMs in smart education, and provides guidance and insights for educators, researchers, and policy-makers to gain a deep understanding of the potential and challenges of LLM4Edu.
39
Large Language Models in Education: Vision and Opportunities
Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun‐Wei Lin
- 15 Dec 2023
TL;DR: Large language models in education offer personalized learning, intelligent tutoring, and educational assessment opportunities, improving the quality of education and learning experience.
23
The WuC-Adam algorithm based on joint improvement of Warmup and cosine annealing algorithms.
Can Zhang, Yichuan Shao, Haijing Sun, Lei Xing, Qian Zhao, Le Zhang
- 01 Jan 2024
TL;DR: This study introduces WuC-Adam, an enhanced Adam optimization algorithm integrating Warmup and cosine annealing techniques to alleviate local optima, overfitting, and convergence issues, achieving significant improvements in model convergence speed and generalization performance on MNIST, CIFAR10, and CIFAR100 datasets.
16
Attention to Monkeypox: An Interpretable Monkeypox Detection Technique Using Attention Mechanism
Avi Deb Raha, Mrityunjoy Gain, Rameswar Debnath, Apurba Adhikary, Yu Qiao, Anupam Kumar Bairagi, Sheikh Mohammed Shariful Islam
TL;DR: An attention-based MobileNetV2 model for monkeypox detection, capitalizing on the inherent lightweight design of MobileNetV2 for effective deployment on edge devices, is proposed, and demonstrates impressive results.
12
Reconfigurable in-sensor processing based on a multi-phototransistor–one-memristor array
Bingjie Dang, Teng Zhang, Xulei Wu, Keqin Liu, Ru Huang, Yuchao Yang
12