Adversarial Image Caption Generator Network
Ali Mollaahmadi Dehaqi, Vahid Seydi, Yeganeh Madadi
- 01 May 2021
- Vol. 2, Iss: 3, pp 1-14
TL;DR: The authors propose a novel GAN-based model that generates an image caption from the representation of the image using a generative adversarial network, without needing a secondary learning algorithm such as policy gradient.
Abstract: Image captioning is the task of describing an image, which requires recognizing the important attributes in the image and the relationships among them. The generated sentences must be semantically and syntactically correct. Most image captioning models are based on RNNs and MLE training. We propose instead a novel GAN-based model that generates the caption from the representation of the image using a generative adversarial network, and it does not need a secondary learning algorithm such as policy gradient. Because benchmark datasets such as Flickr and COCO are large and complex, we introduce a new dataset and perform the experiments on it. The experimental results show the effectiveness of our model compared to state-of-the-art image captioning methods.
Citations
Deep image captioning: A review of methods, trends and future challenges
TL;DR: This review presents commonly used feature representation, visual encoding, and language generation models, and summarizes typical captioning methods, which are generally divided into those with and without reinforcement learning.
References
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
•Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- 03 Dec 2012
TL;DR: A large, deep convolutional neural network achieved state-of-the-art ImageNet classification performance. It consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
•Journal Article
Visualizing Data using t-SNE
TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
•Book
Reinforcement Learning: An Introduction
Richard S. Sutton, Andrew G. Barto
- 01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Nonlinear principal component analysis using autoassociative neural networks
TL;DR: The NLPCA method is demonstrated using time-dependent, simulated batch reaction data and shows that it successfully reduces dimensionality and produces a feature space map resembling the actual distribution of the underlying system parameters.