Image Captioning based on Deep Learning Methods: A Survey.

Open Access • Posted Content

- 20 May 2019

- arXiv: Computer Vision and Pattern Recog...

9

TL;DR: A survey of advances in image captioning based on deep learning methods is presented, covering the encoder-decoder structure, improved methods in the encoder, improved methods in the decoder, and other improvements.

Citations

Journal Article • 10.1080/23311916.2022.2104333

Semantic interdisciplinary evaluation of image captioning models

Uddagiri Sirisha, +1 more

- 21 Aug 2022

- Cogent engineering

TL;DR: In this article, the authors examine and analyze different image captioning models used across various domains, and extract multiple insights to determine the best combinational architecture for a new application without ignoring contextual semantics.

17

Proceedings Article • 10.1109/CISS50987.2021.9400209

Facilitated Deep Learning Models for Image Captioning

Imtinan Azhar, +2 more

- 24 Mar 2021

TL;DR: In this paper, a mixture of object detection and attention-enriched deep learning models is used to extract the image features, and then an extended version of Recurrent Neural Networks (LSTM) with attention-enhanced features is adopted to generate the caption.

11

Journal Article • 10.14569/ijacsa.2023.0140249

A Survey on Attention-Based Models for Image Captioning

Asmaa A. E. Osman, +3 more

- 01 Jan 2023

- International Journal of Advanced Comput...

TL;DR: A survey on attention-based models for image captioning is presented in this article, including new categories that were not included in other survey papers; all categories and subcategories of the attention-based approaches are discussed in detail.

9

Proceedings Article • 10.1109/iccubea58933.2023.10392213

An Enhanced Hybrid Deep Learning Model for Efficient Automatic Image Captioning

Eliyah Immanuel Thavaraj A, +2 more

- 18 Aug 2023

TL;DR: This paper uses the attention mechanism to generate captions for images after considering the recognized items in the image scene, and carries out a semantic similarity analysis between the produced descriptions and the actual image descriptions.

1

Proceedings Article • 10.1109/indicon56171.2022.10039829

Sequential Memory Modelling for Video Captioning

- 24 Nov 2022

TL;DR: In this paper, an end-to-end encoder-decoder network based on a deep learning approach is used to generate video subtitles, and the model, dataset, and parameters used to evaluate the model are described.

References

Journal Article • 10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

Book Chapter • 10.1007/978-3-319-10602-1_48

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

- 06 Sep 2014

TL;DR: A new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding by gathering images of complex everyday scenes containing common objects in their natural context.

51.7K

Proceedings Article • 10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 04 Jun 2015

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.

25.3K

•Proceedings Article

Sequence to Sequence Learning with Neural Networks

Ilya Sutskever, +2 more

- 08 Dec 2014

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

20.1K

Related Papers (5)

A survey on deep neural network-based image captioning

Automatic Caption Generation for Medical Images

When Radiology Report Generation Meets Knowledge Graph

Image Captioning with Attention Based Model

Review based on Image Understanding Approaches