Nonparametric Method for Data-driven Image Captioning

doi:10.3115/V1/P14-2097

Open Access • Proceedings Article • 10.3115/V1/P14-2097

Rebecca Mason, +1 more

- 01 Jun 2014

- pp 592-598

123

TL;DR: This work addresses the challenge of noisy estimations of visual content and poor alignment between images and human-written captions by estimating a word frequency representation of the visual content of a query image to cast caption generation as an extractive summarization problem.
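The idea summarized above, estimating a word frequency representation and then choosing caption text extractively, can be illustrated with a minimal sketch. This is not the paper's method: it assumes the captions of visually similar images have already been retrieved, pools their words into a relative-frequency estimate, and selects the whole caption whose own word distribution is closest to that estimate by cosine similarity. All function names, the whitespace tokenizer, and the similarity measure are hypothetical choices made for illustration.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w]


def estimate_word_frequencies(captions):
    """Relative word frequencies pooled over a list of captions."""
    counts = Counter()
    for caption in captions:
        counts.update(tokenize(caption))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cosine(freqs_a, freqs_b):
    """Cosine similarity between two sparse word-frequency dicts."""
    dot = sum(v * freqs_b.get(w, 0.0) for w, v in freqs_a.items())
    norm_a = math.sqrt(sum(v * v for v in freqs_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freqs_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_caption(neighbor_captions):
    """Pick the caption that best matches the pooled frequency estimate.

    Assumption: neighbor_captions come from images judged visually
    similar to the query image by some upstream retrieval step.
    """
    target = estimate_word_frequencies(neighbor_captions)
    return max(
        neighbor_captions,
        key=lambda c: cosine(estimate_word_frequencies([c]), target),
    )


captions = [
    "A dog runs on the beach.",
    "A brown dog plays on the beach.",
    "A cat sleeps on a sofa.",
]
print(select_caption(captions))
```

In this toy input the two dog captions reinforce each other's vocabulary, so the outlier caption about the cat scores lowest. Function words such as "a" and "on" still carry a lot of weight here; a practical variant would likely down-weight them, for example with stopword removal or TF-IDF.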

Citations

Pattern Recognition and Machine Learning

Christopher M. Bishop

- 01 Jan 2006

TL;DR: Probability distributions and linear models for regression and classification are presented, along with a discussion of combining models in the context of machine learning.

10.1K

Journal Article • 10.1109/TPAMI.2018.2798607

Multimodal Machine Learning: A Survey and Taxonomy

Tadas Baltrusaitis, +2 more

- 01 Feb 2019

- IEEE Transactions on Pattern Analysis an...

TL;DR: This paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy to enable researchers to better understand the state of the field and identify directions for future research.

3.4K

Posted Content

Microsoft COCO Captions: Data Collection and Evaluation Server

Xinlei Chen, +6 more

- 01 Apr 2015

- arXiv: Computer Vision and Pattern Recog...

TL;DR: The Microsoft COCO Caption dataset and evaluation server are described and several popular metrics, including BLEU, METEOR, ROUGE and CIDEr are used to score candidate captions.

2.6K

Proceedings Article • 10.1109/CVPR.2015.7298966

Deep correlation for matching images and text

Fei Yan, +1 more

- 07 Jun 2015

TL;DR: This paper addresses the problem of matching images and captions in a joint latent space learnt with deep canonical correlation analysis (DCCA) by a GPU implementation and proposes methods to deal with overfitting.

508

Proceedings Article • 10.1109/ICCV.2015.277

Guiding the Long-Short Term Memory Model for Image Caption Generation

Xu Jia, +3 more

- 07 Dec 2015

TL;DR: In this article, an extension of the LSTM model is proposed to add semantic information extracted from the image as extra input to each unit, with the aim of guiding the model towards solutions that are more tightly coupled to the image content.

498

References

Journal Article • 10.1198/TECH.2007.S518

Pattern Recognition and Machine Learning

Radford M. Neal

- 01 Aug 2007

- Technometrics

TL;DR: This book covers a broad range of topics for regular factorial designs and presents all of the material in very mathematical fashion and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.

30.8K

Proceedings Article • 10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

Book

Pattern Recognition and Machine Learning

Christopher M. Bishop

- 17 Aug 2006

TL;DR: Probability Distributions, linear models for Regression, Linear Models for Classification, Neural Networks, Graphical Models, Mixture Models and EM, Sampling Methods, Continuous Latent Variables, Sequential Data are studied.

23.4K

Pattern Recognition and Machine Learning

Christopher M. Bishop

- 01 Jan 2006

TL;DR: Probability distributions and linear models for regression and classification are presented, along with a discussion of combining models in the context of machine learning.

10.1K

Journal Article • 10.1023/A:1011139631724

Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope

Aude Oliva, +1 more

- 01 May 2001

- International Journal of Computer Vision

TL;DR: The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.

7.5K

Related Papers (5)

Show and tell: A neural image caption generator

Bleu: a Method for Automatic Evaluation of Machine Translation

Deep visual-semantic alignments for generating image descriptions

Microsoft COCO: Common Objects in Context

Im2Text: Describing Images Using 1 Million Captioned Photographs