Journal Article · DOI: 10.1007/S11263-011-0449-8
Harmony Potentials
Xavier Boix, Josep M. Gonfaus, Joost van de Weijer, Andrew D. Bagdanov, Joan Serrat, Jordi Gonzàlez
103
TL;DR: A new consistency potential is proposed for image labeling problems that can encode any possible combination of labels, penalizing only unlikely combinations of classes, and an effective sampling strategy is proposed over this expanded label set that renders tractable the underlying optimization problem.
Abstract: The Hierarchical Conditional Random Field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model, since multiple classes can reasonably be expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders the underlying optimization problem tractable. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010 and MSRC-21.
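The idea of a consistency potential over combinations of labels can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the formulation from the paper: the region-level node is assumed to take a subset of the class labels, each local label missing from that subset is assumed to add a fixed penalty `gamma`, and unlikely classes are assumed to carry a higher per-class cost. The names `power_set`, `harmony_penalty`, `subset_cost`, `best_global_subset`, and `class_cost` are hypothetical.

```python
from itertools import combinations


def power_set(labels):
    """Return every subset of `labels` as a frozenset (the expanded label set)."""
    labels = list(labels)
    return [
        frozenset(combo)
        for size in range(len(labels) + 1)
        for combo in combinations(labels, size)
    ]


def harmony_penalty(local_labels, global_subset, gamma=1.0):
    """Assumed form: pay `gamma` for each local label absent from the subset."""
    return gamma * sum(1 for label in local_labels if label not in global_subset)


def subset_cost(global_subset, class_cost):
    """Assumed unary cost of a subset: sum of the costs of the classes it contains."""
    return sum(class_cost[c] for c in global_subset)


def best_global_subset(local_labels, class_cost, gamma=1.0):
    """Exhaustively pick the subset with the lowest total cost (toy sizes only)."""
    return min(
        power_set(class_cost),
        key=lambda s: (
            subset_cost(s, class_cost) + harmony_penalty(local_labels, s, gamma),
            len(s),      # tie-break: prefer smaller subsets
            sorted(s),   # tie-break: deterministic ordering
        ),
    )


# Four local regions; "boat" is assumed to be an unlikely class in this image.
local_labels = ["cow", "cow", "grass", "boat"]
class_cost = {"cow": 0.5, "grass": 0.5, "boat": 2.0}
print(sorted(best_global_subset(local_labels, class_cost)))  # ['cow', 'grass']
```

The exhaustive search enumerates all 2^n subsets, which is feasible only for a handful of classes. That growth is why a sampling strategy over the expanded label set, as the abstract describes, is needed to keep the optimization tractable.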
Citations
Simultaneous Detection and Segmentation
Bharath Hariharan, Pablo Arbeláez, Ross Girshick, Jitendra Malik
- 06 Sep 2014
TL;DR: This work builds on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN), introducing a novel architecture tailored for SDS, and uses category-specific, top-down figure-ground predictions to refine the bottom-up proposals.
COCO-Stuff: Thing and Stuff Classes in Context
Holger Caesar, Jasper Uijlings, Vittorio Ferrari
- 17 Dec 2018
TL;DR: COCO-Stuff augments all 164k images of the COCO 2017 dataset with pixel-wise annotations for 91 stuff classes, leveraging the original thing annotations.
•Posted Content
COCO-Stuff: Thing and Stuff Classes in Context
TL;DR: An efficient stuff annotation protocol based on superpixels is introduced, which leverages the original thing annotations, and the speed versus quality trade-off of the protocol is quantified and the relation between annotation time and boundary complexity is explored.
Feedforward semantic segmentation with zoom-out features
Mohammadreza Mostajabi, Payman Yadollahpour, Gregory Shakhnarovich
- 07 Jun 2015
TL;DR: In this article, a feed-forward architecture for semantic segmentation is proposed, which maps small image elements (superpixels) to rich feature representations extracted from a sequence of nested regions of increasing extent.
Towards unified depth and semantic prediction from a single image
Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L. Yuille
- 07 Jun 2015
TL;DR: This work proposes a unified framework for joint depth and semantic prediction that effectively leverages the advantages of both tasks and achieves state-of-the-art results.
References
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
The Pascal Visual Object Classes (VOC) Challenge
TL;DR: The state of the art in evaluated methods for both classification and detection is reviewed, including whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.
Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
TL;DR: A generalized gray-scale and rotation invariant operator presentation that allows for detecting the "uniform" patterns for any quantization of the angular space and for any spatial resolution and presents a method for combining multiple operators for multiresolution analysis.
Distinctive Image Features from Scale-Invariant Keypoints
Matthijs Dorst
- 01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
15.8K
Mean shift: a robust approach toward feature space analysis
Dorin Comaniciu, Peter Meer
TL;DR: It is proved the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density.
12.9K