Top 3467 papers published in the topic of Segmentation in 2014

Showing papers on "Segmentation" published in 2014

Posted Content•

Fully Convolutional Networks for Semantic Segmentation

Jonathan Long¹, Evan Shelhamer¹, Trevor Darrell¹•Institutions (1)

14 Nov 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is shown that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation.

Abstract: Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a novel architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes one third of a second for a typical image.
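
The abstract reports results as mean IU (intersection over union averaged over classes). A minimal sketch of that metric follows, assuming flattened integer label maps and assuming that classes absent from both maps are skipped; the function name `mean_iu` is illustrative and not taken from the paper.

```python
def mean_iu(pred, truth, num_classes):
    """Mean intersection over union, averaged over classes that occur in either map."""
    inter = [0] * num_classes  # pixels where prediction and truth agree on class c
    union = [0] * num_classes  # pixels labelled c in the prediction or the truth
    for p, t in zip(pred, truth):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Two classes on a four-pixel image: IoU is 1/2 for class 0 and 2/3 for class 1.
    print(mean_iu([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Benchmark implementations may handle ignored or void pixels differently, so this is only a reading aid for the reported numbers.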

9,834 citations

Posted Content•

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

Liang-Chieh Chen¹, George Papandreou², Iasonas Kokkinos³, Kevin Murphy², Alan L. Yuille¹•Institutions (3)

University of California, Los Angeles¹, Google², CentraleSupélec³

22 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).

Abstract: Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification (also called "semantic image segmentation"). We show that responses at the final layer of DCNNs are not sufficiently localized for accurate object segmentation. This is due to the very invariance properties that make DCNNs good for high level tasks. We overcome this poor localization property of deep networks by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Qualitatively, our "DeepLab" system is able to localize segment boundaries at a level of accuracy which is beyond previous methods. Quantitatively, our method sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 71.6% IOU accuracy in the test set. We show how these results can be obtained efficiently: Careful network re-purposing and a novel application of the 'hole' algorithm from the wavelet community allow dense computation of neural net responses at 8 frames per second on a modern GPU.
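
The 'hole' algorithm mentioned above spaces filter taps apart so that responses can be computed densely without shrinking the filter's field of view. A one-dimensional sketch of the idea follows; it is not the authors' implementation, and the function name `atrous_conv1d` and the `rate` parameter are assumptions made for illustration.

```python
def atrous_conv1d(signal, kernel, rate):
    """Correlate `signal` with `kernel` whose taps sit `rate` samples apart.

    Only positions where the dilated kernel fits entirely inside the
    signal are returned ("valid" outputs).
    """
    span = (len(kernel) - 1) * rate + 1  # extent of the kernel including holes
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    ramp = [1, 2, 3, 4, 5, 6]
    print(atrous_conv1d(ramp, [1, 0, -1], rate=1))  # ordinary 3-tap filter
    print(atrous_conv1d(ramp, [1, 0, -1], rate=2))  # same weights, wider reach
```

With `rate=1` this reduces to an ordinary sliding-window filter; larger rates widen the receptive field without adding weights.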

5,345 citations

Book Chapter•10.1007/978-3-319-10584-0_23•

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Saurabh Gupta¹, Ross Girshick¹, Pablo Arbeláez¹,², Jitendra Malik¹•Institutions (2)

University of California¹, University of Los Andes²

6 Sep 2014

TL;DR: In this paper, a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity is proposed.

Abstract: In this paper we study the problem of object detection for RGB-D images using semantically rich image and depth features. We propose a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity. We demonstrate that this geocentric embedding works better than using raw depth images for learning feature representations with convolutional neural networks. Our final object detection system achieves an average precision of 37.3%, which is a 56% relative improvement over existing methods. We then focus on the task of instance segmentation where we label pixels belonging to object instances found by our detector. For this task, we propose a decision forest approach that classifies pixels in the detection window as foreground or background using a family of unary and binary tests that query shape and geocentric pose features. Finally, we use the output from our object detectors in an existing superpixel classification framework for semantic scene segmentation and achieve a 24% relative improvement over current state-of-the-art for the object categories that we study. We believe advances such as those represented in this paper will facilitate the use of perception in fields like robotics.

1,891 citations

Proceedings Article•10.1109/CVPR.2014.119•

The Role of Context for Object Detection and Semantic Segmentation in the Wild

Roozbeh Mottaghi¹, Xianjie Chen², Xiaobai Liu², Nam-Gyu Cho³, Seong-Whan Lee³, Sanja Fidler⁴, Raquel Urtasun⁴, Alan L. Yuille²•Institutions (4)

Stanford University¹, University of California, Los Angeles², Korea University³, University of Toronto⁴

23 Jun 2014

TL;DR: A novel deformable part-based model is proposed, which exploits both local context around each candidate detection as well as global context at the level of the scene, which significantly helps in detecting objects at all scales.

Abstract: In this paper we study the role of context in existing state-of-the-art detection and segmentation approaches. Towards this goal, we label every pixel of the PASCAL VOC 2010 detection challenge with a semantic category. We believe this data will provide plenty of challenges to the community, as it contains 520 additional classes for semantic segmentation and object detection. Our analysis shows that nearest neighbor based approaches perform poorly on semantic segmentation of contextual classes, showing the variability of PASCAL imagery. Furthermore, improvements of existing contextual models for detection are rather modest. In order to push forward the performance in this difficult scenario, we propose a novel deformable part-based model, which exploits both local context around each candidate detection as well as global context at the level of the scene. We show that this contextual reasoning significantly helps in detecting objects at all scales.

1,885 citations

Posted Content•

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture

David Eigen¹, Rob Fergus²•Institutions (2)

New York University¹, Facebook²

18 Nov 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: A single multiscale convolutional network, adapted to each task with only small modifications, regresses directly from the input image to the output map for depth prediction, surface normal estimation, and semantic labeling.

Abstract: In this paper we address three different computer vision tasks using a single basic architecture: depth prediction, surface normal estimation, and semantic labeling. We use a multiscale convolutional network that is able to adapt easily to each task using only small modifications, regressing from the input image to the output map directly. Our method progressively refines predictions using a sequence of scales, and captures many image details without any superpixels or low-level segmentation. We achieve state-of-the-art performance on benchmarks for all three tasks.

1,694 citations

Book Chapter•10.1007/978-3-319-10584-0_20•

Simultaneous Detection and Segmentation

Bharath Hariharan¹, Pablo Arbeláez¹,², Ross Girshick¹, Jitendra Malik¹•Institutions (2)

University of California¹, University of Los Andes²

6 Sep 2014

TL;DR: This work builds on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN), introducing a novel architecture tailored for SDS, and uses category-specific, top-down figure-ground predictions to refine the bottom-up proposals.

Abstract: We aim to detect all instances of a category in an image and, for each instance, mark the pixels that belong to it. We call this task Simultaneous Detection and Segmentation (SDS). Unlike classical bounding box detection, SDS requires a segmentation and not just a box. Unlike classical semantic segmentation, we require individual object instances. We build on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN [16]), introducing a novel architecture tailored for SDS. We then use category-specific, top-down figure-ground predictions to refine our bottom-up proposals. We show a 7 point boost (16% relative) over our baselines on SDS, a 5 point boost (10% relative) over state-of-the-art on semantic segmentation, and state-of-the-art performance in object detection. Finally, we provide diagnostic tools that unpack performance and provide directions for future work.

1,677 citations

Proceedings Article•10.1109/CVPR.2014.43•

The Secrets of Salient Object Segmentation

Yin Li¹, Xiaodi Hou², Christof Koch³, James M. Rehg¹, Alan L. Yuille⁴•Institutions (4)

Georgia Institute of Technology¹, California Institute of Technology², Allen Institute for Brain Science³, University of California, Los Angeles⁴

23 Jun 2014

TL;DR: An extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets identifies serious design flaws of existing salient object benchmarks and proposes a new high quality dataset that offers both fixation and salient object segmentation ground-truth.

Abstract: In this paper we provide an extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets. Our analysis identifies serious design flaws of existing salient object benchmarks, called the dataset design bias, caused by overemphasising the stereotypical concepts of saliency. The dataset design bias not only creates a discomforting disconnection between fixations and salient object segmentation, but also misleads algorithm design. Based on our analysis, we propose a new high quality dataset that offers both fixation and salient object segmentation ground-truth. With fixations and salient objects being presented simultaneously, we are able to bridge the gap between fixations and salient objects, and propose a novel method for salient object segmentation. Finally, we report significant benchmark progress on 3 existing datasets of segmenting salient objects.

1,447 citations

Posted Content•

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Saurabh Gupta¹, Ross Girshick¹, Pablo Arbeláez¹,², Jitendra Malik¹•Institutions (2)

University of California¹, University of Los Andes²

22 Jul 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: A new geocentric embedding is proposed for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity to facilitate the use of perception in fields like robotics.

1,059 citations

Journal Article•10.1016/J.MEDIA.2013.12.002•

Evaluation of prostate segmentation algorithms for MRI: the PROMISE12 challenge.

Geert Litjens¹, Robert Toth², Wendy J. M. van de Ven¹, Caroline M. A. Hoeks¹, Sjoerd Kerkstra¹, Bram van Ginneken¹, G.R. Vincent, Gwenael Guillard, Neil Birbeck³, Jindang Zhang³, Robin Strand⁴, Filip Malmberg⁴, Yangming Ou⁵, Christos Davatzikos⁵, Matthias Kirschner⁶, Florian Jung⁶, Jing Yuan⁷, Wu Qiu⁷, Qinquan Gao⁸, Philip J. Edwards⁸, Bianca Maan⁹, Ferdinand van der Heijden⁹, Soumya Ghose¹⁰,¹¹, Jhimli Mitra¹⁰,¹¹, Jason Dowling¹¹, Dean C. Barratt¹², Henkjan J. Huisman¹, Anant Madabhushi²•Institutions (12)

Radboud University Nijmegen Medical Centre¹, Case Western Reserve University², Siemens³, Uppsala University⁴, University of Pennsylvania⁵, Technische Universität Darmstadt⁶, Robarts Research Institute⁷, Imperial College London⁸, University of Twente⁹, University of Burgundy¹⁰, Commonwealth Scientific and Industrial Research Organisation¹¹, University College London¹²

01 Feb 2014-Medical Image Analysis

TL;DR: Although average algorithm performance was good to excellent and the Imorphics algorithm outperformed the second observer on average, it is shown that algorithm combination might lead to further improvement, indicating that optimal performance for prostate segmentation is not yet obtained.

811 citations

Journal Article•10.1016/J.ISPRSJPRS.2013.11.018•

Automated parameterisation for multi-scale image segmentation on multiple layers.

Lucian Drăguţ¹, Ovidiu Csillik¹, Clemens Eisank², Dirk Tiede²•Institutions (2)

West University of Timișoara¹, University of Salzburg²

01 Feb 2014-ISPRS Journal of Photogrammetry and Remote Sensing

TL;DR: A new automated approach to parameterising multi-scale image segmentation of multiple layers based on the potential of the local variance (LV) to detect scale transitions in geospatial data is introduced and implemented as a generic tool for the eCognition® software.

Abstract: We introduce a new automated approach to parameterising multi-scale image segmentation of multiple layers, and we implemented it as a generic tool for the eCognition® software. This approach relies on the potential of the local variance (LV) to detect scale transitions in geospatial data. The tool detects the number of layers added to a project and segments them iteratively with a multiresolution segmentation algorithm in a bottom-up approach, where the scale factor in the segmentation, namely, the scale parameter (SP), increases with a constant increment. The average LV value of the objects in all of the layers is computed and serves as a condition for stopping the iterations: when a scale level records an LV value that is equal to or lower than the previous value, the iteration ends, and the objects segmented in the previous level are retained. Three orders of magnitude of SP lags produce a corresponding number of scale levels. Tests on very high resolution imagery provided satisfactory results for generic applicability. The tool has a significant potential for enabling objectivity and automation of GEOBIA analysis.
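
The stopping rule described in this abstract can be sketched as a simple loop. This is an illustration of the rule only, not the eCognition tool: `mean_lv_at` is a hypothetical callable that segments all layers at a given scale parameter and returns the average local variance, and `max_sp` is an assumed safety bound that the abstract does not mention.

```python
def select_scale(mean_lv_at, start_sp, increment, max_sp):
    """Raise SP by a constant increment until mean LV stops growing.

    Returns the SP of the last level whose LV was still increasing,
    i.e. the level that is retained when the iteration ends.
    """
    prev_sp = start_sp
    prev_lv = mean_lv_at(prev_sp)
    sp = prev_sp + increment
    while sp <= max_sp:
        lv = mean_lv_at(sp)
        if lv <= prev_lv:  # LV equal to or lower than the previous level: stop
            return prev_sp
        prev_sp, prev_lv = sp, lv
        sp += increment
    return prev_sp


if __name__ == "__main__":
    # Toy LV curve that plateaus between SP 3 and SP 4.
    toy_lv = {1: 2.0, 2: 3.0, 3: 3.5, 4: 3.5, 5: 4.0}
    print(select_scale(toy_lv.__getitem__, start_sp=1, increment=1, max_sp=5))
```

Running the loop with three different increments would correspond to the three scale levels the abstract mentions, although the exact increments used by the tool are not given here.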

617 citations

Journal Article•10.1002/2014WR015256•

Image processing of multiphase images obtained via X-ray microtomography: A review

Steffen Schlüter¹, Adrian Sheppard¹, Kendra I. Brown², Dorthe Wildenschild²•Institutions (2)

Australian National University¹, Oregon State University²

01 Apr 2014-Water Resources Research

TL;DR: This review focuses on multiclass segmentation and describes in detail why a specific method may fail, together with strategies for preventing the failure by applying suitable image enhancement prior to segmentation.

Abstract: Easier access to X-ray microtomography (μCT) facilities has provided much new insight from high-resolution imaging for various problems in porous media research. Pore space analysis with respect to functional properties usually requires segmentation of the intensity data into different classes. Image segmentation is a nontrivial problem that may have a profound impact on all subsequent image analyses. This review deals with two issues that are neglected in most of the recent studies on image segmentation: (i) focus on multiclass segmentation and (ii) detailed descriptions as to why a specific method may fail together with strategies for preventing the failure by applying suitable image enhancement prior to segmentation. In this way, the presented algorithms become very robust and are less prone to operator bias. Three different test images are examined: a synthetic image with ground-truth information, a synchrotron image of precision beads with three different fluids residing in the pore space, and a μCT image of a soil sample containing macropores, rocks, organic matter, and the soil matrix. Image blur is identified as the major cause for poor segmentation results. Other impairments of the raw data like noise, ring artifacts, and intensity variation can be removed with current image enhancement methods. Bayesian Markov random field segmentation, watershed segmentation, and converging active contours are well suited for multiclass segmentation, yet with different success to correct for partial volume effects and conserve small image features simultaneously.

Posted Content•

Robust and Efficient Subspace Segmentation via Least Squares Regression

Canyi Lu¹, Hai Min¹, Zhong-Qiu Zhao², Lin Zhu¹, De-Shuang Huang³, Shuicheng Yan⁴•Institutions (4)

University of Science and Technology of China¹, Hefei University of Technology², Tongji University³, National University of Singapore⁴

27 Apr 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the Least Squares Regression (LSR) method was proposed for subspace segmentation, which takes advantage of data correlation and encourages a grouping effect which tends to group highly correlated data together.

Abstract: This paper studies the subspace segmentation problem, which aims to segment data drawn from a union of multiple linear subspaces. Recent works using sparse representation, low rank representation and their extensions have attracted much attention. If the subspaces from which the data are drawn are independent or orthogonal, these methods are able to obtain a block diagonal affinity matrix, which usually leads to a correct segmentation. The main differences among them are their objective functions. We theoretically show that if the objective function satisfies some conditions, and the data are sufficiently drawn from independent subspaces, the obtained affinity matrix is always block diagonal. Furthermore, the data sampling can be insufficient if the subspaces are orthogonal. Some existing methods are all special cases. Then we present the Least Squares Regression (LSR) method for subspace segmentation. It takes advantage of data correlation, which is common in real data. LSR encourages a grouping effect which tends to group highly correlated data together. Experimental results on the Hopkins 155 database and Extended Yale Database B show that our method significantly outperforms state-of-the-art methods. Beyond segmentation accuracy, all experiments demonstrate that LSR is much more efficient.
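
A worked form of the regression step may help here. The abstract gives no formula, so the following assumes a ridge-style objective consistent with the method's name: $X$ holds one sample per column, $\lambda > 0$ is a regularization weight, and $W$ is a commonly used symmetric affinity. All three are assumptions for illustration, and the paper's exact formulation may differ.

```latex
\begin{aligned}
Z^\star &= \arg\min_{Z}\; \lVert X - XZ \rVert_F^2 + \lambda \lVert Z \rVert_F^2, \qquad \lambda > 0 \\
0 &= -2X^\top (X - XZ^\star) + 2\lambda Z^\star
   \quad\text{(setting the gradient to zero)} \\
Z^\star &= (X^\top X + \lambda I)^{-1} X^\top X \\
W &= \tfrac{1}{2}\left(\lvert Z^\star \rvert + \lvert Z^\star \rvert^\top\right)
\end{aligned}
```

Under this reading the solution needs only one linear solve, which would be in line with the efficiency claim in the abstract.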

Book Chapter•10.1007/978-3-319-10602-1_49•

Efficient Joint Segmentation, Occlusion Labeling, Stereo and Flow Estimation

Koichiro Yamaguchi¹, David McAllester², Raquel Urtasun³•Institutions (3)

Toyota¹, Toyota Technological Institute at Chicago², University of Toronto³

6 Sep 2014

TL;DR: A new optimization algorithm is proposed for a SLIC-like objective that preserves connectedness of image segments and exploits shape regularization in the form of boundary length; results are obtained an order of magnitude faster than competing approaches.

Abstract: In this paper we propose a slanted plane model for jointly recovering an image segmentation, a dense depth estimate as well as boundary labels (such as occlusion boundaries) from a static scene given two frames of a stereo pair captured from a moving vehicle. Towards this goal we propose a new optimization algorithm for our SLIC-like objective which preserves connectedness of image segments and exploits shape regularization in the form of boundary length. We demonstrate the performance of our approach in the challenging stereo and flow KITTI benchmarks and show superior results to the state-of-the-art. Importantly, these results can be achieved an order of magnitude faster than competing approaches.

Posted Content•

From Image-level to Pixel-level Labeling with Convolutional Networks

Pedro O. Pinheiro¹, Ronan Collobert¹•Institutions (1)

Idiap Research Institute¹

23 Nov 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, the authors propose a weakly supervised object segmentation model, which is constrained during training to put more weight on pixels which are important for classifying the image.

Abstract: We are interested in inferring object segmentation by leveraging only object class information, and by considering only minimal priors on the object segmentation task. This problem could be viewed as a kind of weakly supervised segmentation task, and naturally fits the Multiple Instance Learning (MIL) framework: every training image is known to have (or not) at least one pixel corresponding to the image class label, and the segmentation task can be rewritten as inferring the pixels belonging to the class of the object (given one image, and its object class). We propose a Convolutional Neural Network-based model, which is constrained during training to put more weight on pixels which are important for classifying the image. We show that at test time, the model has learned to discriminate the right pixels well enough, such that it performs very well on an existing segmentation benchmark, by adding only a few smoothing priors. Our system is trained using a subset of the Imagenet dataset and the segmentation experiments are performed on the challenging Pascal VOC dataset (with no fine-tuning of the model on Pascal VOC). Our model beats the state of the art results in the weakly supervised object segmentation task by a large margin. We also compare the performance of our model with state of the art fully-supervised segmentation approaches.
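
One way to turn per-pixel scores into a single image-level score while putting more weight on the strongest pixels is a smooth maximum such as log-sum-exp pooling. The abstract does not state which aggregation is used, so the sketch below is an assumption; the name `lse_pool` and the sharpness parameter `r` are illustrative.

```python
import math


def lse_pool(scores, r):
    """Smooth maximum of per-pixel scores for one class.

    Large r approaches the hard maximum (only the best pixel counts);
    small r approaches the plain average (every pixel counts equally).
    """
    m = max(scores)  # subtracted before exponentiation for numerical stability
    mean_exp = sum(math.exp(r * (s - m)) for s in scores) / len(scores)
    return m + math.log(mean_exp) / r


if __name__ == "__main__":
    pixel_scores = [0.0, 10.0]
    print(lse_pool(pixel_scores, r=10.0))  # close to the maximum
    print(lse_pool(pixel_scores, r=0.01))  # close to the mean
```

Because the pooled score is differentiable in every pixel score, image-level labels can still drive gradients toward the pixels that matter most for the classification.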

Posted Content•

Multi-Atlas Segmentation of Biomedical Images: A Survey

Juan Eugenio Iglesias, Mert R. Sabuncu¹•Institutions (1)

Harvard University¹

10 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: A survey of published MAS algorithms and studies that have applied these methods to various biomedical problems and a perspective on the future of MAS, which, it is believed, will be one of the dominant approaches in biomedical image segmentation.

Abstract: Multi-atlas segmentation (MAS), first introduced and popularized by the pioneering work of Rohlfing, Brandt, Menzel and Maurer Jr (2004), Klein, Mensh, Ghosh, Tourville and Hirsch (2005), and Heckemann, Hajnal, Aljabar, Rueckert and Hammers (2006), is becoming one of the most widely-used and successful image segmentation techniques in biomedical applications. By manipulating and utilizing the entire dataset of "atlases" (training images that have been previously labeled, e.g., manually by an expert), rather than some model-based average representation, MAS has the flexibility to better capture anatomical variation, thus offering superior segmentation accuracy. This benefit, however, typically comes at a high computational cost. Recent advancements in computer hardware and image processing software have been instrumental in addressing this challenge and facilitated the wide adoption of MAS. Today, MAS has come a long way and the approach includes a wide array of sophisticated algorithms that employ ideas from machine learning, probabilistic modeling, optimization, and computer vision, among other fields. This paper presents a survey of published MAS algorithms and studies that have applied these methods to various biomedical problems. In writing this survey, we have three distinct aims. Our primary goal is to document how MAS was originally conceived, later evolved, and now relates to alternative methods. Second, this paper is intended to be a detailed reference of past research activity in MAS, which now spans over a decade (2003 - 2014) and entails novel methodological developments and application-specific solutions. Finally, our goal is to also present a perspective on the future of MAS, which, we believe, will be one of the dominant approaches in biomedical image segmentation.
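
To make concrete the idea of using every atlas rather than an averaged model, the simplest fusion rule is a per-voxel majority vote. The sketch below assumes each atlas label map has already been registered to the target and flattened to a list; `majority_vote_fusion` is an illustrative name, and the survey covers many more sophisticated fusion schemes than this one.

```python
from collections import Counter


def majority_vote_fusion(propagated_labels):
    """Fuse per-atlas label maps (already warped to the target) voxel by voxel."""
    fused = []
    for votes in zip(*propagated_labels):
        # Pick the label proposed by the most atlases at this voxel.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


if __name__ == "__main__":
    atlas_a = [0, 1, 1, 2]
    atlas_b = [0, 1, 0, 2]
    atlas_c = [1, 1, 0, 2]
    print(majority_vote_fusion([atlas_a, atlas_b, atlas_c]))
```

The cost grows with the number of atlases, since each one has to be registered to the target first, which is the computational burden the abstract refers to.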

Journal Article•10.1016/J.COMPBIOMED.2014.04.014•

A review on segmentation of positron emission tomography images

Brent Foster¹, Ulas Bagci¹, Awais Mansoor¹, Ziyue Xu¹, Daniel J. Mollura¹•Institutions (1)

National Institutes of Health¹

01 Jul 2014-Computers in Biology and Medicine

TL;DR: This review paper presents state-of-the-art PET image segmentation methods, as well as the recent advances in image segmentation techniques.

Journal Article•10.1117/1.JBO.19.1.016007•

Computational segmentation of collagen fibers from second-harmonic generation images of breast cancer.

Jeremy S. Bredfeldt¹, Yuming Liu¹, Carolyn Pehlke¹, Matthew W. Conklin¹, Joseph M. Szulczewski¹, David R. Inman¹, Patricia J. Keely¹, Robert Nowak¹, Thomas R. Mackie¹,², Kevin W. Eliceiri¹•Institutions (2)

University of Wisconsin-Madison¹, Morgridge Institute for Research²

01 Jan 2014-Journal of Biomedical Optics

TL;DR: It is found that the curvelet-denoising filter followed by FIRE, a process the authors call CT-FIRE, outperforms the other algorithms under investigation and was successfully applied to track collagen fiber shape changes over time in an in vivo mouse model for breast cancer.

Abstract: Second-harmonic generation (SHG) imaging can help reveal interactions between collagen fibers and cancer cells. Quantitative analysis of SHG images of collagen fibers is challenged by the heterogeneity of collagen structures and low signal-to-noise ratio often found while imaging collagen in tissue. The role of collagen in breast cancer progression can be assessed post acquisition via enhanced computation. To facilitate this, we have implemented and evaluated four algorithms for extracting fiber information, such as number, length, and curvature, from a variety of SHG images of collagen in breast tissue. The image-processing algorithms included a Gaussian filter, SPIRAL-TV filter, Tubeness filter, and curvelet-denoising filter. Fibers are then extracted using an automated tracking algorithm called fiber extraction (FIRE). We evaluated the algorithm performance by comparing length, angle and position of the automatically extracted fibers with those of manually extracted fibers in twenty-five SHG images of breast cancer. We found that the curvelet-denoising filter followed by FIRE, a process we call CT-FIRE, outperforms the other algorithms under investigation. CT-FIRE was then successfully applied to track collagen fiber shape changes over time in an in vivo mouse model for breast cancer.

Journal Article•10.1007/S00371-013-0867-4•

SalientShape: group saliency in image collections

Ming-Ming Cheng¹, Niloy J. Mitra², Xiaolei Huang³, Shi-Min Hu¹•Institutions (3)

Tsinghua University¹, University College London², Lehigh University³

01 Apr 2014-The Visual Computer

TL;DR: This work introduces group saliency to achieve superior unsupervised salient object segmentation by extracting salient objects (in collections of pre-filtered images) that maximize between-image similarities and within-image distinctness.

Abstract: Efficiently identifying salient objects in large image collections is essential for many applications including image retrieval, surveillance, image annotation, and object recognition. We propose a simple, fast, and effective algorithm for locating and segmenting salient objects by analysing image collections. As a key novelty, we introduce group saliency to achieve superior unsupervised salient object segmentation by extracting salient objects (in collections of pre-filtered images) that maximize between-image similarities and within-image distinctness. To evaluate our method, we construct a large benchmark dataset consisting of 15 K images across multiple categories with 6000+ pixel-accurate ground truth annotations for salient object regions where applicable. In all our tests, group saliency consistently outperforms state-of-the-art single-image saliency algorithms, resulting in both higher precision and better recall. Our algorithm successfully handles image collections, of an order larger than any existing benchmark datasets, consisting of diverse and heterogeneous images from various internet sources.

Journal Article•10.1016/J.MEDIA.2014.01.010•

Weakly supervised histopathology cancer image segmentation and classification

Yan Xu¹,², Jun-Yan Zhu³, Eric Chang¹, Maode Lai⁴, Zhuowen Tu⁵•Institutions (5)

Microsoft¹, Beihang University², University of California, Berkeley³, Zhejiang University⁴, University of California, San Diego⁵

01 Apr 2014-Medical Image Analysis

TL;DR: This paper embeds the clustering concept into the multiple instance learning (MIL) setting and derives a principled solution for performing classification, segmentation, and clustering in an integrated framework; it also introduces contextual constraints as a prior for MCIL, which further reduces the ambiguity in MIL.

Proceedings Article•10.1109/ICRA.2014.6907236•

Dense 3D semantic mapping of indoor scenes from RGB-D images

Alexander Hermans¹, Georgios Floros¹, Bastian Leibe¹•Institutions (1)

RWTH Aachen University¹

29 Sep 2014

TL;DR: A novel 2D-3D label transfer based on Bayesian updates and dense pairwise 3D Conditional Random Fields is proposed, and it is shown that a semantic segmentation is not needed for every frame in a sequence in order to create accurate semantic 3D reconstructions.

Abstract: Dense semantic segmentation of 3D point clouds is a challenging task. Many approaches deal with 2D semantic segmentation and can obtain impressive results. With the availability of cheap RGB-D sensors the field of indoor semantic segmentation has seen a lot of progress. Still it remains unclear how to deal with 3D semantic segmentation in the best way. We propose a novel 2D-3D label transfer based on Bayesian updates and dense pairwise 3D Conditional Random Fields. This approach allows us to use 2D semantic segmentations to create a consistent 3D semantic reconstruction of indoor scenes. To this end, we also propose a fast 2D semantic segmentation approach based on Randomized Decision Forests. Furthermore, we show that it is not needed to obtain a semantic segmentation for every frame in a sequence in order to create accurate semantic 3D reconstructions. We evaluate our approach on both NYU Depth datasets and show that we can obtain a significant speed-up compared to other methods.
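
The Bayesian label transfer mentioned above can be pictured as a recursive update of a per-point class distribution each time a new 2D segmentation observes that point. The sketch below treats the frame's class probabilities as the likelihood term, which is an assumption; `bayes_update` is an illustrative name and the dense CRF smoothing step is not shown.

```python
def bayes_update(prior, frame_probs):
    """Fold one frame's 2D class probabilities into a 3D point's label distribution."""
    unnormalised = [p * q for p, q in zip(prior, frame_probs)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


if __name__ == "__main__":
    belief = [0.5, 0.5]  # uniform prior over two classes
    for frame in ([0.8, 0.2], [0.8, 0.2]):
        belief = bayes_update(belief, frame)
    print(belief)
```

Because each update only multiplies and renormalises, skipping frames simply means fewer updates, which fits the observation that not every frame needs a 2D segmentation.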

Posted Content•

Fully Convolutional Multi-Class Multiple Instance Learning

Deepak Pathak¹, Evan Shelhamer², Jonathan Long², Trevor Darrell²•Institutions (2)

Indian Institute of Technology Kanpur¹, University of California, Berkeley²

22 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, a multi-instance learning (MIL) model is proposed to reduce the need for costly annotation in tasks such as semantic segmentation by weakening the required degree of supervision.

Abstract: Multiple instance learning (MIL) can reduce the need for costly annotation in tasks such as semantic segmentation by weakening the required degree of supervision. We propose a novel MIL formulation of multi-class semantic segmentation learning by a fully convolutional network. In this setting, we seek to learn a semantic segmentation model from just weak image-level labels. The model is trained end-to-end to jointly optimize the representation while disambiguating the pixel-image label assignment. Fully convolutional training accepts inputs of any size, does not need object proposal pre-processing, and offers a pixelwise loss map for selecting latent instances. Our multi-class MIL loss exploits the further supervision given by images with multiple labels. We evaluate this approach through preliminary experiments on the PASCAL VOC segmentation challenge.
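
One way to read "a pixelwise loss map for selecting latent instances" is sketched below: for each image-level label, pick the pixel that scores highest for that class and apply a log loss there. This is an assumption for illustration, since the abstract does not spell out the loss; `mil_loss`, the flat list-of-pixels layout, and the toy scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mil_loss(score_map, image_labels):
    """Assumed multi-class MIL loss over a flattened map of pixel scores.

    For every class present in the image, the most confident pixel is
    treated as the latent instance and contributes -log(probability).
    """
    probabilities = [softmax(pixel) for pixel in score_map]
    loss = 0.0
    for label in image_labels:
        best = max(pixel[label] for pixel in probabilities)
        loss -= math.log(best)
    return loss / len(image_labels)


# Hypothetical 3-pixel, 3-class score maps; the image is tagged with classes 0 and 1.
confident_map = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
uninformative_map = [[0.0, 0.0, 0.0]] * 3

confident_loss = mil_loss(confident_map, [0, 1])
uninformative_loss = mil_loss(uninformative_map, [0, 1])
```

Averaging over all labels present in the image is what lets a multi-label image supply more than one training signal per pass.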

Book Chapter•10.1007/978-3-319-10599-4_45•

Joint Semantic Segmentation and 3D Reconstruction from Monocular Video

Abhijit Kundu¹, Yin Li¹, Frank Dellaert¹, Fuxin Li¹, James M. Rehg¹•Institutions (1)

Georgia Institute of Technology¹

6 Sep 2014

TL;DR: Improved 3D structure and temporally consistent semantic segmentation for difficult, large scale, forward moving monocular image sequences is demonstrated.

Abstract: We present an approach for joint inference of 3D scene structure and semantic labeling for monocular video. Starting with a monocular image stream, our framework produces a 3D volumetric semantic + occupancy map, which is much more useful than a series of 2D semantic label images or a sparse point cloud produced by traditional semantic segmentation and Structure from Motion (SfM) pipelines respectively. We derive a Conditional Random Field (CRF) model defined in the 3D space that jointly infers the semantic category and occupancy for each voxel. Such joint inference in the 3D CRF paves the way for more informed priors and constraints, which is otherwise not possible if solved separately in their traditional frameworks. We make use of class specific semantic cues that constrain the 3D structure in areas where multiview constraints are weak. Our model comprises higher order factors, which help when the depth is unobservable. We also make use of class specific semantic cues to reduce either the degree of such higher order factors, or to approximately model them with unaries if possible. We demonstrate improved 3D structure and temporally consistent semantic segmentation for difficult, large scale, forward moving monocular image sequences.

Journal Article•10.1016/J.IMAVIS.2014.04.002•

Covariance descriptor based on bio-inspired features for person re-identification and face verification

Bingpeng Ma, Yu Su¹, Frédéric Jurie¹•Institutions (1)

University of Caen Lower Normandy¹

01 Jun 2014-Image and Vision Computing

TL;DR: Avoiding the use of complicated pre-processing steps such as accurate face and body part segmentation or image normalization, this paper proposes a novel face/person image representation which can properly handle background and illumination variations called gBiCov.

Proceedings Article•10.1109/CVPR.2014.482•

Robust Subspace Segmentation with Block-Diagonal Prior

Jiashi Feng¹, Zhouchen Lin², Huan Xu¹, Shuicheng Yan¹•Institutions (2)

National University of Singapore¹, Peking University²

23 Jun 2014

TL;DR: The subspace segmentation problem is addressed by effectively constructing an exactly block-diagonal sample affinity matrix by proposing a graph Laplacian constraint based formulation, and developing an efficient stochastic subgradient algorithm for optimization.

Abstract: The subspace segmentation problem is addressed in this paper by effectively constructing an exactly block-diagonal sample affinity matrix. The block-diagonal structure is heavily desired for accurate sample clustering but is rather difficult to obtain. Most current state-of-the-art subspace segmentation methods (such as SSC[4] and LRR[12]) resort to alternative structural priors (such as sparseness and low-rankness) to construct the affinity matrix. In this work, we directly pursue the block-diagonal structure by proposing a graph Laplacian constraint based formulation, and then develop an efficient stochastic subgradient algorithm for optimization. Moreover, two new subspace segmentation methods, the block-diagonal SSC and LRR, are devised in this work. To the best of our knowledge, this is the first research attempt to explicitly pursue such a block-diagonal structure. Extensive experiments on face clustering, motion segmentation and graph construction for semi-supervised learning clearly demonstrate the superiority of our novelly proposed subspace segmentation methods.

Book Chapter•10.1007/978-3-319-10578-9_52•

Crisp Boundary Detection Using Pointwise Mutual Information

Phillip Isola¹, Daniel Zoran¹, Dilip Krishnan¹, Edward H. Adelson¹•Institutions (1)

Massachusetts Institute of Technology¹

6 Sep 2014

TL;DR: This paper shows how to derive an affinity measure based on a simple underlying principle using pointwise mutual information, and shows that this measure is indeed a good predictor of whether or not two pixels reside on the same object.

Abstract: Detecting boundaries between semantically meaningful objects in visual scenes is an important component of many vision algorithms. In this paper, we propose a novel method for detecting such boundaries based on a simple underlying principle: pixels belonging to the same object exhibit higher statistical dependencies than pixels belonging to different objects. We show how to derive an affinity measure based on this principle using pointwise mutual information, and we show that this measure is indeed a good predictor of whether or not two pixels reside on the same object. Using this affinity with spectral clustering, we can find object boundaries in the image – achieving state-of-the-art results on the BSDS500 dataset. Our method produces pixel-level accurate boundaries while requiring minimal feature engineering.
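
The principle that pixels on the same object are more statistically dependent can be illustrated with plain pointwise mutual information, PMI(a, b) = log(P(a, b) / (P(a) P(b))), estimated from adjacent pixel pairs. This is only a sketch under assumptions: the paper's actual features and estimator are not given in the abstract, and `pmi_table` and the toy image below are hypothetical.

```python
import math
from collections import Counter


def pmi_table(image):
    """Estimate PMI for value pairs observed at 4-connected neighbouring pixels.

    Pairs are counted symmetrically, so PMI(a, b) equals PMI(b, a).
    """
    pair_counts = Counter()
    for r, row in enumerate(image):
        for c, a in enumerate(row):
            neighbours = []
            if c + 1 < len(row):
                neighbours.append(row[c + 1])       # right neighbour
            if r + 1 < len(image):
                neighbours.append(image[r + 1][c])  # neighbour below
            for b in neighbours:
                pair_counts[(a, b)] += 1
                pair_counts[(b, a)] += 1

    total = sum(pair_counts.values())
    marginal = Counter()
    for (a, _), count in pair_counts.items():
        marginal[a] += count

    return {
        (a, b): math.log(
            (count / total) / ((marginal[a] / total) * (marginal[b] / total))
        )
        for (a, b), count in pair_counts.items()
    }


# Hypothetical toy image: two flat regions with one vertical boundary.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
pmi = pmi_table(image)
```

In this toy image, pairs inside a region get positive PMI and pairs straddling the boundary get negative PMI, so the value can serve as an affinity for a subsequent clustering step such as the spectral clustering the abstract mentions.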

Journal Article•10.1109/TMM.2013.2293424•

Representative Discovery of Structure Cues for Weakly-Supervised Image Segmentation

Luming Zhang¹, Yue Gao¹, Yingjie Xia², Ke Lu³, Jialie Shen, Rongrong Ji⁴•Institutions (4)

National University of Singapore¹, Hangzhou Normal University², Chinese Academy of Sciences³, Xiamen University⁴

01 Feb 2014-IEEE Transactions on Multimedia

TL;DR: Experimental results show that the proposed approach outperforms state-of-the-art weakly-supervised image segmentation methods on five popular segmentation data sets and performs competitively with fully-supervised segmentation models.

Abstract: Weakly-supervised image segmentation is a challenging problem with multidisciplinary applications in multimedia content analysis and beyond. It aims to segment an image by leveraging its image-level semantics (i.e., tags). This paper presents a weakly-supervised image segmentation algorithm that learns the distribution of spatially structural superpixel sets from image-level labels. More specifically, we first extract graphlets from a given image, which are small-sized graphs consisting of superpixels and encapsulating their spatial structure. Then, an efficient manifold embedding algorithm is proposed to transfer labels from training images into graphlets. It is further observed that there are numerous redundant graphlets that are not discriminative to semantic categories, which are abandoned by a graphlet selection scheme as they make no contribution to the subsequent segmentation. Thereafter, we use a Gaussian mixture model (GMM) to learn the distribution of the selected post-embedding graphlets (i.e., vectors output from the graphlet embedding). Finally, we propose an image segmentation algorithm, termed representative graphlet cut, which leverages the learned GMM prior to measure the structure homogeneity of a test image. Experimental results show that the proposed approach outperforms state-of-the-art weakly-supervised image segmentation methods, on five popular segmentation data sets. Besides, our approach performs competitively to the fully-supervised segmentation models.

Journal Article•10.1016/J.ISPRSJPRS.2014.07.002•

Comparing supervised and unsupervised multiresolution segmentation approaches for extracting buildings from very high resolution imagery

Mariana Belgiu¹, Lucian Drǎguţ²•Institutions (2)

University of Salzburg¹, West University of Timișoara²

01 Oct 2014-ISPRS Journal of Photogrammetry and Remote Sensing

TL;DR: The results of this study suggest that object-based image analysis can be automated without sacrificing classification accuracy, and that the previously accepted idea that classification is dependent on segmentation is challenged, casting doubt on the value of pursuing ‘optimal segmentation’.

Abstract: Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. 
The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing ‘optimal segmentation’. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.

Journal Article•10.1016/J.AUTCON.2014.02.021•

Productive modeling for development of as-built BIM of existing indoor structures

Jaehoon Jung¹, Sungchul Hong¹, Seongsu Jeong², Sangmin Kim¹, Hyoungsig Cho¹, Seunghwan Hong¹, Joon Heo¹•Institutions (2)

Yonsei University¹, Ohio State University²

01 Jun 2014-Automation in Construction

TL;DR: In this paper, a semi-automatic methodology for improved productivity of as-built building information model (BIM) creation with respect to large and complex indoor environments is proposed, which produces 3D geometric drawings through three steps: segmentation for plane extraction, refinement for removal of noisy points, and boundary tracing for outline extraction.

Journal Article•10.12720/JOIG.1.4.166-170•

Image Segmentation Techniques: A Survey

Waseem Khan

01 Jan 2014-Journal of Image and Graphics

TL;DR: This survey addresses various image segmentation techniques, evaluates them, and presents the issues related to those techniques.

Abstract: Image segmentation is a mechanism used to divide an image into multiple segments. It makes the image smooth and easy to evaluate. The segmentation process also helps to find regions of interest in a particular image. The main goal is to make the image simpler and more meaningful. Existing segmentation techniques cannot satisfy all types of images. This survey addresses various image segmentation techniques, evaluates them, and presents the issues related to those techniques.

Journal Article•10.1523/JNEUROSCI.3414-13.2014•

Hippocampal Replay Captures the Unique Topological Structure of a Novel Environment

Xiaojing Wu¹, David J. Foster¹•Institutions (1)

Johns Hopkins University School of Medicine¹

07 May 2014-The Journal of Neuroscience

TL;DR: The results suggest that hippocampal replay captures learned information about environmental topology to support a role in navigation.

Abstract: Hippocampal place-cell replay has been proposed as a fundamental mechanism of learning and memory, which might support navigational learning and planning. An important hypothesis of relevance to these proposed functions is that the information encoded in replay should reflect the topological structure of experienced environments; that is, which places in the environment are connected with which others. Here we report several attributes of replay observed in rats exploring a novel forked environment that support the hypothesis. First, we observed that overlapping replays depicting divergent trajectories through the fork recruited the same population of cells with the same firing rates to represent the common portion of the trajectories. Second, replay tended to be directional and to flip the represented direction at the fork. Third, replay-associated sharp-wave–ripple events in the local field potential exhibited substructure that mapped onto the maze topology. Thus, the spatial complexity of our recording environment was accurately captured by replay: the underlying neuronal activities reflected the bifurcating shape, and both directionality and associated ripple structure reflected the segmentation of the maze. Finally, we observed that replays occurred rapidly after small numbers of experiences. Our results suggest that hippocampal replay captures learned information about environmental topology to support a role in navigation.
