TL;DR: Results show that learning the optimum kernel combination of multiple features vastly improves the performance, from 55.1% for the best single feature to 72.8% for the combination of all features.
Abstract: We investigate to what extent combinations of features can improve classification performance on a large dataset of similar classes. To this end we introduce a 103-class flower dataset. We compute four different features for the flowers, each describing different aspects, namely the local shape/texture, the shape of the boundary, the overall spatial distribution of petals, and the colour. We combine the features using a multiple kernel framework with an SVM classifier. The weights for each class are learnt using the method of Varma and Ray, which has achieved state-of-the-art performance on other large datasets, such as Caltech 101/256. Our dataset has a similar challenge in the number of classes, but with the added difficulty of large between-class similarity and small within-class similarity. Results show that learning the optimum kernel combination of multiple features vastly improves the performance, from 55.1% for the best single feature to 72.8% for the combination of all features.
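The kernel combination described in this abstract can be illustrated with a minimal sketch. It assumes the combined kernel is a non-negative weighted sum of per-feature Gram matrices. The RBF kernel, the toy descriptors, and the hand-picked weights are all illustrative assumptions; the abstract only says the per-class weights are learnt with the method of Varma and Ray, which this sketch does not implement.

```python
import math


def rbf_gram(vectors, gamma):
    """Gram matrix of an RBF kernel over a list of equal-length feature vectors."""
    def k(a, b):
        return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))
    return [[k(a, b) for b in vectors] for a in vectors]


def combine_kernels(gram_matrices, weights):
    """Weighted sum K = sum_k d_k * K_k, with non-negative weights d_k."""
    if len(gram_matrices) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    n = len(gram_matrices[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, gram_matrices))
             for j in range(n)] for i in range(n)]


# Hypothetical toy descriptors for three images (not from the flower dataset).
colour = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
shape = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.2, 0.4, 0.4]]

# Hand-picked weights (assumption); in the paper they are learnt per class.
combined = combine_kernels(
    [rbf_gram(colour, 1.0), rbf_gram(shape, 1.0)], [0.7, 0.3]
)
```

A combined Gram matrix of this kind could then be handed to any SVM implementation that accepts precomputed kernels.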
TL;DR: A privacy-preserving system for estimating the size of inhomogeneous crowds, composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking is presented.
Abstract: We present a privacy-preserving system for estimating the size of inhomogeneous crowds, composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking. First, the crowd is segmented into components of homogeneous motion, using the mixture of dynamic textures motion model. Second, a set of simple holistic features is extracted from each segmented region, and the correspondence between features and the number of people per segment is learned with Gaussian process regression. We validate both the crowd segmentation algorithm, and the crowd counting system, on a large pedestrian dataset (2000 frames of video, containing 49,885 total pedestrian instances). Finally, we present results of the system running on a full hour of video.
TL;DR: This work proposes an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion that works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors.
Abstract: We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized decision forest combines many such features to achieve a coherent 2D segmentation and recognize the object categories present. Our main contribution is to show how semantic segmentation is possible based solely on motion-derived 3D world structure. Our method works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors.
Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that indeed, accurate segmentation and recognition are possible using only motion and 3D world structure. Further, we show that the motion-derived information complements an existing state-of-the-art appearance-based method, improving both qualitative and quantitative performance.
TL;DR: The proposed semantic texton forests are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors, and give at least a five-fold increase in execution speed.
Abstract: We propose semantic texton forests, efficient and powerful new low-level features. These are ensembles of decision trees that act directly on image pixels, and therefore do not need the expensive computation of filter-bank responses or local descriptors. They are extremely fast to both train and test, especially compared with k-means clustering and nearest-neighbor assignment of feature descriptors. The nodes in the trees provide (i) an implicit hierarchical clustering into semantic textons, and (ii) an explicit local classification estimate. Our second contribution, the bag of semantic textons, combines a histogram of semantic textons over an image region with a region prior category distribution. The bag of semantic textons is computed over the whole image for categorization, and over local rectangular regions for segmentation. Including both histogram and region prior allows our segmentation algorithm to exploit both textural and semantic context. Our third contribution is an image-level prior for segmentation that emphasizes those categories that the automatic categorization believes to be present. We evaluate on two datasets including the very challenging VOC 2007 segmentation dataset. Our results significantly advance the state-of-the-art in segmentation accuracy, and furthermore, our use of efficient decision forests gives at least a five-fold increase in execution speed.
TL;DR: A novel method for detecting and localizing objects of a visual category in cluttered real-world scenes that is applicable to a range of different object categories, including both rigid and articulated objects and able to achieve competitive object detection performance from training sets that are between one and two orders of magnitude smaller than those used in comparable systems.
Abstract: This paper presents a novel method for detecting and localizing objects of a visual category in cluttered real-world scenes. Our approach considers object categorization and figure-ground segmentation as two interleaved processes that closely collaborate towards a common goal. As shown in our work, the tight coupling between those two processes allows them to benefit from each other and improve the combined performance.
The core part of our approach is a highly flexible learned representation for object shape that can combine the information observed on different training examples in a probabilistic extension of the Generalized Hough Transform. The resulting approach can detect categorical objects in novel images and automatically infer a probabilistic segmentation from the recognition result. This segmentation is then in turn used to again improve recognition by allowing the system to focus its efforts on object pixels and to discard misleading influences from the background. Moreover, the information from where in the image a hypothesis draws its support is employed in an MDL based hypothesis verification stage to resolve ambiguities between overlapping hypotheses and factor out the effects of partial occlusion.
An extensive evaluation on several large data sets shows that the proposed system is applicable to a range of different object categories, including both rigid and articulated objects. In addition, its flexible representation allows it to achieve competitive object detection performance already from training sets that are between one and two orders of magnitude smaller than those used in comparable systems.
TL;DR: An extensive evaluation of the unsupervised objective evaluation methods that have been proposed in the literature is presented, and the advantages and shortcomings of the underlying design mechanisms in these methods are discussed and analyzed.
TL;DR: This paper proposes a novel framework for labelling problems which is able to combine multiple segmentations in a principled manner based on higher order conditional random fields and uses potentials defined on sets of pixels generated using unsupervised segmentation algorithms.
Abstract: This paper proposes a novel framework for labelling problems which is able to combine multiple segmentations in a principled manner. Our method is based on higher order conditional random fields and uses potentials defined on sets of pixels (image segments) generated using unsupervised segmentation algorithms. These potentials enforce label consistency in image regions and can be seen as a strict generalization of the commonly used pairwise contrast sensitive smoothness potentials. The higher order potential functions used in our framework take the form of the robust $P^n$ model. This enables the use of powerful graph cut based move making algorithms for performing inference in the framework [14]. We test our method on the problem of multi-class object segmentation by augmenting the conventional CRF used for object segmentation with higher order potentials defined on image regions. Experiments on challenging data sets show that integration of higher order potentials quantitatively and qualitatively improves results leading to much better definition of object boundaries. We believe that this method can be used to yield similar improvements for many other labelling problems.
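As a rough illustration of a label-consistency potential on a segment, the sketch below assumes a truncated-linear form: the cost grows linearly with the number of pixels that disagree with the segment's dominant label and saturates at a maximum. The names `truncation_q` and `gamma_max` are hypothetical, and the exact parameterization used in the paper may differ (for example, per-label costs).

```python
from collections import Counter


def robust_consistency_cost(labels, truncation_q, gamma_max):
    """Truncated-linear label-inconsistency cost for one image segment.

    labels: the labels currently assigned to the pixels of the segment.
    truncation_q: number of disagreeing pixels at which the cost saturates.
    gamma_max: the saturated (maximum) cost.
    """
    if truncation_q <= 0:
        raise ValueError("truncation_q must be positive")
    if not labels:
        return 0.0
    dominant_count = Counter(labels).most_common(1)[0][1]
    n_inconsistent = len(labels) - dominant_count
    return min(n_inconsistent / truncation_q, 1.0) * gamma_max


# Hypothetical segment of 10 pixels with increasing label disagreement.
consistent = robust_consistency_cost(["sky"] * 10, 3, 2.0)
slightly_mixed = robust_consistency_cost(["sky"] * 9 + ["tree"], 3, 2.0)
heavily_mixed = robust_consistency_cost(["sky"] * 5 + ["tree"] * 5, 3, 2.0)
```

Unlike a strict consistency constraint, a few disagreeing pixels incur only a partial penalty, which tolerates segments that do not align perfectly with object boundaries.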
TL;DR: An automatic four-chamber heart segmentation system for the quantitative functional analysis of the heart from cardiac computed tomography (CT) volumes is proposed and an efficient and robust approach for automatic heart chamber segmentation in 3D CT volumes is developed.
Abstract: We propose an automatic four-chamber heart segmentation system for the quantitative functional analysis of the heart from cardiac computed tomography (CT) volumes. Two topics are discussed: heart modeling and automatic model fitting to an unseen volume. Heart modeling is a nontrivial task since the heart is a complex nonrigid organ. The model must be anatomically accurate, allow manual editing, and provide sufficient information to guide automatic detection and segmentation. Unlike previous work, we explicitly represent important landmarks (such as the valves and the ventricular septum cusps) among the control points of the model. The control points can be detected reliably to guide the automatic model fitting process. Using this model, we develop an efficient and robust approach for automatic heart chamber segmentation in 3D CT volumes. We formulate the segmentation as a two-step learning problem: anatomical structure localization and boundary delineation. In both steps, we exploit the recent advances in learning discriminative models. A novel algorithm, marginal space learning (MSL), is introduced to solve the 9-D similarity transformation search problem for localizing the heart chambers. After determining the pose of the heart chambers, we estimate the 3D shape through learning-based boundary delineation. The proposed method has been extensively tested on the largest dataset (with 323 volumes from 137 patients) ever reported in the literature. To the best of our knowledge, our system is the fastest with a speed of 4.0 s per volume (on a dual-core 3.2-GHz processor) for the automatic segmentation of all four chambers.
TL;DR: The proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, also successfully situated itself within a physical environment and suggests that it is not only the spatial connections between neurons but also the timescales of neural activity that act as important mechanisms leading to functional hierarchy in neural systems.
Abstract: It is generally thought that skilled behavior in human beings results from a functional hierarchy of the motor control system, within which reusable motor primitives are flexibly integrated into various sensori-motor sequence patterns. The underlying neural mechanisms governing the way in which continuous sensori-motor flows are segmented into primitives and the way in which series of primitives are integrated into various behavior sequences have, however, not yet been clarified. In earlier studies, this functional hierarchy has been realized through the use of explicit hierarchical structure, with local modules representing motor primitives in the lower level and a higher module representing sequences of primitives switched via additional mechanisms such as gate-selecting. When sequences contain similarities and overlap, however, a conflict arises in such earlier models between generalization and segmentation, induced by this separated modular structure. To address this issue, we propose a different type of neural network model. The current model neither makes use of separate local modules to represent primitives nor introduces explicit hierarchical structure. Rather than forcing architectural hierarchy onto the system, functional hierarchy emerges through a form of self-organization that is based on two distinct types of neurons, each with different time properties (“multiple timescales”). Through the introduction of multiple timescales, continuous sequences of behavior are segmented into reusable primitives, and the primitives, in turn, are flexibly integrated into novel sequences. In experiments, the proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, also successfully situated itself within a physical environment. 
Our results suggest that it is not only the spatial connections between neurons but also the timescales of neural activity that act as important mechanisms leading to functional hierarchy in neural systems.
TL;DR: A novel method to address the problem of estimating the number of people in surveillance scenes with people gathering and waiting by combining a MID based foreground segmentation algorithm and a HOG based head-shoulder detection algorithm to provide an accurate estimation of people counts in the observed area.
Abstract: This paper proposes a novel method to address the problem of estimating the number of people in surveillance scenes with people gathering and waiting. The proposed method combines a MID (mosaic image difference) based foreground segmentation algorithm and a HOG (histograms of oriented gradients) based head-shoulder detection algorithm to provide an accurate estimation of people counts in the observed area. In our framework, the MID-based foreground segmentation module provides active areas for the head-shoulder detection module to detect heads and count the number of people. Numerous experiments are conducted and convincing results demonstrate the effectiveness of our method.
TL;DR: In this article, the authors demonstrate the utility of dendrograms at representing the essential features of the hierarchical structure of the isosurfaces for molecular line data cubes and demonstrate that constructing the dendrogram of CO ($J = 1 \to 0$) emission from the Orion-Monoceros region allows for the identification of giant molecular clouds in a blended molecular line data set using only a physically motivated definition (self-gravitating clouds with masses $> 5 \times 10^4\,M_\odot$).
Abstract: We demonstrate the utility of dendrograms at representing the essential features of the hierarchical structure of the isosurfaces for molecular line data cubes. The dendrogram of a data cube is an abstraction of the changing topology of the isosurfaces as a function of contour level. The ability to track hierarchical structure over a range of scales makes this analysis philosophically different from local segmentation algorithms like CLUMPFIND. Points in the dendrogram structure correspond to specific volumes in data cubes defined by their bounding isosurfaces. We further refine the technique by measuring the properties associated with each isosurface in the analysis allowing for a multiscale calculation of molecular gas properties. Using COMPLETE $^{13}$CO ($J = 1 \to 0$) data from the L1448 region in Perseus and mock observations of a simulated data cube, we identify regions that have a significant contribution by self-gravity to their energetics on a range of scales. We find evidence for self-gravitation on all spatial scales in L1448, although not in all regions. In the simulated observations, nearly all of the emission is found in objects that would be self-gravitating if gravity were included in the simulation. We reconstruct the size-line-width relationship within the data cube using the dendrogram-derived properties and find it follows the standard relation: $\sigma_v \propto R^{0.58}$. Finally, we show that constructing the dendrogram of CO ($J = 1 \to 0$) emission from the Orion-Monoceros region allows for the identification of giant molecular clouds in a blended molecular line data set using only a physically motivated definition (self-gravitating clouds with masses $> 5 \times 10^4\,M_\odot$).
TL;DR: A Bayesian formulation for incorporating soft model assignments into the calculation of affinities is presented; the resulting model-aware affinities are integrated into the multilevel segmentation by weighted aggregation algorithm and applied to the task of detecting and segmenting brain tumor and edema in multichannel magnetic resonance (MR) volumes.
Abstract: We present a new method for automatic segmentation of heterogeneous image data that takes a step toward bridging the gap between bottom-up affinity-based segmentation methods and top-down generative model based approaches. The main contribution of the paper is a Bayesian formulation for incorporating soft model assignments into the calculation of affinities, which are conventionally model free. We integrate the resulting model-aware affinities into the multilevel segmentation by weighted aggregation algorithm, and apply the technique to the task of detecting and segmenting brain tumor and edema in multichannel magnetic resonance (MR) volumes. The computationally efficient method runs orders of magnitude faster than current state-of-the-art techniques giving comparable or improved results. Our quantitative results indicate the benefit of incorporating model-aware affinities into the segmentation process for the difficult case of glioblastoma multiforme brain tumor.
TL;DR: An automatic method for delineating the prostate in three-dimensional magnetic resonance scans is presented, based on nonrigid registration of a set of prelabeled atlas images, and the segmentation quality is especially good at the prostate-rectum interface.
Abstract: An automatic method for delineating the prostate (including the seminal vesicles) in three-dimensional magnetic resonance scans is presented. The method is based on nonrigid registration of a set of prelabeled atlas images. Each atlas image is nonrigidly registered with the target patient image. Subsequently, the deformed atlas label images are fused to yield a single segmentation of the patient image. The proposed method is evaluated on 50 clinical scans, which were manually segmented by three experts. The Dice similarity coefficient (DSC) is used to quantify the overlap between the automatic and manual segmentations. We investigate the impact of several factors on the performance of the segmentation method. For the registration, two similarity measures are compared: mutual information and a localized version of mutual information. The latter turns out to be superior (median ΔDSC ≈ 0.02, p 0.05). To assess the influence of the atlas composition, two atlas sets are compared. The first set consists of 38 scans of healthy volunteers. The second set is constructed by a leave-one-out approach using the 50 clinical scans that are used for evaluation. The second atlas set gives substantially better performance (ΔDSC = 0.04, p < 0.01), stressing the importance of a careful atlas definition. With the best settings, a median DSC of around 0.85 is achieved, which is close to the median interobserver DSC of 0.87. The segmentation quality is especially good at the prostate-rectum interface, where the segmentation error remains below 1 mm in 50% of the cases and below 1.5 mm in 75% of the cases.
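The Dice similarity coefficient used above is DSC = 2|A ∩ B| / (|A| + |B|) for two segmentations A and B. A minimal sketch follows, assuming each segmentation is given as a set of voxel coordinates. The toy voxel sets are illustrative only, and returning 1.0 when both sets are empty is a convention chosen here, not something the abstract specifies.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # convention (assumption): two empty segmentations agree
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical automatic and manual segmentations sharing two of three voxels.
automatic = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
manual = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
dsc = dice_coefficient(automatic, manual)
```

Here the two sets share 2 voxels out of 3 each, so the coefficient is 2·2 / (3 + 3) ≈ 0.67.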
TL;DR: The model-based approach for the fully automatic segmentation of the whole heart (four chambers, myocardium, and great vessels) from 3-D CT images shows better interphase and interpatient shape variability characterization than commonly used principal component analysis.
Abstract: Automatic image processing methods are a pre-requisite to efficiently analyze the large amount of image data produced by computed tomography (CT) scanners during cardiac exams. This paper introduces a model-based approach for the fully automatic segmentation of the whole heart (four chambers, myocardium, and great vessels) from 3-D CT images. Model adaptation is done by progressively increasing the degrees-of-freedom of the allowed deformations. This improves convergence as well as segmentation accuracy. The heart is first localized in the image using a 3-D implementation of the generalized Hough transform. Pose misalignment is corrected by matching the model to the image making use of a global similarity transformation. The complex initialization of the multicompartment mesh is then addressed by assigning an affine transformation to each anatomical region of the model. Finally, a deformable adaptation is performed to accurately match the boundaries of the patient's anatomy. A mean surface-to-surface error of 0.82 mm was measured in a leave-one-out quantitative validation carried out on 28 images. Moreover, the piecewise affine transformation introduced for mesh initialization and adaptation shows better interphase and interpatient shape variability characterization than commonly used principal component analysis.
TL;DR: A method is developed, based on established algorithms, for automatic segmentation of young children's brains into 83 regions of interest (ROIs), and applied to an exemplar group of 33 2-year-old subjects who had been born prematurely.
TL;DR: The utility of the glandular and nuclear segmentation algorithm in accurate extraction of various morphological and nuclear features is demonstrated for automated grading of prostate cancer and breast cancer, and for distinguishing between cancerous and benign breast histology specimens.
Abstract: Automated detection and segmentation of nuclear and glandular structures is critical for classification and grading of prostate and breast cancer histopathology. In this paper, we present a methodology for automated detection and segmentation of structures of interest in digitized histopathology images. The scheme integrates image information from across three different scales: (1) low-level information based on pixel values, (2) high-level information based on relationships between pixels for object detection, and (3) domain-specific information based on relationships between histological structures. Low-level information is utilized by a Bayesian classifier to generate a likelihood that each pixel belongs to an object of interest. High-level information is extracted in two ways: (i) by a level-set algorithm, where a contour is evolved in the likelihood scenes generated by the Bayesian classifier to identify object boundaries, and (ii) by a template matching algorithm, where shape models are used to identify glands and nuclei from the low-level likelihood scenes. Structural constraints are imposed via domain-specific knowledge in order to verify whether the detected objects do indeed belong to structures of interest. In this paper we demonstrate the utility of our glandular and nuclear segmentation algorithm in accurate extraction of various morphological and nuclear features for automated grading of (a) prostate cancer, (b) breast cancer, and (c) distinguishing between cancerous and benign breast histology specimens. The efficacy of our segmentation algorithm is evaluated by comparing breast and prostate cancer grading and benign vs. cancer discrimination accuracies with corresponding accuracies obtained via manual detection and segmentation of glands and nuclei.
TL;DR: This paper shows how to implement a star shape prior into graph cut segmentation, a generic shape prior that applies to a wide class of objects, in particular to convex objects, and shows that in many cases, it can achieve an accurate object segmentation with only a single pixel, the center of the object, provided by the user, which is rarely possible with standard graph cut interactive segmentation.
Abstract: In recent years, segmentation with graph cuts has been increasingly used for a variety of applications, such as photo/video editing, medical image processing, etc. One of the most common applications of graph cut segmentation is extracting an object of interest from its background. If there is any knowledge about the object shape (i.e., a shape prior), incorporating this knowledge helps to achieve a more robust segmentation. In this paper, we show how to implement a star shape prior into graph cut segmentation. This is a generic shape prior, i.e., it is not specific to any particular object, but rather applies to a wide class of objects, in particular to convex objects. Our major assumption is that the center of the star shape is known; for example, it can be provided by the user. The star shape prior has an additional important benefit: it allows the inclusion of a term in the objective function which encourages a longer object boundary. This helps to alleviate the bias of a graph cut towards shorter segmentation boundaries. In fact, we show that in many cases, with this new term we can achieve an accurate object segmentation with only a single pixel, the center of the object, provided by the user, which is rarely possible with standard graph cut interactive segmentation.
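To make the star shape property concrete: a shape is star-shaped with respect to a center if every point on the straight segment from the center to any object point also lies in the object. The sketch below only checks that property on a binary mask, using Bresenham lines as the discretization (an assumption made here). It does not implement the graph cut formulation from the paper, and the function names are hypothetical.

```python
def line_pixels(r0, c0, r1, c1):
    """Integer pixels on the segment from (r0, c0) to (r1, c1) (Bresenham)."""
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    step_r = 1 if r1 >= r0 else -1
    step_c = 1 if c1 >= c0 else -1
    err = dr - dc
    r, c = r0, c0
    pixels = [(r, c)]
    while (r, c) != (r1, c1):
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += step_r
        if e2 < dr:
            err += dr
            c += step_c
        pixels.append((r, c))
    return pixels


def is_star_shaped(mask, center):
    """True if every foreground pixel sees the center through foreground only."""
    cr, cc = center
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value and not all(
                mask[pr][pc] for pr, pc in line_pixels(cr, cc, r, c)
            ):
                return False
    return True


# Hypothetical 5x5 masks: a diamond, and the same diamond with one pixel removed.
diamond = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
pierced = [row[:] for row in diamond]
pierced[1][2] = 0  # the top pixel (0, 2) no longer sees the center (2, 2)
```

Convex shapes such as the diamond pass the check for any interior center, while removing a pixel between the center and the boundary breaks the property.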
TL;DR: This work augments the generative model used for combined segmentation and normalization of images, with an empirical prior for an atypical tissue class, which can be optimised iteratively and adopts a fuzzy clustering procedure to identify outlier voxels in normalised gray and white matter segments.
TL;DR: In this article, an object-oriented approach for analyzing and characterizing the urban landscape structure at the parcel level using high-resolution digital aerial imagery and LIght Detection and Ranging (LIDAR) data was presented.
Abstract: This paper presents an object-oriented approach for analysing and characterizing the urban landscape structure at the parcel level using high-resolution digital aerial imagery and LIght Detection and Ranging (LIDAR) data. Additional spatial datasets including property parcel boundaries and building footprints were used to both facilitate object segmentation and obtain greater classification accuracy. The study area is the Gwynns Falls watershed, which includes portions of Baltimore City and Baltimore County, MD. A three-level hierarchical network of image objects was generated, and objects were classified. At the two lower levels, objects were classified into five classes: building, pavement, bare soil, fine textured vegetation, and coarse textured vegetation. The object-oriented classification approach proved to be effective for urban land cover classification. The overall accuracy of the classification was 92.3%, and the overall Kappa statistic was 0.899. Land cover proportions as well as vegetation characteristics were then summarized by property parcel. This exercise resulted in a knowledge base of rules for urban land cover classification, which could potentially be applied to other urban areas.
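Overall accuracy and the Kappa statistic reported above are both derived from a confusion matrix. A minimal sketch of the standard Cohen's kappa computation follows, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The two-class matrix is a made-up example, not the paper's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (100 samples in total).
accuracy, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

In this toy case the observed agreement is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7.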
TL;DR: The two-dimensional (2D) Otsu method performs better than the one-dimensional (1D) method when segmenting images with a low signal-to-noise ratio, but it gives satisfactory results only when the numbers of pixels in each class are close to each other.
Abstract: Image segmentation plays an important role in image analysis and computer vision systems. Among all segmentation techniques, automatic thresholding methods are widely used because they are simple to implement and fast. The Otsu method is one such thresholding method and is frequently used in various fields. The two-dimensional (2D) Otsu method performs better than the one-dimensional (1D) method when segmenting images with a low signal-to-noise ratio. However, it gives satisfactory results only when the numbers of pixels in each class are close to each other; otherwise, it gives improper results. In this paper, 2D histogram projection is used to correct the Otsu threshold. The 1D histograms are obtained by projecting the 2D histogram onto the x and y axes, and a fast algorithm based on the wavelet transform is proposed for searching the extrema of the projected histograms. Experimental results show that the proposed method performs better than the traditional Otsu method for our renal biopsy samples.
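For reference, the traditional 1D Otsu method that serves as the baseline here picks the threshold that maximizes the between-class variance of the grey-level histogram. The sketch below covers only that 1D baseline. The 2D variant, the histogram projection, and the wavelet-based extrema search are not shown, and the toy histogram is illustrative.

```python
def otsu_threshold(histogram):
    """Return the bin index t that maximizes between-class variance.

    Pixels with value <= t form one class; the rest form the other.
    """
    total = sum(histogram)
    weighted_total = sum(i * h for i, h in enumerate(histogram))
    best_t, best_score = 0, -1.0
    count_low, sum_low = 0, 0
    for t, h in enumerate(histogram):
        count_low += h
        sum_low += t * h
        count_high = total - count_low
        if count_low == 0:
            continue  # no pixels in the low class yet
        if count_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / count_low
        mean_high = (weighted_total - sum_low) / count_high
        # Proportional to the between-class variance w0 * w1 * (mu0 - mu1)^2.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical bimodal histogram with 10 grey levels.
toy_histogram = [0, 10, 10, 0, 0, 0, 0, 10, 10, 0]
threshold = otsu_threshold(toy_histogram)
```

With two equally populated modes, as in this toy histogram, the threshold falls in the gap between them. The imbalance problem mentioned in the abstract arises when one class has far more pixels than the other.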
TL;DR: The robustness of the proposed method can be seen in the high percentage of images obtained with a discrepancy δ < 5, and the results confirm the hypothesis that the ONH contour can be properly approached with a non-deformable ellipse.
TL;DR: An efficient algorithm for segmenting different types of pulmonary nodules including high and low contrast nodules, nodules with vasculature attachment, and nodules in the close vicinity of the lung wall or diaphragm is presented.
Abstract: This paper presents an efficient algorithm for segmenting different types of pulmonary nodules including high and low contrast nodules, nodules with vasculature attachment, and nodules in the close vicinity of the lung wall or diaphragm. The algorithm performs an adaptive sphericity oriented contrast region growing on the fuzzy connectivity map of the object of interest. This region growing is operated within a volumetric mask which is created by first applying a local adaptive segmentation algorithm that identifies foreground and background regions within a certain window size. The foreground objects are then filled to remove any holes, and a spatial connectivity map is generated to create a 3-D mask. The mask is then enlarged to contain the background while excluding unwanted foreground regions. Apart from generating a confined search volume, the mask is also used to estimate the parameters for the subsequent region growing, as well as for repositioning the seed point in order to ensure reproducibility. The method was run on 815 pulmonary nodules. By using randomly placed seed points, the approach was shown to be fully reproducible. As for acceptability, the segmentation results were visually inspected by a qualified radiologist to search for any gross missegmentation. 84% of the first results of the segmentation were accepted by the radiologist, while for the remaining 16% of nodules, alternative segmentation solutions that were provided by the method were selected.
TL;DR: A comparative study reviewing eight different deformable contour methods (DCMs), covering snakes and level set methods applied to medical image segmentation, is presented, highlighting both the strengths and limitations of these methods.
TL;DR: A novel Bayesian approach to unsupervised topic segmentation is described, showing that lexical cohesion can be placed in a Bayesian context by modeling the words in each topic segment as draws from a multinomial language model associated with the segment; maximizing the observation likelihood in such a model yields a lexically-cohesive segmentation.
Abstract: This paper describes a novel Bayesian approach to unsupervised topic segmentation. Unsupervised systems for this task are driven by lexical cohesion: the tendency of well-formed segments to induce a compact and consistent lexical distribution. We show that lexical cohesion can be placed in a Bayesian context by modeling the words in each topic segment as draws from a multinomial language model associated with the segment; maximizing the observation likelihood in such a model yields a lexically-cohesive segmentation. This contrasts with previous approaches, which relied on hand-crafted cohesion metrics. The Bayesian framework provides a principled way to incorporate additional features such as cue phrases, a powerful indicator of discourse structure that has not been previously used in unsupervised segmentation systems. Our model yields consistent improvements over an array of state-of-the-art systems on both text and speech datasets. We also show that both an entropy-based analysis and a well-known previous technique can be derived as special cases of the Bayesian framework.
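The idea that maximizing the observation likelihood yields a lexically cohesive segmentation can be sketched as follows. Each segment gets its own multinomial language model, and a candidate segmentation is scored by the total log-likelihood of its words. This sketch uses an add-alpha smoothed point estimate per segment, which is a simplifying assumption and not the paper's exact Bayesian treatment. It should only be used to compare segmentations with the same number of segments. The toy document and function names are hypothetical.

```python
import math
from collections import Counter


def segment_log_likelihood(sentences, vocab_size, alpha):
    """Log-likelihood of a segment's words under its own smoothed multinomial."""
    counts = Counter(word for sentence in sentences for word in sentence)
    denom = sum(counts.values()) + alpha * vocab_size
    return sum(n * math.log((n + alpha) / denom) for n in counts.values())


def segmentation_score(sentences, boundaries, alpha=0.1):
    """Total log-likelihood; boundaries are sentence indices that start a segment."""
    vocab_size = len({word for sentence in sentences for word in sentence})
    edges = [0] + list(boundaries) + [len(sentences)]
    return sum(
        segment_log_likelihood(sentences[a:b], vocab_size, alpha)
        for a, b in zip(edges, edges[1:])
    )


# Hypothetical four-sentence document that changes topic after sentence 2.
document = [
    ["cat", "purrs", "cat"],
    ["cat", "sleeps", "cat"],
    ["stock", "market", "stock"],
    ["market", "falls", "stock"],
]
best_boundary = max([1, 2, 3], key=lambda b: segmentation_score(document, [b]))
```

Placing the boundary at the topic change keeps each segment's word distribution compact, so that split receives the highest score among the three single-boundary candidates.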
TL;DR: A robust subspace separation scheme that can deal with all of these practical issues in a unified framework and draw strong connections between lossy compression, rank minimization, and sparse representation is developed.
Abstract: We examine the problem of segmenting tracked feature point trajectories of multiple moving objects in an image sequence. Using the affine camera model, this motion segmentation problem can be cast as the problem of segmenting samples drawn from a union of linear subspaces. Due to limitations of the tracker, occlusions and the presence of nonrigid objects in the scene, the obtained motion trajectories may contain grossly mistracked features, missing entries, or not correspond to any valid motion model. In this paper, we develop a robust subspace separation scheme that can deal with all of these practical issues in a unified framework. Our methods draw strong connections between lossy compression, rank minimization, and sparse representation. We test our methods extensively and compare their performance to several extant methods with experiments on the Hopkins 155 database. Our results are on par with state-of-the-art results, and in many cases exceed them. All MATLAB code and segmentation results are publicly available for peer evaluation at http://perception.csl.uiuc.edu/coding/motion/.
TL;DR: Novel methods for automatic object detection in high-resolution images by combining spectral information with structural information exploited by using image segmentation are presented.
Abstract: The object-based analysis of remotely sensed imagery provides valuable spatial and structural information that is complementary to pixel-based spectral information in classification. In this paper, we present novel methods for automatic object detection in high-resolution images by combining spectral information with structural information exploited by using image segmentation. The proposed segmentation algorithm uses morphological operations applied to individual spectral bands using structuring elements in increasing sizes. These operations produce a set of connected components forming a hierarchy of segments for each band. A generic algorithm is designed to select meaningful segments that maximize a measure consisting of spectral homogeneity and neighborhood connectivity. Given the observation that different structures appear more clearly at different scales in different spectral bands, we describe a new algorithm for unsupervised grouping of candidate segments belonging to multiple hierarchical segmentations to find coherent sets of segments that correspond to actual objects. The segments are modeled by using their spectral and textural content, and the grouping problem is solved by using the probabilistic latent semantic analysis algorithm that builds object models by learning the object-conditional probability distributions. The automatic labeling of a segment is done by computing the similarity of its feature distribution to the distribution of the learned object models using the Kullback-Leibler divergence. The performances of the unsupervised segmentation and object detection algorithms are evaluated qualitatively and quantitatively using three different data sets with comparative experiments, and the results show that the proposed methods are able to automatically detect, group, and label segments belonging to the same object classes.
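The final labeling step, which compares a segment's feature distribution with learned object models using the Kullback-Leibler divergence, can be sketched as follows. The direction of the divergence (segment relative to model), the smoothing constant `eps`, and the two object models with their histogram values are assumptions for illustration and are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def label_segment(segment_hist, object_models):
    """Return the name of the object model closest to the segment in KL terms."""
    return min(object_models,
               key=lambda name: kl_divergence(segment_hist, object_models[name]))


# Hypothetical object models over three quantized feature bins.
object_models = {
    "building": [0.7, 0.2, 0.1],
    "vegetation": [0.1, 0.2, 0.7],
}
segment_hist = [0.6, 0.3, 0.1]
label = label_segment(segment_hist, object_models)
print(label)
```

Because the divergence is asymmetric, swapping the arguments can change the ranking when distributions are far apart, so the direction should match whatever the actual method specifies.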
TL;DR: A new, simple, and efficient segmentation approach, based on a fusion procedure which aims at combining several segmentation maps associated with simpler partition models in order to finally get a more reliable and accurate segmentation result.
Abstract: This paper presents a new, simple, and efficient segmentation approach based on a fusion procedure that combines several segmentation maps associated with simpler partition models in order to obtain a more reliable and accurate segmentation result. The different label fields to be fused in our application are given by the same simple (K-means based) clustering technique applied to an input image expressed in different color spaces. Our fusion strategy combines these segmentation maps with a final clustering procedure that uses as input features the local histograms of the class labels, previously estimated and associated with each site for all of these initial partitions. This fusion framework remains simple to implement, fast, general enough to be applied to various computer vision applications (e.g., motion detection and segmentation), and has been successfully applied to the Berkeley image database. The experiments reported in this paper illustrate the potential of this approach compared to state-of-the-art segmentation methods recently proposed in the literature.
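The input features of the fusion step can be sketched as follows: for each pixel, a normalized histogram of class labels is computed in a small window of every initial label map, and the histograms are concatenated. The square window, its `radius`, the function names, and the two toy label maps are assumptions for illustration. The final clustering over these feature vectors is not shown.

```python
def local_label_histogram(label_map, row, col, n_labels, radius=1):
    """Normalized histogram of labels in a square window around (row, col).

    The window is clipped at the image border.
    """
    rows, cols = len(label_map), len(label_map[0])
    hist = [0] * n_labels
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            hist[label_map[r][c]] += 1
    total = sum(hist)
    return [h / total for h in hist]


def fusion_features(label_maps, row, col, n_labels, radius=1):
    """Concatenate the local label histograms from every initial partition."""
    features = []
    for label_map in label_maps:
        features.extend(local_label_histogram(label_map, row, col, n_labels, radius))
    return features


# Two toy 3x3 label maps, standing in for K-means results in two color spaces.
map_a = [[0, 0, 1],
         [0, 0, 1],
         [0, 0, 1]]
map_b = [[0, 1, 1],
         [0, 1, 1],
         [0, 1, 1]]
center_features = fusion_features([map_a, map_b], 1, 1, n_labels=2)
```

Each pixel thus receives one feature vector of length (number of maps) x (number of labels), which a final clustering pass could then group into the fused segmentation.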
TL;DR: A modified FCM algorithm (called mFCM later) for MRI brain image segmentation is presented, realized by incorporating the spatial neighborhood information into the standard FCM algorithm and modifying the membership weighting of each cluster.
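For reference, the standard FCM baseline that mFCM modifies alternates between a membership update and a center update. The sketch below shows only that standard algorithm on scalar intensities. The spatial neighborhood term and the modified membership weighting of mFCM are not reproduced because the summary does not specify them. The fuzzifier `m=2.0`, the iteration count, the function names, and the toy intensities are assumptions for illustration.

```python
def fcm_memberships(values, centers, m=2.0):
    """Standard FCM membership update: u_i = 1 / sum_k (d_i / d_k)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with one or more centers: share membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                   for d_i in dists]
        memberships.append(row)
    return memberships


def fcm_centers(values, memberships, n_clusters, m=2.0):
    """Standard FCM center update: mean of the values weighted by u^m."""
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
    return centers


def fcm(values, centers, m=2.0, n_iter=20):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(n_iter):
        memberships = fcm_memberships(values, centers, m)
        centers = fcm_centers(values, memberships, len(centers), m)
    return centers, fcm_memberships(values, centers, m)


# Toy intensities with a dark group and a bright group (made-up values).
intensities = [10, 12, 11, 200, 205, 198]
final_centers, final_memberships = fcm(intensities, [0.0, 255.0])
```

Unlike hard K-means, every intensity keeps a graded membership in every cluster, and those memberships are what a spatially aware variant would adjust using neighboring pixels.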
TL;DR: The proposed IRGS method provides the possibility of building a hierarchical representation of the image content, and allows various region features and even domain knowledge to be incorporated in the segmentation process.
Abstract: This paper proposes an image segmentation method named iterative region growing using semantics (IRGS), which is characterized by two aspects. First, it uses graduated increased edge penalty (GIEP) functions within the traditional Markov random field (MRF) context model in formulating the objective functions. Second, IRGS uses a region growing technique in searching for the solutions to these objective functions. The proposed IRGS is an improvement over traditional MRF based approaches in that the edge strength information is utilized and a more stable estimation of model parameters is achieved. Moreover, the IRGS method provides the possibility of building a hierarchical representation of the image content, and allows various region features and even domain knowledge to be incorporated in the segmentation process. The algorithm has been successfully tested on several artificial images and synthetic aperture radar (SAR) images.
TL;DR: In this article, the authors demonstrate that Factor-Cluster segmentation is not generally the best procedure to identify homogeneous groups of individuals (market segments) in the tourism industry.
Abstract: The concept of market segmentation has been widely accepted and warmly embraced by both the tourism industry and academia. In tourism research, this increased interest in segmentation studies has led to the emergence of a standard research approach. Most notably, a concept referred to as “factor–cluster segmentation” has been broadly adopted. The aim of this article is to demonstrate that this approach is not generally the best procedure to identify homogeneous groups of individuals (market segments).