TL;DR: The methods and software engineering philosophy behind this new tool, ITK-SNAP, are described and the results of validation experiments performed in the context of an ongoing child autism neuroimaging study are provided, finding that SNAP is a highly reliable and efficient alternative to manual tracing.
TL;DR: In this paper, a method for automated segmentation of the vasculature in retinal images is presented, which produces segmentations by classifying each image pixel as vessel or non-vessel, based on the pixel's feature vector.
Abstract: We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or nonvessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and two-dimensional Gabor wavelet transform responses taken at multiple scales. The Gabor wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding a fast classification, while being able to model complex decision surfaces. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on the publicly available DRIVE (Staal et al., 2004) and STARE (Hoover et al., 2000) databases of manually labeled images. On the DRIVE database, it achieves an area under the receiver operating characteristic curve of 0.9614, slightly superior to that reported for state-of-the-art approaches. We are making our implementation available as open source MATLAB scripts for researchers interested in implementation details, evaluation, or development of methods.
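For readers unfamiliar with the evaluation figure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from per-pixel classifier scores. It uses the standard rank interpretation (the probability that a randomly chosen vessel pixel scores higher than a randomly chosen non-vessel pixel, with ties counted as one half). The function name and the toy scores are assumptions for illustration and are not taken from the paper or its MATLAB scripts.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: classifier outputs, higher means "more likely vessel".
    labels: 1 for vessel pixels, 0 for non-vessel pixels.
    Quadratic in the number of pixels, so only suitable for small examples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A positive/negative pair counts 1 if ranked correctly, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for four pixels.
    print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

A value of 1.0 means every vessel pixel outranks every non-vessel pixel, and 0.5 corresponds to chance-level ranking.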
TL;DR: A new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently, is proposed, which is used for automatic visual recognition and semantic segmentation of photographs.
Abstract: This paper proposes a new approach to learning a discriminative model of object classes, incorporating appearance, shape and context information efficiently. The learned model is used for automatic visual recognition and semantic segmentation of photographs. Our discriminative model exploits novel features, based on textons, which jointly model shape and texture. Unary classification and feature selection are achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating these classifiers in a conditional random field. Efficient training of the model on very large datasets is achieved by exploiting both random feature selection and piecewise training methods.
High classification and segmentation accuracy are demonstrated on three different databases: i) our own 21-object class database of photographs of real objects viewed under general lighting conditions, poses and viewpoints, ii) the 7-class Corel subset and iii) the 7-class Sowerby database used in [1]. The proposed algorithm gives competitive results for highly textured (e.g. grass, trees), highly structured (e.g. cars, faces, bikes, aeroplanes) and articulated objects (e.g. body, cow).
TL;DR: This paper reviews ultrasound segmentation methods, in a broad sense, focusing on techniques developed for medical B-mode ultrasound images, and presents a classification of methodology in terms of use of prior information.
Abstract: This paper reviews ultrasound segmentation methods, in a broad sense, focusing on techniques developed for medical B-mode ultrasound images. First, we present a review of articles by clinical application to highlight the approaches that have been investigated and the degree of validation that has been done in different clinical domains. Then, we present a classification of methodology in terms of use of prior information. We conclude by selecting ten papers which have presented original ideas that have demonstrated particular clinical usefulness or potential specific to the ultrasound segmentation problem.
TL;DR: It is shown how certain nonconvex optimization problems that arise in image processing and computer vision can be restated as convex minimization problems, which allows, in particular, the finding of global minimizers via standard convex minimization schemes.
Abstract: We show how certain nonconvex optimization problems that arise in image processing and computer vision can be restated as convex minimization problems. This allows, in particular, the finding of global minimizers via standard convex minimization schemes.
TL;DR: A practicable procedure is demonstrated that exceeds the accuracy of previous automatic methods and can compete with manual delineations in segmentation propagation and decision fusion of fused brain segmentations.
TL;DR: Common overlap measures are generalized to measure the total overlap of ensembles of labels defined on multiple test images and account for fractional labels using fuzzy set theory to allow a single "figure-of-merit" to be reported which summarises the results of a complex experiment by image pair, by label or overall.
Abstract: Measures of overlap of labelled regions of images, such as the Dice and Tanimoto coefficients, have been extensively used to evaluate image registration and segmentation algorithms. Modern studies can include multiple labels defined on multiple images, yet most evaluation schemes report one overlap per labelled region, simply averaged over multiple images. In this paper, common overlap measures are generalized to measure the total overlap of ensembles of labels defined on multiple test images and account for fractional labels using fuzzy set theory. This framework allows a single "figure-of-merit" to be reported which summarises the results of a complex experiment by image pair, by label or overall. A complementary measure of error, the overlap distance, is defined which captures the spatial extent of the nonoverlapping part and is related to the Hausdorff distance computed on grey level images. The generalized overlap measures are validated on synthetic images for which the overlap can be computed analytically and used as similarity measures in nonrigid registration of three-dimensional magnetic resonance imaging (MRI) brain images. Finally, a pragmatic segmentation ground truth is constructed by registering a magnetic resonance atlas brain to 20 individual scans, and used with the overlap measures to evaluate publicly available brain segmentation algorithms.
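As a rough illustration of the overlap measures named in this abstract, the sketch below computes the Dice and Tanimoto coefficients for binary label maps, plus a fuzzy variant in which the minimum plays the role of intersection and the maximum the role of union. The function names and the flat-list representation of label maps are assumptions made for this example; the paper's own generalization over multiple labels and images is not reproduced here.

```python
def dice(a, b):
    """Dice coefficient of two binary label maps given as flat 0/1 lists."""
    inter = sum(x * y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty maps are treated as a perfect match (an assumed convention).
    return 2.0 * inter / total if total else 1.0


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient: |A and B| / |A or B| for 0/1 lists."""
    inter = sum(x * y for x, y in zip(a, b))
    union = sum(a) + sum(b) - inter
    return inter / union if union else 1.0


def fuzzy_tanimoto(a, b):
    """Tanimoto overlap for fractional labels in [0, 1].

    Fuzzy set theory replaces intersection by min and union by max,
    so this reduces to tanimoto() when all values are 0 or 1.
    """
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return num / den if den else 1.0


if __name__ == "__main__":
    seg_a = [1, 1, 0, 0]
    seg_b = [1, 0, 1, 0]
    print(dice(seg_a, seg_b), tanimoto(seg_a, seg_b))
    print(fuzzy_tanimoto([0.5, 1.0], [1.0, 0.5]))
```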
TL;DR: An optimal surface detection method is developed that can simultaneously detect multiple interacting surfaces; optimality is controlled by the cost functions designed for individual surfaces and by several geometric constraints defining surface smoothness and interrelations.
Abstract: Efficient segmentation of globally optimal surfaces representing object boundaries in volumetric data sets is important and challenging in many medical image analysis applications. We have developed an optimal surface detection method capable of simultaneously detecting multiple interacting surfaces, in which the optimality is controlled by the cost functions designed for individual surfaces and by several geometric constraints defining the surface smoothness and interrelations. The method solves the surface segmentation problem by transforming it into computing a minimum s-t cut in a derived arc-weighted directed graph. The proposed algorithm has a low-order polynomial time complexity and is computationally efficient. It has been extensively validated on more than 300 computer-synthetic volumetric images, 72 CT-scanned data sets of different-sized plexiglas tubes, and tens of medical images spanning various imaging modalities. In all cases, the approach yielded highly accurate results. Our approach can be readily extended to higher-dimensional image segmentation.
TL;DR: A novel 3D model-based algorithm is presented which performs viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions automatically and efficiently and is superior in terms of recognition rate and efficiency.
Abstract: Viewpoint independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments were performed on real and synthetic data comprising 55 models and 610 scenes, and an overall recognition rate of 95 percent was achieved. Comparison with spin images revealed that our algorithm is superior in terms of recognition rate and efficiency.
TL;DR: It is demonstrated that the standard active appearance model scheme performs poorly, but large improvements can be obtained by including areas outside the objects into the model, and a parameter optimization for active shape models is presented.
TL;DR: In this paper, an automated classification system of landform elements based on object-oriented image analysis is presented, which can be used for almost any application where relationships between topographic features and other components of landscapes are to be assessed.
TL;DR: This work addresses the drawbacks of the conventional watershed algorithm when it is applied to medical images by using k-means clustering to produce a primary segmentation of the image before the authors apply their improved watershed segmentation algorithm to it.
Abstract: We propose a methodology that combines k-means clustering with an improved watershed segmentation algorithm for medical image segmentation. The use of the conventional watershed algorithm for medical image analysis is widespread because of its advantages, such as always being able to produce a complete division of the image. However, its drawbacks include over-segmentation and sensitivity to false edges. We address the drawbacks of the conventional watershed algorithm when it is applied to medical images by using k-means clustering to produce a primary segmentation of the image before we apply our improved watershed segmentation algorithm to it. The k-means clustering is an unsupervised learning algorithm, while the improved watershed segmentation algorithm makes use of automated thresholding on the gradient magnitude map and post-segmentation merging on the initial partitions to reduce the number of false edges and over-segmentation. By comparing the number of partitions in the segmentation maps of 50 images, we showed that our proposed methodology produced segmentation maps which have 92% fewer partitions than the segmentation maps produced by the conventional watershed algorithm.
TL;DR: A framework for evaluating image segmentation algorithms is described, based on three factors (precision, accuracy, and efficiency) that need to be considered for both recognition and delineation.
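The primary segmentation step described above can be pictured with a small sketch of k-means clustering applied to pixel intensities. This is a generic one-dimensional k-means, not the authors' implementation: the function name, the evenly spaced initialisation and the fixed iteration count are all assumptions for illustration, and the improved watershed stage is not shown.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar intensities into k groups with plain k-means.

    Returns (centres, labels), where labels[i] is the cluster index
    assigned to values[i].
    """
    lo, hi = min(values), max(values)
    # Assumed initialisation: centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centre.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    return centres, labels


if __name__ == "__main__":
    # Hypothetical intensities: a dark region and a bright region.
    pixels = [10, 12, 11, 200, 205, 198]
    print(kmeans_1d(pixels, k=2))
```

In a two-dimensional image the same labels would be reshaped back to the image grid to give the coarse partition that a later stage refines.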
TL;DR: A novel marker-controlled watershed based on mathematical morphology is proposed, which can effectively segment clustered cells with less oversegmentation, together with a tracking method based on a modified mean shift algorithm that uses several kernels with adaptive scale, shape, and direction.
Abstract: It is important to observe and study cancer cells' cycle progression in order to better understand drug effects on cancer cells. Time-lapse microscopy imaging serves as an important method to measure the cycle progression of individual cells in a large population. Since manual analysis is unreasonably time consuming for the large volumes of time-lapse image data, automated image analysis is proposed. Existing approaches dealing with time-lapse image data are rather limited and often give inaccurate analysis results, especially in segmenting and tracking individual cells in a cell population. In this paper, we present a new approach to segment and track cell nuclei in time-lapse fluorescence image sequences. First, we propose a novel marker-controlled watershed based on mathematical morphology, which can effectively segment clustered cells with less oversegmentation. To further segment undersegmented cells or to merge oversegmented cells, context information among neighboring frames is employed, which proves to be an effective strategy. Then, we design a tracking method based on a modified mean shift algorithm, in which several kernels with adaptive scale, shape, and direction are designed. Finally, we combine mean shift and a Kalman filter to achieve a more robust cell nuclei tracking method than existing ones. Experimental results show that our method can obtain 98.8% segmentation accuracy, 97.4% cell division tracking accuracy, and 97.6% cell tracking accuracy.
TL;DR: This paper proposes shape dissimilarity measures on the space of level set functions which are analytically invariant under the action of certain transformation groups, and proposes a statistical shape prior which can accurately encode multiple fairly distinct training shapes.
Abstract: In this paper, we make two contributions to the field of level set based image segmentation. Firstly, we propose shape dissimilarity measures on the space of level set functions which are analytically invariant under the action of certain transformation groups. The invariance is obtained by an intrinsic registration of the evolving level set function. In contrast to existing approaches to invariance in the level set framework, this closed-form solution removes the need to iteratively optimize explicit pose parameters. The resulting shape gradient is more accurate in that it takes into account the effect of boundary variation on the object's pose.
Secondly, based on these invariant shape dissimilarity measures, we propose a statistical shape prior which can accurately encode multiple fairly distinct training shapes. This prior constitutes an extension of kernel density estimators to the level set domain. In contrast to the commonly employed Gaussian distribution, such nonparametric density estimators are suited to model arbitrary distributions.
We demonstrate the advantages of this multi-modal shape prior applied to the segmentation and tracking of a partially occluded walking person in a video sequence, and on the segmentation of the left ventricle in cardiac ultrasound images. We give quantitative results on segmentation accuracy and on the dependency of segmentation results on the number of training shapes.
TL;DR: In this paper, a multi-view approach is presented to track people in crowded scenes where people may be partially or completely occluding each other, by using multiple views in synergy so that information from all views is combined to detect objects.
Abstract: Occlusion and lack of visibility in dense crowded scenes make it very difficult to track individual people correctly and consistently. This problem is particularly hard to tackle in single camera systems. We present a multi-view approach to tracking people in crowded scenes where people may be partially or completely occluding each other. Our approach is to use multiple views in synergy so that information from all views is combined to detect objects. To achieve this we present a novel planar homography constraint to resolve occlusions and robustly determine locations on the ground plane corresponding to the feet of the people. To find tracks we obtain feet regions over a window of frames and stack them creating a space time volume. Feet regions belonging to the same person form contiguous spatio-temporal regions that are clustered using a graph cuts segmentation approach. Each cluster is the track of a person and a slice in time of this cluster gives the tracked location. Experimental results are shown in scenes of dense crowds where severe occlusions are quite common. The algorithm is able to accurately track people in all views maintaining correct correspondences across views. Our algorithm is ideally suited for conditions when occlusions between people would seriously hamper tracking performance or if there simply are not enough features to distinguish between different people.
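The planar homography mentioned above relates points on the ground plane across views. As a minimal sketch of the underlying operation only, the code below maps an image point through a 3x3 homography in homogeneous coordinates. The matrix values and function name are hypothetical, and the paper's occlusion-resolving constraint and graph cuts clustering are not reproduced.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography given as nested lists.

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied
    by h, and divided by the third component to return to 2D.
    """
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


if __name__ == "__main__":
    # Hypothetical homography between two camera views of the ground plane.
    h = [[2.0, 0.0, 1.0],
         [0.0, 2.0, -1.0],
         [0.0, 0.0, 2.0]]
    print(apply_homography(h, (3.0, 4.0)))
```

A foot location seen in one view can be warped this way into another view; points that do not lie on the ground plane generally do not map consistently, which is the intuition behind using the ground plane to localise feet.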
TL;DR: The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality, and a representative single performance value is computed from the graphs.
Abstract: Evaluation of object detection algorithms is a non-trivial task: a detection result is usually evaluated by comparing the bounding box of the detected object with the bounding box of the ground truth object. The commonly used precision and recall measures are computed from the overlap area of these two rectangles. However, these measures have several drawbacks: they don't give intuitive information about the proportion of the correctly detected objects and the number of false alarms, and they cannot be accumulated across multiple images without creating ambiguity in their interpretation. Furthermore, quantitative and qualitative evaluation is often mixed resulting in ambiguous measures.
In this paper we propose a new approach which tackles these problems. The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality. In order to compare different detection algorithms, a representative single performance value is computed from the graphs. The influence of the test database on the detection performance is illustrated by performance/generality graphs. The evaluation method can be applied to different types of object detection algorithms. It has been tested on different text detection algorithms, among which are the participants of the ICDAR 2003 text detection competition.
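To make the idea of object-level precision and recall under detection-quality constraints more concrete, here is a simplified sketch. A detected box counts as a match for a ground truth box only if the overlap covers enough of the ground truth box (area recall) and enough of the detected box (area precision). The threshold names and default values, the greedy one-to-one matching, and the function names are assumptions for this example; the paper's actual matching rules and performance graphs are not reproduced.

```python
def area(box):
    """Area of an axis-aligned box (x0, y0, x1, y1); empty boxes give 0."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def intersection(a, b):
    """Area of the overlap between two axis-aligned boxes."""
    return area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))


def object_level_scores(truth, detected, t_recall=0.8, t_precision=0.4):
    """Object-level (precision, recall) with greedy one-to-one matching.

    t_recall and t_precision are hypothetical quality constraints on the
    fraction of the ground truth box and of the detected box that overlap.
    """
    matched_truth, matched_det = set(), set()
    for i, g in enumerate(truth):
        for j, d in enumerate(detected):
            if j in matched_det or area(g) == 0 or area(d) == 0:
                continue
            inter = intersection(g, d)
            if inter / area(g) >= t_recall and inter / area(d) >= t_precision:
                matched_truth.add(i)
                matched_det.add(j)
                break
    recall = len(matched_truth) / len(truth) if truth else 1.0
    precision = len(matched_det) / len(detected) if detected else 1.0
    return precision, recall


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detected = [(0, 0, 10, 12), (50, 50, 60, 60), (21, 21, 24, 24)]
    print(object_level_scores(truth, detected))
```

Sweeping the two thresholds and plotting the resulting scores gives curves in the spirit of the performance graphs the abstract describes.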
TL;DR: A fuzzy c-means (FCM) clustering-based method for the segmentation of breast lesions in three dimensions from contrast-enhanced MR images was shown to be effective and efficient.
TL;DR: This paper describes a planar reflective symmetry transform (PRST) that captures a continuous measure of the reflectional symmetry of a shape with respect to all possible planes and uses the transform to define two new geometric properties, center of symmetry and principal symmetry axes.
Abstract: Symmetry is an important cue for many applications, including object alignment, recognition, and segmentation. In this paper, we describe a planar reflective symmetry transform (PRST) that captures a continuous measure of the reflectional symmetry of a shape with respect to all possible planes. This transform combines and extends previous work that has focused on global symmetries with respect to the center of mass in 3D meshes and local symmetries with respect to points in 2D images. We provide an efficient Monte Carlo sampling algorithm for computing the transform for surfaces and show that it is stable under common transformations. We also provide an iterative refinement algorithm to find local maxima of the transform precisely. We use the transform to define two new geometric properties, center of symmetry and principal symmetry axes, and show that they are useful for aligning objects in a canonical coordinate system. Finally, we demonstrate that the symmetry transform is useful for several applications in computer graphics, including shape matching, segmentation of meshes into parts, and automatic viewpoint selection.
TL;DR: The overall architecture of the perception system is presented, some of the implemented cooperative perception techniques are described, and experimental results on automatic forest fire detection and localization with cooperating UAVs are shown.
TL;DR: This paper describes a novel technique for creating a symmetric average model in which both hemispheres are equally represented and thus left-right comparison is possible.
Abstract: In model-based segmentation, automated region identification is achieved via registration of novel data to a pre-determined model. The desired structure is typically generated via manual tracing within this model. When model-based segmentation is applied to human cortical data, problems arise if left-right comparisons are desired. The asymmetry of the human cortex requires that both left and right models of a structure be composed in order to effectively segment the desired structures. Paradoxically, defining a model in both hemispheres carries a likelihood of introducing bias to one of the structures. This paper describes a novel technique for creating a symmetric average model in which both hemispheres are equally represented and thus left-right comparison is possible. This work is an extension of that proposed by Guimond et al. [1]. Hippocampal segmentation is used as a test case in a cohort of 118 normal elderly subjects and results are compared with expert manual tracing.
TL;DR: This paper presents an original hidden Markov model (HMM) approach for online beat segmentation and classification of electrocardiograms, and the results obtained validate the approach for real world application.
Abstract: This paper presents an original hidden Markov model (HMM) approach for online beat segmentation and classification of electrocardiograms. The HMM framework was chosen because it supports beat detection, segmentation and classification, which makes it well suited to the electrocardiogram (ECG) problem. Our approach addresses a broad range of topics, some of them never studied before in other HMM-related work: waveform modeling, multichannel beat segmentation and classification, and unsupervised adaptation to the patient's ECG. The performance was evaluated on the two-channel QT database in terms of waveform segmentation precision, beat detection and classification. Our waveform segmentation results compare favorably to other systems in the literature. We also obtained high beat detection performance with a sensitivity of 99.79% and a positive predictivity of 99.96%, using a test set of 59 recordings. Moreover, premature ventricular contraction beats were detected using an original classification strategy. The results obtained validate our approach for real world application.
TL;DR: A semi-automatic technique is proposed for modeling plants directly from images; it automates shape recovery while relying on the user to provide simple hints on segmentation, and the resulting model inherits the realistic shape and complexity of a real plant.
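The beat detection figures quoted above are sensitivity and positive predictivity, which follow from counts of true positives (TP), false negatives (FN) and false positives (FP): sensitivity = TP / (TP + FN) and positive predictivity = TP / (TP + FP). The sketch below shows one simple way such counts could be obtained by matching detected beat times to reference annotations. The greedy matching, the 0.15 s tolerance and the function names are assumptions for illustration and are not the paper's evaluation protocol.

```python
def match_beats(reference, detected, tolerance=0.15):
    """Greedily match beat times (seconds) within a tolerance window.

    Returns (tp, fp, fn). Each detected beat can match at most one
    reference beat. The tolerance value is an assumed parameter.
    """
    used = set()
    tp = 0
    for r in reference:
        for i, d in enumerate(detected):
            if i not in used and abs(d - r) <= tolerance:
                used.add(i)
                tp += 1
                break
    fn = len(reference) - tp   # reference beats that were missed
    fp = len(detected) - tp    # detections with no reference beat
    return tp, fp, fn


def sensitivity(tp, fn):
    """Fraction of reference beats that were detected."""
    return tp / (tp + fn)


def positive_predictivity(tp, fp):
    """Fraction of detections that correspond to a reference beat."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical annotation and detection times.
    ref = [1.0, 2.0, 3.0, 4.0]
    det = [1.05, 2.9, 4.02, 5.0]
    tp, fp, fn = match_beats(ref, det)
    print(sensitivity(tp, fn), positive_predictivity(tp, fp))
```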
Abstract: In this paper, we propose a semi-automatic technique for modeling plants directly from images. Our image-based approach has the distinct advantage that the resulting model inherits the realistic shape and complexity of a real plant. We designed our modeling system to be interactive, automating the process of shape recovery while relying on the user to provide simple hints on segmentation. Segmentation is performed in both image and 3D spaces, allowing the user to easily visualize its effect immediately. Using the segmented image and 3D data, the geometry of each leaf is then automatically recovered from the multiple views by fitting a deformable leaf model. Our system also allows the user to easily reconstruct branches in a similar manner. We show realistic reconstructions of a variety of plants, and demonstrate examples of plant editing.
TL;DR: Novel methods to evaluate the performance of object detection algorithms in video sequences are proposed and segmentation algorithms recently proposed are evaluated in order to assess how well they can detect moving regions in an outdoor scene in fixed-camera situations.
Abstract: In this paper, we propose novel methods to evaluate the performance of object detection algorithms in video sequences. This procedure allows us to highlight characteristics (e.g., region splitting or merging) which are specific to the method being used. The proposed framework compares the output of the algorithm with the ground truth and measures the differences according to objective metrics. In this way it is possible to perform a fair comparison among different methods, evaluating their strengths and weaknesses and allowing the user to make a reliable choice of the best method for a specific application. We apply this methodology to recently proposed segmentation algorithms and describe their performance. These methods were evaluated in order to assess how well they can detect moving regions in an outdoor scene in fixed-camera situations.
TL;DR: An automated algorithm for tissue segmentation of noisy, low-contrast magnetic resonance (MR) images of the brain is presented and the applicability of the framework can be extended to diseased brains and neonatal brains.
Abstract: An automated algorithm for tissue segmentation of noisy, low-contrast magnetic resonance (MR) images of the brain is presented. A mixture model composed of a large number of Gaussians is used to represent the brain image. Each tissue is represented by a large number of Gaussian components to capture the complex tissue spatial layout. The intensity of a tissue is considered a global feature and is incorporated into the model through tying of all the related Gaussian parameters. The expectation-maximization (EM) algorithm is utilized to learn the parameter-tied, constrained Gaussian mixture model. An elaborate initialization scheme is suggested to link the set of Gaussians per tissue type, such that each Gaussian in the set has similar intensity characteristics with minimal overlapping spatial supports. Segmentation of the brain image is achieved by the affiliation of each voxel to the component of the model that maximizes the a posteriori probability. The presented algorithm is used to segment three-dimensional, T1-weighted, simulated and real MR images of the brain into three different tissues, under varying noise conditions. Results are compared with state-of-the-art algorithms in the literature. The algorithm does not use an atlas for initialization or parameter learning. Registration processes are therefore not required and the applicability of the framework can be extended to diseased brains and neonatal brains.
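The final labelling step described above, affiliating each voxel with the mixture component of maximum a posteriori probability, can be sketched as follows. For brevity this example uses intensity only, whereas the abstract describes components that also capture spatial layout. The component parameters, the dictionary layout and the function names are hypothetical, and the EM training and parameter tying are not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_tissue(x, components):
    """Return the tissue of the component with the highest posterior for x.

    The posterior of a component is proportional to weight * likelihood,
    so the shared normalising constant can be ignored when taking the max.
    """
    best = max(components, key=lambda c: c["weight"] * gaussian_pdf(x, c["mean"], c["var"]))
    return best["tissue"]


if __name__ == "__main__":
    # Hypothetical intensity model with one component per tissue.
    model = [
        {"tissue": "CSF", "weight": 0.2, "mean": 30.0, "var": 100.0},
        {"tissue": "GM", "weight": 0.4, "mean": 80.0, "var": 100.0},
        {"tissue": "WM", "weight": 0.4, "mean": 120.0, "var": 100.0},
    ]
    print([map_tissue(v, model) for v in (25.0, 85.0, 115.0)])
```

With several components per tissue, as in the abstract, the same rule applies and the voxel simply inherits the tissue label of the winning component.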
TL;DR: In this article, an efficient motion vs non-motion classifier is trained to operate directly and jointly on intensity-change and contrast, and its output is then fused with colour information.
Abstract: This paper presents an algorithm capable of real-time separation of foreground from background in monocular video sequences. Automatic segmentation of layers from colour/contrast or from motion alone is known to be error-prone. Here motion, colour and contrast cues are probabilistically fused together with spatial and temporal priors to infer layers accurately and efficiently. Central to our algorithm is the fact that pixel velocities are not needed, thus removing the need for optical flow estimation, with its tendency to error and computational expense. Instead, an efficient motion vs nonmotion classifier is trained to operate directly and jointly on intensity-change and contrast. Its output is then fused with colour information. The prior on segmentation is represented by a second order, temporal, Hidden Markov Model, together with a spatial MRF favouring coherence except where contrast is high. Finally, accurate layer segmentation and explicit occlusion detection are efficiently achieved by binary graph cut. The segmentation accuracy of the proposed algorithm is quantitatively evaluated with respect to existing groundtruth data and found to be comparable to the accuracy of a state of the art stereo segmentation algorithm. Foreground/background segmentation is demonstrated in the application of live background substitution and shown to generate convincingly good quality composite video.
TL;DR: This paper presents a low-level system for boundary extraction and segmentation of natural images, and the evaluation of its performance shows that this system significantly outperforms two widely used hierarchical segmentation techniques, as well as the state of the art in local edge detection.
Abstract: This paper presents a low-level system for boundary extraction and segmentation of natural images and the evaluation of its performance. We study the problem in the framework of hierarchical classification, where the geometric structure of an image can be represented by an ultrametric contour map, the soft boundary image associated to a family of nested segmentations. We define generic ultrametric distances by integrating local contour cues along the region boundaries and combining this information with region attributes. Then, we quantitatively evaluate our results with respect to ground-truth segmentation data, showing that our system significantly outperforms two widely used hierarchical segmentation techniques, as well as the state of the art in local edge detection.
TL;DR: In this article, the segmentation of airborne laser scanning data is based on cluster analysis in a feature space, and a recently proposed neighborhood system, called slope adaptive, is utilized to improve the quality of the computed attributes.
Abstract: This paper presents an algorithm for the segmentation of airborne laser scanning data. The segmentation is based on cluster analysis in a feature space. To improve the quality of the computed attributes, a recently proposed neighborhood system, called slope adaptive, is utilized. Key parameters of the laser data, e.g., point density, measurement accuracy, and horizontal and vertical point distribution, are used for defining the neighborhood among the measured points. Accounting for these parameters facilitates the computation of accurate and reliable attributes for the segmentation irrespective of point density and the 3D content of the data (step edges, layered surfaces, etc.). The segmentation with these attributes reveals more of the information that exists in the airborne laser scanning data.
TL;DR: This work formalizes segmentation as a graph-partitioning task that optimizes the normalized cut criterion and demonstrates that global analysis improves the segmentation accuracy and is robust in the presence of speech recognition errors.
Abstract: We consider the task of unsupervised lecture segmentation. We formalize segmentation as a graph-partitioning task that optimizes the normalized cut criterion. Our approach moves beyond localized comparisons and takes into account long-range cohesion dependencies. Our results demonstrate that global analysis improves the segmentation accuracy and is robust in the presence of speech recognition errors.
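The normalized cut criterion referred to above scores a two-way partition (A, B) of a weighted graph as cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V), where cut sums the edge weights crossing the partition and assoc sums the weights from one side to all nodes. The sketch below evaluates this score for a small similarity matrix; in the lecture setting, nodes would be text units such as sentences and weights their similarity. The toy weights and function name are assumptions, and the search for the optimal segmentation is not shown.

```python
def normalized_cut(weights, part_a):
    """Normalized cut value of a two-way partition of a weighted graph.

    weights: symmetric similarity matrix as nested lists.
    part_a: indices of the nodes on one side; the rest form the other side.
    Assumes both sides have non-zero total association.
    """
    n = len(weights)
    a = set(part_a)
    b = set(range(n)) - a
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total weight from each side to every node in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in range(n))
    assoc_b = sum(weights[i][j] for i in b for j in range(n))
    return cut / assoc_a + cut / assoc_b


if __name__ == "__main__":
    # Hypothetical similarities: nodes 0-1 are alike, nodes 2-3 are alike.
    w = [[0.0, 1.0, 0.1, 0.0],
         [1.0, 0.0, 0.0, 0.1],
         [0.1, 0.0, 0.0, 1.0],
         [0.0, 0.1, 1.0, 0.0]]
    print(normalized_cut(w, [0, 1]), normalized_cut(w, [0, 2]))
```

A lower value indicates a better split: cutting between the two cohesive pairs costs far less than separating the members of each pair.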