TL;DR: A survey of a specific class of region-based level set segmentation methods and how they can all be derived from a common statistical framework is presented.
Abstract: Since their introduction as a means of front propagation and their first application to edge-based segmentation in the early 90's, level set methods have become increasingly popular as a general framework for image segmentation. In this paper, we present a survey of a specific class of region-based level set segmentation methods and clarify how they can all be derived from a common statistical framework.
Region-based segmentation schemes aim at partitioning the image domain by progressively fitting statistical models to the intensity, color, texture or motion in each of a set of regions. In contrast to edge-based schemes such as the classical Snakes, region-based methods tend to be less sensitive to noise. For typical images, the respective cost functionals tend to have fewer local minima, which makes them particularly well-suited for local optimization methods such as the level set method.
We detail a general statistical formulation for level set segmentation. Subsequently, we clarify how the integration of various low level criteria leads to a set of cost functionals. We point out relations between the different segmentation schemes. In experimental results, we demonstrate how the level set function is driven to partition the image plane into domains of coherent color, texture, dynamic texture or motion. Moreover, the Bayesian formulation allows prior shape knowledge to be introduced into the level set method. We briefly review a number of advances in this domain.
TL;DR: By incorporating local spatial and gray information together, a novel fast and robust FCM framework for image segmentation, i.e., fast generalized fuzzy c-means (FGFCM) clustering algorithms, is proposed; it mitigates the disadvantages of FCM_S while enhancing the clustering performance.
TL;DR: This paper proposes to unify three well-known image variational models, namely the snake model, the Rudin–Osher–Fatemi denoising model and the Mumford–Shah segmentation model, and establishes theorems with proofs to determine a global minimum of the active contour model.
Abstract: The active contour/snake model is one of the most successful variational models in image segmentation. It consists of evolving a contour in images toward the boundaries of objects. Its success is based on strong mathematical properties and efficient numerical schemes based on the level set method. The only drawback of this model is the existence of local minima in the active contour energy, which makes the initial guess critical to get satisfactory results. In this paper, we propose to solve this problem by determining a global minimum of the active contour model. Our approach is based on the unification of image segmentation and image denoising tasks into a global minimization framework. More precisely, we propose to unify three well-known image variational models, namely the snake model, the Rudin–Osher–Fatemi denoising model and the Mumford–Shah segmentation model. We will establish theorems with proofs to determine the existence of a global minimum of the active contour model. From a numerical point of view, we propose a new practical way to solve the active contour propagation problem toward object boundaries through a dual formulation of the minimization problem. The dual formulation, easy to implement, allows us a fast global minimization of the snake energy. It avoids the usual drawback in the level set approach that consists of initializing the active contour in a distance function and re-initializing it periodically during the evolution, which is time-consuming. We apply our segmentation algorithms on synthetic and real-world images, such as texture images and medical images, to emphasize the performances of our model compared with other segmentation models.
TL;DR: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed and two segmentation methods are considered.
Abstract: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed. A line detector, previously used in mammography, is applied to the green channel of the retinal image. It is based on the evaluation of the average grey level along lines of fixed length passing through the target pixel at different orientations. Two segmentation methods are considered. The first uses the basic line detector whose response is thresholded to obtain unsupervised pixel classification. As a further development, we employ two orthogonal line detectors along with the grey level of the target pixel to construct a feature vector for supervised classification using a support vector machine. The effectiveness of both methods is demonstrated through receiver operating characteristic analysis on two publicly available databases of color fundus images.
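The line-averaging idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `line_averages`, the default line length, the number of orientations, and the nearest-pixel sampling are all assumptions made for illustration, and any normalization or thresholding of the response is left out.

```python
import math

def line_averages(image, row, col, length=5, n_orientations=4):
    """Average grey level along lines through (row, col), one per orientation.

    `image` is a list of rows of numbers. Samples that fall outside the
    image are skipped (an assumption; other border policies are possible).
    """
    half = length // 2
    height, width = len(image), len(image[0])
    averages = []
    for k in range(n_orientations):
        theta = math.pi * k / n_orientations          # orientations in [0, pi)
        d_row, d_col = -math.sin(theta), math.cos(theta)
        total, count = 0.0, 0
        for t in range(-half, half + 1):
            r = row + round(t * d_row)                # nearest-pixel sampling
            c = col + round(t * d_col)
            if 0 <= r < height and 0 <= c < width:
                total += image[r][c]
                count += 1
        averages.append(total / count)
    return averages

# Toy image with a bright horizontal structure in the middle row.
toy = [[0] * 5 for _ in range(5)]
toy[2] = [10] * 5
responses = line_averages(toy, 2, 2)
print(responses.index(max(responses)))  # index of the strongest orientation
```

In this toy case the strongest response comes from the orientation aligned with the bright row, which is the property a vessel detector of this kind relies on.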
TL;DR: It is demonstrated how a recently proposed measure of similarity, the normalized probabilistic rand (NPR) index, can be used to perform a quantitative comparison between image segmentation algorithms using a hand-labeled set of ground-truth segmentations.
Abstract: Unsupervised image segmentation is an important component in many image understanding algorithms and practical vision systems. However, evaluation of segmentation algorithms thus far has been largely subjective, leaving a system designer to judge the effectiveness of a technique based only on intuition and results in the form of a few example segmented images. This is largely due to image segmentation being an ill-defined problem: there is no unique ground-truth segmentation of an image against which the output of an algorithm may be compared. This paper demonstrates how a recently proposed measure of similarity, the normalized probabilistic rand (NPR) index, can be used to perform a quantitative comparison between image segmentation algorithms using a hand-labeled set of ground-truth segmentations. We show that the measure allows principled comparisons between segmentations created by different algorithms, as well as segmentations on different images. We outline a procedure for algorithm evaluation through an example evaluation of some familiar algorithms: the mean-shift-based algorithm, an efficient graph-based segmentation algorithm, a hybrid algorithm that combines the strengths of both methods, and expectation maximization. Results are presented on the 300 images in the publicly available Berkeley segmentation data set.
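For orientation, the sketch below computes the plain Rand index between two labelings, which is the pairwise-agreement idea that the NPR index builds on. The probabilistic weighting over several ground truths and the normalization described in the paper are not reproduced here; the function name `rand_index` is an assumption.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs on which two segmentations agree.

    A pair agrees when both segmentations put the two pixels in the same
    segment, or both put them in different segments. Label values
    themselves do not need to match between the two segmentations.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

# Same partition with swapped label names: perfect agreement.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

This brute-force version is quadratic in the number of pixels, so it is only suitable for small examples.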
TL;DR: A parameter free approach that utilizes multiple cues for image segmentation that takes into account intensity and texture distributions in a local area around each region and incorporates priors based on the geometry of the regions.
Abstract: We present a parameter free approach that utilizes multiple cues for image segmentation. Beginning with an image, we execute a sequence of bottom-up aggregation steps in which pixels are gradually merged to produce larger and larger regions. In each step we consider pairs of adjacent regions and provide a probability measure to assess whether or not they should be included in the same segment. Our probabilistic formulation takes into account intensity and texture distributions in a local area around each region. It further incorporates priors based on the geometry of the regions. Finally, posteriors based on intensity and texture cues are combined using a mixture of experts formulation. This probabilistic approach is integrated into a graph coarsening scheme providing a complete hierarchical segmentation of the image. The algorithm complexity is linear in the number of the image pixels and it requires almost no user-tuned parameters. We test our method on a variety of gray scale images and compare our results to several existing segmentation algorithms.
TL;DR: The dual threshold method offers a robust and fully-automated alternative to the gold standard that can efficiently segment bone regions with accurate and repeatable results.
TL;DR: An interactive framework for soft segmentation and matting of natural images and videos is presented, based on the optimal, linear time, computation of weighted geodesic distances to the user-provided scribbles, from which the whole data is automatically segmented.
Abstract: An interactive framework for soft segmentation and matting of natural images and videos is presented in this paper. The proposed technique is based on the optimal, linear time, computation of weighted geodesic distances to the user-provided scribbles, from which the whole data is automatically segmented. The weights are based on spatial and/or temporal gradients, without explicit optical flow or any advanced and often computationally expensive feature detectors. These could be naturally added to the proposed framework as well if desired, in the form of weights in the geodesic distances. A localized refinement step follows this fast segmentation in order to accurately compute the corresponding matte function. Additional constraints in the distance definition make it possible to handle occlusions efficiently, such as people or objects crossing each other in a video sequence. The presentation of the framework is complemented with numerous and diverse examples, including extraction of moving foreground from dynamic background, and comparisons with the recent literature.
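A rough sketch of segmentation by weighted geodesic distance to scribbles follows. It swaps in a standard Dijkstra search with a binary heap (not linear time, unlike the computation the abstract describes), uses the absolute intensity difference between 4-connected neighbours as the gradient-based weight, and produces a hard rather than soft labeling. The names `geodesic_distance` and `segment` are hypothetical.

```python
import heapq

def geodesic_distance(image, seeds):
    """Dijkstra distance from a set of seed pixels on a 4-connected grid.

    The cost of stepping between neighbours is their absolute intensity
    difference (an assumed stand-in for a gradient-based weight).
    """
    height, width = len(image), len(image[0])
    dist = [[float("inf")] * width for _ in range(height)]
    heap = []
    for r, c in seeds:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width:
                nd = d + abs(image[nr][nc] - image[r][c])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist

def segment(image, fg_seeds, bg_seeds):
    """Label a pixel foreground when it is geodesically closer to fg seeds."""
    d_fg = geodesic_distance(image, fg_seeds)
    d_bg = geodesic_distance(image, bg_seeds)
    return [
        [d_fg[r][c] <= d_bg[r][c] for c in range(len(image[0]))]
        for r in range(len(image))
    ]

# One-row image with a sharp edge between the two scribbled pixels.
print(segment([[0, 0, 9, 9]], fg_seeds=[(0, 0)], bg_seeds=[(0, 3)]))
```

A soft segmentation could be derived from the two distance maps, for example by normalizing them against each other, but that step is outside this sketch.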
TL;DR: It is shown that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data and can be readily applied to segment real imagery and bioinformatic data.
Abstract: In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation that minimizes the overall coding length of the segmented data, subject to a given distortion. By analyzing the coding length/rate of mixed data, we formally establish some strong connections of data segmentation to many fundamental concepts in lossy data compression and rate-distortion theory. We show that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data. We propose a very simple and effective algorithm that depends on a single parameter, the allowable distortion. At any given distortion, the algorithm automatically determines the corresponding number and dimension of the groups and does not involve any parameter estimation. Simulation results reveal intriguing phase-transition-like behaviors of the number of segments when changing the level of distortion or the amount of outliers. Finally, we demonstrate how this technique can be readily applied to segment real imagery and bioinformatic data.
TL;DR: The rationale for organizing the competition is discussed, the training and test data sets for both segmentation tasks are described and the scoring system used to evaluate the segmentation is presented.
Abstract: This paper describes the set-up of a segmentation competition for automatic and semi-automatic extraction of the liver from computed tomography scans and the caudate nucleus from brain MRI data. This competition was held in the form of a workshop at the 2007 Medical Image Computing and Computer Assisted Intervention conference. The rationale for organizing the competition is discussed, the training and test data sets for both segmentation tasks are described and the scoring system used to evaluate the segmentation is presented.
TL;DR: Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes and is shown to perform very well compared to a widely known benchmark algorithm on Finnish data.
Abstract: We present a model family called Morfessor for the unsupervised induction of a simple morphology from raw text data. The model is formulated in a probabilistic maximum a posteriori framework. Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes. A lexicon of word segments, called morphs, is induced from the data. The lexicon stores information about both the usage and form of the morphs. Several instances of the model are evaluated quantitatively in a morpheme segmentation task on different sized sets of Finnish as well as English data. Morfessor is shown to perform very well compared to a widely known benchmark algorithm, in particular on Finnish data.
TL;DR: The multiple segmentation approach is used to evaluate how closely real segments can approach the ground-truth for real objects, and at what cost.
Abstract: Sliding window scanning is the dominant paradigm in object recognition research today. But while much success has been reported in detecting several rectangular-shaped object classes (e.g., faces, cars, pedestrians), results have been much less impressive for more general types of objects. Several researchers have advocated the use of image segmentation as a way to get a better spatial support for objects. In this paper, our aim is to address this issue by studying the following two questions: 1) how important is good spatial support for recognition? 2) can segmentation provide better spatial support for objects? To answer the first, we compare recognition performance using ground-truth segmentation vs. bounding boxes. To answer the second, we use the multiple segmentation approach to evaluate how closely real segments can approach the ground-truth for real objects, and at what cost. Our results demonstrate the importance of finding the right spatial support for objects, and the feasibility of doing so without excessive computational burden.
TL;DR: It can be concluded that a fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.
TL;DR: This paper presents a joint formulation for a complex super-resolution problem in which the scenes contain multiple independently moving objects, built upon the maximum a posteriori (MAP) framework, which judiciously combines motion estimation, segmentation, and super resolution together.
Abstract: Super resolution image reconstruction allows the recovery of a high-resolution (HR) image from several low-resolution images that are noisy, blurred, and downsampled. In this paper, we present a joint formulation for a complex super-resolution problem in which the scenes contain multiple independently moving objects. This formulation is built upon the maximum a posteriori (MAP) framework, which judiciously combines motion estimation, segmentation, and super resolution together. A cyclic coordinate descent optimization procedure is used to solve the MAP formulation, in which the motion fields, segmentation fields, and HR images are found in an alternate manner given the two others, respectively. Specifically, gradient-based methods are employed to solve for the HR image and motion fields, and an iterated conditional mode optimization method is used to obtain the segmentation fields. The proposed algorithm has been tested using a synthetic image sequence, the "Mobile and Calendar" sequence, and the original "Motorcycle and Car" sequence. The experimental results and error analyses verify the efficacy of this algorithm.
TL;DR: In this paper, a method for segmenting video data into foreground and background portions utilizes statistical modeling of the pixels. A statistical model of the background is built for each pixel, and each pixel in an incoming video frame is compared with the background statistical model for that pixel.
Abstract: A method for segmenting video data into foreground and background portions utilizes statistical modeling of the pixels. A statistical model of the background is built for each pixel, and each pixel in an incoming video frame is compared with the background statistical model for that pixel. Pixels are determined to be foreground or background based on the comparisons. The method for segmenting video data may be further incorporated into a method for implementing an intelligent video surveillance system. The method for segmenting video data may be implemented in hardware.
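One simple way to realize a per-pixel background model is sketched below, assuming a running mean and variance per pixel (Welford's update) and a threshold of `k` standard deviations. The abstract does not specify the statistical model, so the class name `PixelBackgroundModel`, the Gaussian-style test, and the parameters `k` and `min_std` are all assumptions.

```python
class PixelBackgroundModel:
    """Running per-pixel mean and variance of grey-level frames."""

    def __init__(self, height, width):
        self.n = 0
        self.mean = [[0.0] * width for _ in range(height)]
        self.m2 = [[0.0] * width for _ in range(height)]  # sum of squared deviations

    def update(self, frame):
        """Fold one background frame into the statistics (Welford's update)."""
        self.n += 1
        for r, row in enumerate(frame):
            for c, x in enumerate(row):
                delta = x - self.mean[r][c]
                self.mean[r][c] += delta / self.n
                self.m2[r][c] += delta * (x - self.mean[r][c])

    def foreground_mask(self, frame, k=3.0, min_std=1.0):
        """Mark pixels deviating more than k standard deviations from the mean.

        min_std keeps perfectly static pixels from having a zero threshold.
        """
        mask = []
        for r, row in enumerate(frame):
            out = []
            for c, x in enumerate(row):
                variance = self.m2[r][c] / max(self.n - 1, 1)
                std = max(variance ** 0.5, min_std)
                out.append(abs(x - self.mean[r][c]) > k * std)
            mask.append(out)
        return mask

model = PixelBackgroundModel(1, 2)
for background_frame in ([[10, 20]], [[12, 20]], [[11, 20]]):
    model.update(background_frame)
print(model.foreground_mask([[11, 50]]))  # second pixel departs from its model
```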
TL;DR: A recursive programming technique is presented which reduces an order of magnitude for computing the minimum cross entropy thresholding objective function and a particle swarm optimization (PSO) algorithm is proposed for searching the near-optimal MCET thresholds.
TL;DR: Experiments show that this simple framework is capable of achieving both high recall and high precision with only a few positive training examples and that this method can be generalized to many object classes.
Abstract: We develop an object detection method combining top-down recognition with bottom-up image segmentation. There are two main steps in this method: a hypothesis generation step and a verification step. In the top-down hypothesis generation step, we design an improved Shape Context feature, which is more robust to object deformation and background clutter. The improved Shape Context is used to generate a set of hypotheses of object locations and figure-ground masks, which have high recall and low precision rate. In the verification step, we first compute a set of feasible segmentations that are consistent with top-down object hypotheses, then we propose a False Positive Pruning (FPP) procedure to prune out false positives. We exploit the fact that false positive regions typically do not align with any feasible image segmentation. Experiments show that this simple framework is capable of achieving both high recall and high precision with only a few positive training examples and that this method can be generalized to many object classes.
TL;DR: An intensity renormalization procedure is introduced that automatically adjusts the prior atlas intensity model to new input data and reduces the sensitivity of the whole brain segmentation method to changes in scanner platforms and improves its accuracy and robustness.
Abstract: Atlas-based approaches have demonstrated the ability to automatically identify detailed brain structures from 3-D magnetic resonance (MR) brain images. Unfortunately, the accuracy of this type of method often degrades when processing data acquired on a different scanner platform or pulse sequence than the data used for the atlas training. In this paper, we improve the performance of an atlas-based whole brain segmentation method by introducing an intensity renormalization procedure that automatically adjusts the prior atlas intensity model to new input data. Validation using manually labeled test datasets has shown that the new procedure improves the segmentation accuracy (as measured by the Dice coefficient) by 10% or more for several structures including hippocampus, amygdala, caudate, and pallidum. The results verify that this new procedure reduces the sensitivity of the whole brain segmentation method to changes in scanner platforms and improves its accuracy and robustness, which can thus facilitate multicenter or multisite neuroanatomical imaging studies.
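The Dice coefficient used as the accuracy measure above is defined as 2|A ∩ B| / (|A| + |B|) for two label sets A and B. A minimal sketch follows, representing each segmentation as a set of voxel coordinates; the function name `dice` and the convention of returning 1.0 for two empty sets are assumptions.

```python
def dice(segmentation, reference):
    """Dice overlap between two collections of voxel coordinates."""
    a, b = set(segmentation), set(reference)
    if not a and not b:
        return 1.0  # assumed convention: two empty structures match perfectly
    return 2 * len(a & b) / (len(a) + len(b))

automatic = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
manual = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
print(dice(automatic, manual))  # two shared voxels out of three in each set
```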
TL;DR: In this paper, the authors proposed a hybrid technique that combines ROI feature detection, region segmentation, and background subtraction to segment a region-of-interest (ROI) video object from a video sequence.
Abstract: The disclosure is directed to techniques for automatic segmentation of a region-of-interest (ROI) video object from a video sequence. ROI object segmentation enables selected ROI or “foreground” objects of a video sequence that may be of interest to a viewer to be extracted from non-ROI or “background” areas of the video sequence. Examples of a ROI object are a human face or a head and shoulder area of a human body. The disclosed techniques include a hybrid technique that combines ROI feature detection, region segmentation, and background subtraction. In this way, the disclosed techniques may provide accurate foreground object generation and low-complexity extraction of the foreground object from the video sequence. A ROI object segmentation system may implement the techniques described herein. In addition, ROI object segmentation may be useful in a wide range of multimedia applications that utilize video sequences, such as video telephony applications and video surveillance applications.
TL;DR: This work proposes a region-oriented segmentation algorithm for detecting the most common peel defects of citrus fruits, focused on the detection of the regions of interest consisting of the sound peel, the stem and the defects.
TL;DR: In this article, a graph cuts based active contours (GCBAC) approach is proposed for object segmentation, which uses graph cuts to iteratively deform the contour and its cost function is defined as the summation of edge weights on the cut.
TL;DR: A novel support aggregation strategy which includes information obtained from a segmentation process is proposed which is effective in improving the state of the art in dense stereo correspondence.
Abstract: Significant achievements have been attained in the field of dense stereo correspondence by local algorithms based on an adaptive support. Given the problem of matching two correspondent pixels within a local stereo process, the basic idea is to consider as support for each pixel only those points which lie on the same disparity plane, rather than those belonging to a fixed support.
This paper proposes a novel support aggregation strategy which includes information obtained from a segmentation process. Experimental results on the Middlebury dataset demonstrate that our approach is effective in improving the state of the art.
TL;DR: This paper presents a road signs detection and classification system based on a three-step algorithm composed of color segmentation, shape recognition, and a neural network to achieve real time execution.
Abstract: This paper presents a road signs detection and classification system based on a three-step algorithm composed of color segmentation, shape recognition, and a neural network. The final goal of this algorithm is to detect and classify almost all road signs present along Italian roads. Color segmentation was chosen with the aim of achieving real-time execution, since color-based segmentation is faster than the one based on shape. In order to save computational time, only the RGB color space, directly supplied by the chosen camera, or color spaces that can be obtained with linear transformations, are considered. Two different methods are used for shape detection: one is based on pattern matching with simple models and the other is based on edge detection and geometrical cues. The complete set of signs taken into account has been divided into several categories according to their shape and color. Finally, for each set of road signs, a neural network is built and trained.
TL;DR: A Bayesian approach to human detection and segmentation combining local part-based and global template-based schemes that relies on the key ideas of matching a part-template tree to images hierarchically to generate a reliable set of detection hypotheses and optimizing it under a Bayesian MAP framework.
Abstract: Local part-based human detectors are capable of handling partial occlusions efficiently and modeling shape articulations flexibly, while global shape template-based human detectors are capable of detecting and segmenting human shapes simultaneously. We describe a Bayesian approach to human detection and segmentation combining local part-based and global template-based schemes. The approach relies on the key ideas of matching a part-template tree to images hierarchically to generate a reliable set of detection hypotheses and optimizing it under a Bayesian MAP framework through global likelihood re-evaluation and fine occlusion analysis. In addition to detection, our approach is able to obtain human shapes and poses simultaneously. We applied the approach to human detection and segmentation in crowded scenes with and without background subtraction. Experimental results show that our approach achieves good performance on images and video sequences with severe occlusion.
TL;DR: A color-based segmentation method that uses the K-means clustering technique to track tumor objects in magnetic resonance (MR) brain images and can successfully achieve segmentation for MR brain images to help pathologists distinguish exactly lesion size and region.
Abstract: In this paper, we propose a color-based segmentation method that uses the K-means clustering technique to track tumor objects in magnetic resonance (MR) brain images. The key concept in this color-based segmentation algorithm with K-means is to convert a given gray-level MR image into a color space image and then separate the position of tumor objects from other items of an MR image by using K-means clustering and histogram-clustering. Experiments demonstrate that the method can successfully achieve segmentation for MR brain images to help pathologists distinguish exactly lesion size and region.
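As a simplified illustration of the clustering step only, the sketch below runs K-means directly on scalar intensities. The conversion of the gray-level image into a color space and the histogram-clustering stage described in the abstract are omitted, and the function name `kmeans_1d` and the evenly spaced initialization are assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's K-means on scalar intensities with evenly spaced initial centers."""
    lo, hi = min(values), max(values)
    if k == 1:
        centers = [sum(values) / len(values)]
    else:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign(v):
        # Index of the nearest center.
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(iterations):
        labels = [assign(v) for v in values]
        for j in range(k):
            members = [v for v, label in zip(values, labels) if label == j]
            if members:  # leave an empty cluster's center unchanged
                centers[j] = sum(members) / len(members)
    return centers, [assign(v) for v in values]

# Dark background intensities and a small bright region.
centers, labels = kmeans_1d([1, 2, 3, 100, 101, 102], k=2)
print(centers, labels)
```

For an image, `values` would be the flattened pixel intensities and `labels` would be reshaped back to the image grid to obtain the segmentation.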
TL;DR: A framework of fuzzy information fusion is proposed in this paper to automatically segment tumor areas of human brain from multispectral magnetic resonance imaging (MRI) such as T1-weighted, T2-weighted and proton density (PD) images.
TL;DR: An interactive algorithm for soft segmentation of natural images is presented, which first roughly scribbles different regions of interest, and from them, the whole image is automatically segmented via fast, linear complexity computation of weighted distances to the user-provided scribbles.
Abstract: An interactive algorithm for soft segmentation of natural images is presented in this paper. The user first roughly scribbles different regions of interest, and from them, the whole image is automatically segmented. This soft segmentation is obtained via fast, linear complexity computation of weighted distances to the user-provided scribbles. The adaptive weights are obtained from a series of Gabor filters, and are automatically computed according to the ability of each single filter to discriminate between the selected regions of interest. We present the underlying framework and examples showing the capability of the algorithm to segment diverse images.
TL;DR: The Hough transform technique provides a reliable way to segment ultrasound images of the carotid artery and can be used in clinical practice to estimate indices of arterial wall physiology, such as the IMT and the ADW.
Abstract: Automatic segmentation of the arterial lumen from ultrasound images is an important task in clinical diagnosis. In this paper, the Hough transform (HT) was used to automatically extract straight lines and circles from sequences of B-mode ultrasound images of longitudinal and transverse sections, respectively, of the carotid artery. In 10 normal subjects, the specificity and accuracy of HT-based segmentation were on average higher than 0.96 for both sections, whereas the sensitivity was higher than 0.96 in longitudinal and higher than 0.82 in transverse sections. The intima-media thickness (IMT) was also estimated from images of longitudinal sections; the corresponding validation parameters were generally higher than 0.90. To further validate the results, arterial distension waveforms (ADW) were estimated from sequences of images using the HT technique as well as motion analysis using block matching (BM). In longitudinal sections, diastolic and systolic diameters and relative diameter changes using HT and BM were not significantly different. In transverse sections, diastolic and systolic diameters were significantly lower using the HT technique; the differences were <7%. Relative diameter changes in transverse sections were not significantly different from BM-estimated ones. The HT technique was also applied to four subjects with atherosclerosis, in which sensitivity, specificity and accuracy were comparable to those of normal subjects; the low values of sensitivity in transverse sections may reflect departure from the circular model because of the presence of plaque. In conclusion, the HT technique provides a reliable way to segment ultrasound images of the carotid artery and can be used in clinical practice to estimate indices of arterial wall physiology, such as the IMT and the ADW.
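The sensitivity, specificity and accuracy figures quoted above are standard pixel-wise measures computed from a confusion matrix. A minimal sketch follows, assuming flat binary masks of equal length; the function name `binary_metrics` and the choice to return NaN when a denominator is zero are assumptions.

```python
def binary_metrics(predicted, reference):
    """Sensitivity, specificity and accuracy of a binary mask against a reference."""
    tp = sum(1 for p, t in zip(predicted, reference) if p and t)
    tn = sum(1 for p, t in zip(predicted, reference) if not p and not t)
    fp = sum(1 for p, t in zip(predicted, reference) if p and not t)
    fn = sum(1 for p, t in zip(predicted, reference) if not p and t)

    def ratio(numerator, denominator):
        # Undefined ratios (no positives or no negatives) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }

print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]))
```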
TL;DR: A method for vessel segmentation and tracking in ultrasound images using Kalman filters is presented, and results indicate that mean errors between segmented contours and expert tracings are on the order of 1%-2% of the maximum feature dimension.
Abstract: A method for vessel segmentation and tracking in ultrasound images using Kalman filters is presented. A modified Star-Kalman algorithm is used to determine vessel contours and ellipse parameters using an extended Kalman filter with an elliptical model. The parameters can be used to easily calculate the transverse vessel area which is of clinical use. A temporal Kalman filter is used for tracking the vessel center over several frames, using location measurements from a handheld sensorized ultrasound probe. The segmentation and tracking have been implemented in real-time and validated using simulated ultrasound data with known features and real data, for which expert segmentation was performed. Results indicate that mean errors between segmented contours and expert tracings are on the order of 1%-2% of the maximum feature dimension, and that the transverse cross-sectional vessel area as computed from estimated ellipse parameters a, b as determined by our algorithm is within 10% of that determined by experts. The location of the vessel center was tracked accurately for a range of speeds from 1.4 to 11.2 mm/s.
TL;DR: The proposed algorithm is able to segment closely juxtaposed or touching cell nuclei obtained from 3D microscopy imaging with reasonable accuracy.
Abstract: Reliable segmentation of cell nuclei from three dimensional (3D) microscopic images is an important task in many biological studies. We present a novel, fully automated method for the segmentation of cell nuclei from 3D microscopic images. It was designed specifically to segment nuclei in images where the nuclei are closely juxtaposed or touching each other. The segmentation approach has three stages: 1) a gradient diffusion procedure, 2) gradient flow tracking and grouping, and 3) local adaptive thresholding. Both qualitative and quantitative results on synthesized and original 3D images are provided to demonstrate the performance and generality of the proposed method. Both the over-segmentation and under-segmentation percentages of the proposed method are around 5%. The volume overlap, compared to expert manual segmentation, is consistently over 90%. The proposed algorithm is able to segment closely juxtaposed or touching cell nuclei obtained from 3D microscopy imaging with reasonable accuracy.