TL;DR: In this paper, the authors present a database containing ground truth segmentations produced by humans for images of a wide variety of natural scenes, and define an error measure which quantifies the consistency between segmentations of differing granularities.
Abstract: This paper presents a database containing 'ground truth' segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.
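The abstract does not spell out the error measure. As a rough illustration, the sketch below assumes the refinement-tolerant form usually associated with this dataset: a per-pixel local refinement error, aggregated into a global (GCE) and a local (LCE) consistency error. The function names and the flat label-list representation are assumptions made for this example, not taken from the paper.

```python
from collections import Counter


def refinement_errors(seg_a, seg_b):
    """Per-pixel error |R_a(p) minus R_b(p)| / |R_a(p)|, where R_x(p) is the
    region of segmentation x that contains pixel p."""
    size_a = Counter(seg_a)               # region sizes in seg_a
    joint = Counter(zip(seg_a, seg_b))    # sizes of region intersections
    return [(size_a[a] - joint[(a, b)]) / size_a[a]
            for a, b in zip(seg_a, seg_b)]


def consistency_errors(seg_a, seg_b):
    """Return (GCE, LCE) for two segmentations given as flat label lists."""
    e_ab = refinement_errors(seg_a, seg_b)
    e_ba = refinement_errors(seg_b, seg_a)
    n = len(seg_a)
    # GCE: all pixels must be refined in the same direction.
    gce = min(sum(e_ab), sum(e_ba)) / n
    # LCE: the refinement direction may differ from pixel to pixel.
    lce = sum(min(x, y) for x, y in zip(e_ab, e_ba)) / n
    return gce, lce


coarse = [0, 0, 0, 0, 1, 1, 1, 1]
fine = [0, 0, 1, 1, 2, 2, 3, 3]   # a pure refinement of `coarse`
print(consistency_errors(coarse, fine))
```

Under this definition a segmentation that only subdivides the regions of another scores zero, which is what makes the measure tolerant to differing granularities.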
TL;DR: The authors propose a novel hidden Markov random field (HMRF) model, which is a stochastic process generated by a MRF whose state sequence cannot be observed directly but which can be indirectly estimated through observations.
Abstract: The finite mixture (FM) model is the most commonly used model for statistical segmentation of brain magnetic resonance (MR) images because of its simple mathematical form and the piecewise constant nature of ideal brain MR images. However, being a histogram-based model, the FM has an intrinsic limitation: no spatial information is taken into account. This causes the FM model to work only on well-defined images with low levels of noise; unfortunately, this is often not the case due to artifacts such as partial volume effect and bias field distortion. Under these conditions, FM model-based methods produce unreliable results. Here, the authors propose a novel hidden Markov random field (HMRF) model, which is a stochastic process generated by a MRF whose state sequence cannot be observed directly but which can be indirectly estimated through observations. Mathematically, it can be shown that the FM model is a degenerate version of the HMRF model. The advantage of the HMRF model derives from the way in which the spatial information is encoded through the mutual influences of neighboring sites. Although MRF modeling has been employed in MR image segmentation by other researchers, most reported methods are limited to using MRF as a general prior in an FM model-based approach. To fit the HMRF model, an EM algorithm is used. The authors show that by incorporating both the HMRF model and the EM algorithm into a HMRF-EM framework, an accurate and robust segmentation can be achieved. More importantly, the HMRF-EM framework can easily be combined with other techniques. As an example, the authors show how the bias field correction algorithm of Guillemaud and Brady (1997) can be incorporated into this framework to achieve a three-dimensional fully automated approach for brain MR image segmentation.
TL;DR: In this paper, the user marks certain pixels as "object" or "background" to provide hard constraints for segmentation, and additional soft constraints incorporate both boundary and region information.
Abstract: In this paper we describe a new technique for general purpose interactive segmentation of N-dimensional images. The user marks certain pixels as "object" or "background" to provide hard constraints for segmentation. Additional soft constraints incorporate both boundary and region information. Graph cuts are used to find the globally optimal segmentation of the N-dimensional image. The obtained solution gives the best balance of boundary and region properties among all segmentations satisfying the constraints. The topology of our segmentation is unrestricted and both "object" and "background" segments may consist of several isolated parts. Some experimental results are presented in the context of photo/video editing and medical image segmentation. We also demonstrate an interesting Gestalt example. A fast implementation of our segmentation method is possible via a new max-flow algorithm.
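The combination of hard seed constraints with soft boundary and region terms can be sketched as an s-t minimum cut on a tiny 1-D "image". This is only an illustration under stated assumptions: the region term (distance to the mean seed intensity), the Gaussian boundary weight, the parameters `lam` and `sigma`, and the plain Edmonds-Karp max-flow are all choices made for this example and are not the paper's formulation or its fast max-flow algorithm.

```python
import math
from collections import deque


def interactive_cut(intensities, object_seeds, background_seeds,
                    lam=1.0, sigma=10.0):
    """Label a 1-D signal as object/background via an s-t minimum cut."""
    n = len(intensities)
    src, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]   # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)          # reverse edge for the residual graph

    # Assumed region model: distance to the mean intensity of each seed set.
    obj_mean = sum(intensities[p] for p in object_seeds) / len(object_seeds)
    bkg_mean = sum(intensities[p] for p in background_seeds) / len(background_seeds)
    hard = 1e9                             # stands in for an infinite capacity

    for p, value in enumerate(intensities):
        if p in object_seeds:              # hard constraint: must be object
            add_edge(src, p, hard)
        elif p in background_seeds:        # hard constraint: must be background
            add_edge(p, sink, hard)
        else:                              # soft region terms
            add_edge(src, p, lam * abs(value - bkg_mean))   # cut if p is background
            add_edge(p, sink, lam * abs(value - obj_mean))  # cut if p is object
    for p in range(n - 1):                 # soft boundary terms between neighbors
        diff = intensities[p] - intensities[p + 1]
        w = math.exp(-diff * diff / (2 * sigma * sigma))
        add_edge(p, p + 1, w)
        add_edge(p + 1, p, w)

    def reachable_from_source():
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:                            # Edmonds-Karp: shortest augmenting paths
        parent = reachable_from_source()
        if sink not in parent:
            break
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck

    # Nodes still reachable from the source lie on the object side of the cut.
    return ["object" if p in parent else "background" for p in range(n)]


print(interactive_cut([10, 12, 11, 50, 52, 51], {4}, {0}))
```

Because the cut is found globally, a single seed on each side is enough here: the weak boundary link between the dark and bright samples is the cheapest place to separate them.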
TL;DR: A new technique for general purpose interactive segmentation of N-dimensional images where the user marks certain pixels as "object" or "background" to provide hard constraints for segmentation.
Abstract: In this paper we describe a new technique for general purpose interactive segmentation of N-dimensional images. The user marks certain pixels as “object” or “background” to provide hard constraints for segmentation. Additional soft constraints incorporate both boundary and region information. Graph cuts are used to find the globally optimal segmentation of the N-dimensional image. The obtained solution gives the best balance of boundary and region properties among all segmentations satisfying the constraints. The topology of our segmentation is unrestricted and both “object” and “background” segments may consist of several isolated parts. Some experimental results are presented in the context of photo/video editing and medical image segmentation. We also demonstrate an interesting Gestalt example. A fast implementation of our segmentation method is possible via a new max-flow algorithm in [2].
TL;DR: The focus of this work is on spatial segmentation, where a criterion for "good" segmentation using the class-map is proposed; applying the criterion to local windows in the class-map results in the "J-image," in which high and low values correspond to possible boundaries and interiors of color-texture regions.
Abstract: A method for unsupervised segmentation of color-texture regions in images and video is presented. This method, which we refer to as JSEG, consists of two independent steps: color quantization and spatial segmentation. In the first step, colors in the image are quantized to several representative classes that can be used to differentiate regions in the image. The image pixels are then replaced by their corresponding color class labels, thus forming a class-map of the image. The focus of this work is on spatial segmentation, where a criterion for "good" segmentation using the class-map is proposed. Applying the criterion to local windows in the class-map results in the "J-image," in which high and low values correspond to possible boundaries and interiors of color-texture regions. A region growing method is then used to segment the image based on the multiscale J-images. A similar approach is applied to video sequences. An additional region tracking scheme is embedded into the region growing process to achieve consistent segmentation and tracking results, even for scenes with nonrigid object motion. Experiments show the robustness of the JSEG algorithm on real images and video.
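The abstract does not state the criterion itself. The sketch below assumes the scatter-ratio form commonly given for JSEG, J = (S_T - S_W) / S_W, where S_T is the total spatial scatter of all pixel positions in a window and S_W is the scatter of positions within each color class. The function name and the nested-list class-map are assumptions for this example.

```python
import math


def j_value(class_map):
    """J = (S_T - S_W) / S_W for a 2-D window of color class labels."""
    by_class = {}
    for y, row in enumerate(class_map):
        for x, label in enumerate(row):
            by_class.setdefault(label, []).append((x, y))

    def scatter(points):
        # Sum of squared distances of the positions to their centroid.
        mx = sum(x for x, _ in points) / len(points)
        my = sum(y for _, y in points) / len(points)
        return sum((x - mx) ** 2 + (y - my) ** 2 for x, y in points)

    all_points = [p for points in by_class.values() for p in points]
    s_total = scatter(all_points)
    s_within = sum(scatter(points) for points in by_class.values())
    if s_within == 0:
        return math.inf   # degenerate window: every class is a single position
    return (s_total - s_within) / s_within


separated = [[0, 0, 1, 1],
             [0, 0, 1, 1]]   # two compact regions, window straddles a boundary
mixed = [[0, 1, 0, 1],
         [1, 0, 1, 0]]       # classes spread uniformly, as inside one texture
print(j_value(separated), j_value(mixed))
```

A window that straddles two spatially compact classes yields a high value, while a window whose classes are uniformly interleaved yields a value near zero, matching the boundary/interior reading of the J-image described above.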
TL;DR: An algorithm based on mathematical morphology and curvature evaluation for the detection of vessel-like patterns in a noisy environment is presented and its robustness and its accuracy with respect to noise are evaluated.
Abstract: This paper presents an algorithm based on mathematical morphology and curvature evaluation for the detection of vessel-like patterns in a noisy environment. Such patterns are very common in medical images. Vessel detection is interesting for the computation of parameters related to blood flow. The tree-like geometry of the vessel network makes it a usable feature for registration between images that can be of a different nature. In order to define vessel-like patterns, segmentation is performed with respect to a precise model. We define a vessel as a bright pattern that is piece-wise connected and locally linear. Mathematical morphology is very well adapted to this description; however, other patterns also fit such a morphological description. In order to differentiate vessels from analogous background patterns, a cross-curvature evaluation is performed. Vessels are separated out because they have a specific Gaussian-like profile whose curvature varies smoothly along the vessel. The detection algorithm that derives directly from this modeling is based on four steps: (1) noise reduction; (2) linear pattern with Gaussian-like profile improvement; (3) cross-curvature evaluation; (4) linear filtering. We present its theoretical background and illustrate it on real images of various natures, then evaluate its robustness and its accuracy with respect to noise.
TL;DR: It is proved that the Normalized Cut method arises naturally from the framework, and a complete characterization of the cases when the Normalized Cut algorithm is exact is provided.
Abstract: We present a new view of clustering and segmentation by pairwise similarities. We interpret the similarities as edge flows in a Markov random walk and study the eigenvalues and eigenvectors of the walk's transition matrix. This view shows that spectral methods for clustering and segmentation have a probabilistic foundation. We prove that the Normalized Cut method arises naturally from our framework and we provide a complete characterization of the cases when the Normalized Cut algorithm is exact. Then we discuss other spectral segmentation and clustering methods showing that several of them are essentially the same as NCut.
TL;DR: An approach that fuses images with diverse focuses by first decomposing the source images into blocks and then combining them by the use of spatial frequency is presented.
TL;DR: An early review of the largely unknown territory of human-computer interaction in image segmentation is presented to identify patterns in the use of interaction and to develop qualitative criteria to evaluate interactive segmentation methods.
TL;DR: A robust segmentation algorithm incorporating such techniques as dimension correction, model selection using the geometric AIC, and least-median fitting is presented; numerical simulations demonstrate that our algorithm dramatically outperforms existing methods.
Abstract: Reformulating the Costeira-Kanade algorithm as a pure mathematical theorem independent of the Tomasi-Kanade factorization, we present a robust segmentation algorithm by incorporating such techniques as dimension correction, model selection using the geometric AIC, and least-median fitting. Doing numerical simulations, we demonstrate that our algorithm dramatically outperforms existing methods. It does not involve any parameters which need to be adjusted empirically.
TL;DR: Preliminary studies showed that the new tool could significantly improve intra- and inter-rater reliability of hippocampus segmentation to achieve intra-class correlation coefficients significantly higher than published elsewhere.
Abstract: Extracting 3D structures from volumetric images like MRI or CT is becoming a routine process for diagnosis based on quantitation, for radiotherapy planning, for surgical planning and image-guided intervention, for studying neurodevelopmental and neurodegenerative aspects of brain diseases, and for clinical drug trials. Key issues for segmenting anatomical objects from 3D medical images are validity and reliability. We have developed VALMET, a new tool for validation and comparison of object segmentation. New features not available in commercial and public-domain image processing packages are the choice between different metrics to describe differences between segmentations and the use of graphical overlay and 3D display for visual assessment of the locality and magnitude of segmentation variability. Input to the tool are an original 3D image (MRI, CT, ultrasound), and a series of segmentations either generated by several human raters and/or by automatic methods (machine). Quantitative evaluation includes intra-class correlation of resulting volumes and four different shape distance metrics, a) percentage overlap of segmented structures (R intersect S)/(R union S), b) probabilistic overlap measure for non-binary segmentations, c) mean/median absolute distances between object surfaces, and maximum (Hausdorff) distance. All these measures are calculated for arbitrarily selected 2D cross-sections and full 3D segmentations. Segmentation results are overlaid onto the original image data for visual comparison. A 3D graphical display of the segmented organ is color-coded depending on the selected metric for measuring segmentation difference. The new tool is in routine use for intra- and inter-rater reliability studies and for testing novel automatic machine-segmentation versus a gold standard established by human experts. 
Preliminary studies showed that the new tool could significantly improve intra- and inter-rater reliability of hippocampus segmentation to achieve intra-class correlation coefficients significantly higher than published elsewhere.
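Two of the measures listed in the abstract above, the overlap (R intersect S)/(R union S) and the maximum (Hausdorff) distance, together with a mean absolute distance, can be sketched as follows. This is a minimal illustration and not VALMET's implementation: segmentations are assumed to be sets of integer voxel coordinates, and distances are taken between all voxels rather than between extracted object surfaces.

```python
import math


def overlap(r, s):
    """Overlap |R intersect S| / |R union S| of two voxel sets."""
    union = r | s
    return len(r & s) / len(union) if union else 1.0


def nearest_distances(a, b):
    """For every voxel in a, the Euclidean distance to its nearest voxel in b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def hausdorff(r, s):
    """Maximum (Hausdorff) distance: the worst nearest-neighbor distance,
    taken in both directions."""
    return max(max(nearest_distances(r, s)), max(nearest_distances(s, r)))


def mean_absolute_distance(r, s):
    """Symmetric mean of the nearest-neighbor distances in both directions."""
    d = nearest_distances(r, s) + nearest_distances(s, r)
    return sum(d) / len(d)


rater_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
rater_2 = {(0, 1), (1, 1), (0, 2), (1, 2)}   # same shape, shifted by one voxel
print(overlap(rater_1, rater_2), hausdorff(rater_1, rater_2),
      mean_absolute_distance(rater_1, rater_2))
```

The brute-force nearest-neighbor search is quadratic in the number of voxels, which is acceptable for a small example but would need a distance transform or spatial index for full 3D volumes.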
TL;DR: In this paper, a region growing algorithm that learns its homogeneity criterion automatically from characteristics of the region to be segmented was developed, based on a model that describes homogeneity and simple shape properties of the regions.
Abstract: Interaction increases flexibility of segmentation but it leads to undesirable behavior of an algorithm if knowledge being requested is inappropriate. In region growing, this is the case for defining the homogeneity criterion as its specification depends also on image formation properties that are not known to the user. We developed a region growing algorithm that learns its homogeneity criterion automatically from characteristics of the region to be segmented. The method is based on a model that describes homogeneity and simple shape properties of the region. Parameters of the homogeneity criterion are estimated from sample locations in the region. These locations are selected sequentially in a random walk starting at the seed point, and the homogeneity criterion is updated continuously. The method was tested for segmentation on test images and of structures in CT images. We found the method to work reliably if the model assumptions on homogeneity and region characteristics hold. Furthermore, the model is simple but robust, thus allowing for a certain degree of deviation from model constraints and still delivering the expected segmentation result. This approach was extended to a fully automatic and complete segmentation method by using the pixel with the smallest gradient length in the not yet segmented image region as a seed point.
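The idea of a homogeneity criterion that is learned while the region grows can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it replaces the random-walk sampling with a breadth-first traversal, models homogeneity only by a running mean and standard deviation of the accepted pixels, and uses the hypothetical parameters `k` and `min_tol` for the acceptance band.

```python
import math
from collections import deque


def grow_region(image, seed, k=2.5, min_tol=3.0):
    """Grow a 4-connected region from `seed` (row, col), accepting a pixel if
    its value lies within max(k * std, min_tol) of the running region mean."""
    h, w = len(image), len(image[0])
    region = {seed}
    queue = deque([seed])
    # Welford-style running statistics of the accepted pixel values.
    n, mean, m2 = 1, float(image[seed[0]][seed[1]]), 0.0
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (0 <= ny < h and 0 <= nx < w) or (ny, nx) in region:
                continue
            value = image[ny][nx]
            std = math.sqrt(m2 / n)
            if abs(value - mean) <= max(k * std, min_tol):
                region.add((ny, nx))
                queue.append((ny, nx))
                # The criterion is updated with every accepted pixel.
                n += 1
                delta = value - mean
                mean += delta / n
                m2 += delta * (value - mean)
    return region


image = [[10, 11, 10, 80],
         [12, 10, 11, 82],
         [11, 12, 10, 81]]
region = grow_region(image, seed=(0, 0))
print(sorted(region))
```

The `min_tol` floor keeps the region from stalling while the variance estimate is still based on very few samples; the user supplies only the seed, not an intensity threshold.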
TL;DR: A statistical method is proposed that finds the maximum-probability segmentation of a given text that does not require training data and can be applied to any text in any domain.
Abstract: We propose a statistical method that finds the maximum-probability segmentation of a given text. This method does not require training data because it estimates probabilities from the given text. Therefore, it can be applied to any text in any domain. An experiment showed that the method is more accurate than or at least as accurate as a state-of-the-art text segmentation system.
TL;DR: This paper presents the first automatic segmentation method which separates non-enhancing brain tumors from healthy tissues in MR images to aid in the task of tracking tumor size over time.
TL;DR: In this paper, the authors show how the piecewise-smooth Mumford-Shah segmentation problem can be solved using the level set method, which can be simultaneously used to denoise, segment, detect-extract edges, and perform active contours.
Abstract: We show how the piecewise-smooth Mumford-Shah segmentation problem can be solved using the level set method of Osher and Sethian (1988). The obtained algorithm can be simultaneously used to denoise, segment, detect-extract edges, and perform active contours. The proposed model is also a generalisation of a previous active contour model without edges, proposed by the authors in Chan et al., (2001), and of its extension to the case with more than two segments for piecewise-constant segmentation Chan et al., (2000). Based on the four color theorem, we can assume that in general, at most two level set functions are sufficient to detect and represent distinct objects of distinct intensities, with triple junctions, or T-junctions.
TL;DR: A fully automatic anatomical, pathological, and functional segmentation of the liver derived from a spiral CT scan is developed to improve the planning of hepatic surgery.
Abstract: Objective: To improve the planning of hepatic surgery, we have developed a fully automatic anatomical, pathological, and functional segmentation of the liver derived from a spiral CT scan. Materials and Methods: From a 2 mm-thick enhanced spiral CT scan, the first stage automatically delineates skin, bones, lungs, kidneys, and spleen by combining the use of thresholding, mathematical morphology, and distance maps. Next, a reference 3D model is immersed in the image and automatically deformed to the liver contours. Then an automatic Gaussian fitting on the imaging histogram estimates the intensities of parenchyma, vessels, and lesions. This first result is next improved through an original topological and geometrical analysis, providing an automatic delineation of lesions and veins. Finally, a topological and geometrical analysis based on medical knowledge provides hepatic functional information that is invisible in medical imaging: portal vein labeling and hepatic anatomical segmentation according to the C...
TL;DR: The algorithm for vessel detection consists in contrast enhancement, application of the morphological top-hat-transform and a post-filtering step in order to distinguish the vessels from other blood containing features.
Abstract: This paper presents new algorithms based on mathematical morphology for the detection of the optic disc and the vascular tree in noisy low contrast color fundus photographs. Both features - vessels and optic disc - deliver landmarks for image registration and are indispensable to the understanding of retinal fundus images. For the detection of the optic disc, we first find the position approximately. Then we find the exact contours by means of the watershed transformation. The algorithm for vessel detection consists in contrast enhancement, application of the morphological top-hat-transform and a post-filtering step in order to distinguish the vessels from other blood containing features.
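The top-hat step mentioned above can be illustrated on a single intensity profile. The sketch is a 1-D white top-hat (signal minus its morphological opening) with a flat structuring element. It assumes vessels appear as narrow bright structures, for example after inverting the image; the window size is a hypothetical parameter, and the paper's contrast enhancement and post-filtering steps are not reproduced.

```python
def erode(signal, size):
    """Flat 1-D erosion: minimum over a centered window, clipped at the borders."""
    half = size // 2
    return [min(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]


def dilate(signal, size):
    """Flat 1-D dilation: maximum over a centered window, clipped at the borders."""
    half = size // 2
    return [max(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]


def white_top_hat(signal, size):
    """Signal minus its opening: keeps bright details narrower than `size`."""
    opened = dilate(erode(signal, size), size)
    return [value - base for value, base in zip(signal, opened)]


thin_vessel = [5, 5, 5, 9, 5, 5, 5, 5]        # one-sample bright ridge
wide_blob = [5, 5, 9, 9, 9, 9, 9, 5, 5]       # bright structure wider than the window
print(white_top_hat(thin_vessel, 3))
print(white_top_hat(wide_blob, 3))
```

The narrow ridge survives the transform while the wide structure is removed entirely, which is why a structuring element slightly larger than the widest vessel separates vessels from larger bright features.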
TL;DR: This paper considers context recognition by unsupervised segmentation of time series produced by sensors, and uses global iterative replacement or GIR, which gives approximately optimal results in a fraction of the time required by dynamic programming.
Abstract: Recognizing the context of use is important in making mobile devices as simple to use as possible. Finding out what the user's situation is can help the device and underlying service in providing an adaptive and personalized user interface. The device can infer parts of the context of the user from sensor data: the mobile device can include sensors for acceleration, noise level, luminosity, humidity, etc. In this paper we consider context recognition by unsupervised segmentation of time series produced by sensors. Dynamic programming can be used to find segments that minimize the intra-segment variances. While this method produces optimal solutions, it is too slow for long sequences of data. We present and analyze randomized variations of the algorithm. One of them, global iterative replacement or GIR, gives approximately optimal results in a fraction of the time required by dynamic programming. We demonstrate the use of time series segmentation in context recognition for mobile phone applications.
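The dynamic-programming baseline described above can be sketched as follows. This is an illustrative version only: it assumes a scalar time series, a fixed number of segments `k`, and the sum of squared deviations from each segment mean as the cost. The randomized GIR variant from the paper is not shown.

```python
def segment(series, k):
    """Split `series` into k contiguous segments minimizing the total
    within-segment sum of squared deviations. Returns (bounds, cost),
    where bounds is a list of half-open (start, end) index pairs."""
    n = len(series)
    prefix, prefix_sq = [0.0], [0.0]
    for v in series:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def cost(i, j):
        # Squared error of series[i:j] around its own mean, in O(1).
        s = prefix[j] - prefix[i]
        return prefix_sq[j] - prefix_sq[i] - s * s / (j - i)

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for seg in range(1, k + 1):
        for j in range(seg, n + 1):
            for i in range(seg - 1, j):          # last segment is series[i:j]
                c = best[seg - 1][i] + cost(i, j)
                if c < best[seg][j]:
                    best[seg][j], cut[seg][j] = c, i

    bounds, j = [], n
    for seg in range(k, 0, -1):                  # backtrack the optimal cuts
        i = cut[seg][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1], best[k][n]


readings = [1, 1, 1, 5, 5, 5, 9, 9]
print(segment(readings, 3))
```

The triple loop costs O(k n^2) time, which illustrates why the exact method becomes too slow for long sensor sequences and motivates approximate alternatives.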
TL;DR: An integrated human shape modeling, detection, and body part localization vision system is presented that can detect pedestrians in various shapes, sizes, postures, partial occlusion, and clothing from a moving vehicle using stereo cameras; a recursive context reasoning algorithm is proposed to cope with unreliable contour extraction.
Abstract: This dissertation presents an integrated human shape modeling, detection, and body part localization vision system. It demonstrates that the system can (1) detect pedestrians in various shapes, sizes, postures, partial occlusion, and clothing from a moving vehicle using stereo cameras; (2) locate the joints of a person automatically and accurately without employing any markers around the joints.
The following contributions distinguish this dissertation from previous work: (1) Dressed human modeling and dynamic model assembling: Unlike previous work that employs a fixed human body model or global deformable template to perform human detection, in this dissertation merged body parts are introduced to represent the deformations caused by clothing, segmentation errors, or low image resolution. A dressed human model is dynamically assembled from the model parts in the recognition step; the shapes of the body parts and the size and spatial relationships between them (the contextual information) are represented as invariant under translation, rotation, and scaling. Therefore, the system can detect people in different clothes, positions, sizes, and orientations. (2) Bayesian similarity measure: A probabilistic similarity measure is derived from the human model that combines the local shape and global relationship constraints to guide body part identification and human detection. Thus, the identification of a part does not only depend on its own shape but also the contextual constraints from other parts. In contrast with previous work, the proposed similarity measure enables efficient shape matching and comparison robust to articulation, partial occlusion, and segmentation errors through coarse-to-fine human model assembling. (3) Recursive context reasoning algorithm: Contour-based human detection depends on reliable contour extraction, but contour extraction is an under-constrained problem without the knowledge about the objects to be detected. Unlike previous work that assumes perfect and complete contours are available, this dissertation proposes a recursive context reasoning (RCR) algorithm to solve the above dilemma. 
A contour updating procedure is introduced to integrate the human model and the identified body parts to predict the shapes and locations of the parts missed by the contour detector; the refined contours are used to reevaluate the Bayesian similarity measure and to determine if a person is present or not. Therefore, contour extraction, body part localization, and human detection are improved iteratively.
TL;DR: The presented data show rapid and parallel activation of different areas within complex neuronal networks, including early activity of brain regions remote from the primary sensory areas, and indicate information exchange between homologous areas of the two hemispheres in cases where unilateral stimulus presentation requires interhemispheric transfer.
TL;DR: It is shown that low-level features and mid-level view classes can be combined to extract more information about the game, via the example of detecting grass orientation in the field, and the best result in segmentation is 86.5%.
Abstract: In this paper, we present a novel system and effective algorithms for soccer video segmentation. The output, about whether the ball is in play, reveals high-level structure of the content. The first step is to classify each sample frame into 3 kinds of view using a unique domain-specific feature, grass-area-ratio. Here the grass value and classification rules are learned and automatically adjusted to each new clip. Then heuristic rules are used to process the view label sequence and obtain the play/break status of the game. The results provide a good basis for detailed content analysis in the next step. We also show that low-level features and mid-level view classes can be combined to extract more information about the game, via the example of detecting grass orientation in the field. The results are evaluated under different metrics intended for different applications; the best result in segmentation is 86.5%.
TL;DR: This work shows how to find people by finding candidate body segments, and then constructing assemblies of segments that are consistent with the constraints on the appearance of a person that result from kinematic properties, using an efficient projection algorithm for one popular classifier.
Abstract: Finding people in pictures presents a particularly difficult object recognition problem. We show how to find people by finding candidate body segments, and then constructing assemblies of segments that are consistent with the constraints on the appearance of a person that result from kinematic properties. Since a reasonable model of a person requires at least nine segments, it is not possible to inspect every group, due to the huge combinatorial complexity.
We propose two approaches to this problem. In one, the search can be pruned by using projected versions of a classifier that accepts groups corresponding to people. We describe an efficient projection algorithm for one popular classifier, and demonstrate that our approach can be used to determine whether images of real scenes contain people.
The second approach employs a probabilistic framework, so that we can draw samples of assemblies with probabilities proportional to their likelihood, which allows us to draw human-like assemblies more often than non-person ones. The main performance problem is in segmentation of images, but the overall results of both approaches on real images of people are encouraging.
TL;DR: In this paper, customer segments were formed by cluster analysis of combinations of customer ratings for different attitudinal dimensions and benefits of bank service, and four characteristic groups of customers were identified showing special preferences for and against information services and technology.
Abstract: Segmentation by demographic factors is widely used in bank marketing despite the fact that the correlation of such factors with the needs of customers is often weak. Segmentation by expected benefits and attitudes could enhance a bank’s ability to address the conflict between individual service and cost‐saving standardisation. Using cluster analysis, segments were formed based on combinations of customer ratings for different attitudinal dimensions and benefits of bank service. The clusters generated in this way were superior in their homogeneity and profile to customer segments gained by referring to demographic differences. Additionally, four characteristic groups of customers were identified showing special preferences for and against information services and technology.
TL;DR: A number of evolutionary agents are uniformly distributed in the 2-D image environment to detect the skin-like pixels and segment each face-like region by activating their evolutionary behaviors, and wavelet decomposition is applied to each region to detect possible facial features.
TL;DR: A novel multiresolution image segmentation algorithm designed to separate a sharply focused object-of-interest from other foreground or background objects and provides better accuracy at higher speed.
Abstract: Unsupervised segmentation of images with low depth of field (DOF) is highly useful in various applications. This paper describes a novel multiresolution image segmentation algorithm for low DOF images. The algorithm is designed to separate a sharply focused object-of-interest from other foreground or background objects. The algorithm is fully automatic in that all parameters are image independent. A multi-scale approach based on high frequency wavelet coefficients and their statistics is used to perform context-dependent classification of individual blocks of the image. Unlike other edge-based approaches, our algorithm does not rely on the process of connecting object boundaries. The algorithm has achieved high accuracy when tested on more than 100 low DOF images, many with inhomogeneous foreground or background distractions. Compared with the state-of-the-art algorithms, this new algorithm provides better accuracy at higher speed.
TL;DR: In this article, a systematic approach is proposed to automatically extract geometric surface features from a point cloud composed of a set of unorganized three-dimensional coordinate points by data segmentation.
Abstract: A systematic approach is proposed to automatically extract geometric surface features from a point cloud composed of a set of unorganized three-dimensional coordinate points by data segmentation. The point cloud is sampled from the boundary surface of a mechanical component with arbitrary shape. The proposed approach is composed of three steps. In the first step, a mesh surface domain is reconstructed to establish an explicit topological relation among the discrete points. The topological adjacency is further optimized to recover the second order object geometry. In the second step, curvature-based border detection is applied on the irregular mesh to extract both sharp borders with tangent discontinuity and smooth borders with curvature discontinuity. Finally, the mesh patches separated by the extracted borders are grouped together in the third step. For objects with complex shape, a multilevel segmentation scheme is proposed for better results. The capability of the proposed approach is demonstrated using various point clouds having distinct characteristics. Integrated with state-of-the-art scanning devices, the developed segmentation scheme can support reverse engineering of high precision mechanical components. It has potential applications in a whole spectrum of engineering problems with a major impact on rapid design and prototyping, shape analysis, and virtual reality.
TL;DR: It is shown that for multiband images, multithresholding subsets of bands followed by a fusion stage results in improved performance and running time.
TL;DR: Common approaches to segmenting moving objects in image sequences are reviewed, including temporal segmentation, spatial segmentation, and combined temporal-spatial segmentation.
Abstract: Segmentation of objects in image sequences is very important in many aspects of multimedia applications. In second-generation image/video coding, images are segmented into objects to achieve efficient compression by coding the contour and texture separately. As the purpose is to achieve high compression performance, the objects segmented may not be semantically meaningful to human observers. The more recent applications, such as content-based image/video retrieval and image/video composition, require that the segmented objects be semantically meaningful. Indeed, the recent multimedia standard MPEG-4 specifies that a video is composed of meaningful video objects. Although many segmentation techniques have been proposed in the literature, fully automatic segmentation tools for general applications are currently not achievable. This paper provides a review of this important and challenging area of segmentation of moving objects. We describe common approaches including temporal segmentation, spatial segmentation, and the combination of temporal-spatial segmentation. As an example, a complete segmentation scheme, which is an informative part of MPEG-4, is summarized.
TL;DR: A spatial/symbolic model of the inner organs is developed, which is based on more than 1000 cryosections and congruent fresh and frozen CT images of the male Visible Human, offering an unsurpassed photorealistic presentation and level of detail.
TL;DR: The average brain of the domestic pig is used as a target for registration of dynamic PET data, so that time-activity curves can be extracted from standard volumes of interest.