TL;DR: A parametric model for an implicit representation of the segmenting curve is derived by applying principal component analysis to a collection of signed distance representations of the training data; the model parameters are then adjusted to minimize an objective function for segmentation.
Abstract: We propose a shape-based approach to curve evolution for the segmentation of medical images containing known object types. In particular, motivated by the work of Leventon, Grimson, and Faugeras (2000), we derive a parametric model for an implicit representation of the segmenting curve by applying principal component analysis to a collection of signed distance representations of the training data. The parameters of this representation are then manipulated to minimize an objective function for segmentation. The resulting algorithm is able to handle multidimensional data, can deal with topological changes of the curve, is robust to noise and initial contour placements, and is computationally efficient. At the same time, it avoids the need for point correspondences during the training phase of the algorithm. We demonstrate this technique by applying it to two medical applications: two-dimensional segmentation of cardiac magnetic resonance imaging (MRI) and three-dimensional segmentation of prostate MRI.
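The shape model described above can be illustrated with a small sketch. It is not the authors' implementation: the training shapes are assumed to be concentric circles on a tiny grid, the leading principal component is found by plain power iteration, and the function names (`circle_sdf`, `first_principal_component`, `shape_from_weight`) are hypothetical. The objective-function minimization over the weights is not shown.

```python
import math

def circle_sdf(n, cx, cy, r):
    """Signed distance to a circle on an n x n grid, flattened; negative inside."""
    return [math.hypot(x - cx, y - cy) - r for y in range(n) for x in range(n)]

def mean_vector(rows):
    """Element-wise mean of equally long vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def first_principal_component(rows, iters=50):
    """Mean and leading eigenvector of the sample covariance (power iteration)."""
    mu = mean_vector(rows)
    centered = [[a - b for a, b in zip(row, mu)] for row in rows]
    v = [1.0 + 0.01 * i for i in range(len(mu))]  # deterministic start vector
    for _ in range(iters):
        # One step of v <- X^T (X v), followed by normalization.
        proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(len(mu))]
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    if sum(v) < 0:  # fix the sign ambiguity of the eigenvector
        v = [-a for a in v]
    return mu, v

def shape_from_weight(mu, pc, weight):
    """Implicit shape: grid cells where mean + weight * pc is negative."""
    return [m + weight * p < 0 for m, p in zip(mu, pc)]

# Assumed training set: circles of different radii around the same center.
N = 21
training = [circle_sdf(N, 10, 10, r) for r in (4.0, 5.0, 6.0, 7.0)]
mean_sdf, pc1 = first_principal_component(training)

# Varying the single shape parameter grows or shrinks the implicit shape.
areas = {w: sum(shape_from_weight(mean_sdf, pc1, w)) for w in (-20.0, 0.0, 20.0)}
```

Because the shape is defined as the negative region of a function on the grid, no point correspondences between training contours are needed, which matches the property claimed in the abstract.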
TL;DR: Results show how visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification.
Abstract: In this paper we study the statistical properties of natural images belonging to different categories and their relevance for scene and object categorization tasks. We discuss how second-order statistics are correlated with image categories, scene scale and objects. We propose how scene categorization could be computed in a feedforward manner in order to provide top-down and contextual information very early in the visual processing chain. Results show how visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification. We show how simple image statistics can be used to predict the presence and absence of objects in the scene before exploring the image.
TL;DR: The aim is to reframe the multiresolution-based fusion methodology into a common formalism and to develop a new region-based approach which combines aspects of both object and pixel-level fusion.
TL;DR: A general framework for parsing images into regions and objects, which makes use of bottom-up proposals combined with top-down generative models using the data driven Markov chain Monte Carlo algorithm, which is guaranteed to converge to the optimal estimate asymptotically.
Abstract: We propose a general framework for parsing images into regions and objects. In this framework, the detection and recognition of objects proceed simultaneously with image segmentation in a competitive and cooperative manner. We illustrate our approach on natural images of complex city scenes where the objects of primary interest are faces and text. This method makes use of bottom-up proposals combined with top-down generative models using the data driven Markov chain Monte Carlo (DDMCMC) algorithm, which is guaranteed to converge to the optimal estimate asymptotically. More precisely, we define generative models for faces, text, and generic regions, e.g., shading, texture, and clutter. These models are activated by bottom-up proposals. The proposals for faces and text are learnt using a probabilistic version of AdaBoost. The DDMCMC combines reversible jump and diffusion dynamics to enable the generative models to explain the input images in a competitive and cooperative manner. Our experiments illustrate the advantages and importance of combining bottom-up and top-down models and of performing segmentation and object detection/recognition simultaneously.
TL;DR: Analysis of multispectral or multitemporal images requires proper geometric alignment of the images to compare corresponding regions in each image volume, and voxel-based approaches consider all voxels in the image without the need for segmentation.
Abstract: Analysis of multispectral or multitemporal images requires proper geometric alignment of the images to compare corresponding regions in each image volume. Retrospective three-dimensional alignment or registration of multimodal medical images based on features intrinsic to the image data itself is complicated by their different photometric properties, by the complexity of the anatomical objects in the scene and by the large variety of clinical applications in which registration is involved. While the accuracy of registration approaches based on matching of anatomical landmarks or object surfaces suffers from segmentation errors, voxel-based approaches consider all voxels in the image without the need for segmentation. The recent introduction of the criterion of maximization of mutual information, a basic concept from information theory, has proven to be a breakthrough in the field. While solutions for intrapatient affine registration based on this concept are already commercially available, current research in the field focuses on interpatient nonrigid matching.
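As a rough illustration of the mutual information criterion mentioned above, the sketch below estimates mutual information from the joint histogram of two small intensity sequences. The data and the function name `mutual_information` are made up for illustration; a real registration method would evaluate this over image volumes while searching over transformations.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information in bits between two equally long intensity sequences."""
    n = len(a)
    joint = Counter(zip(a, b))  # joint histogram of intensity pairs
    pa = Counter(a)             # marginal histogram of the first image
    pb = Counter(b)             # marginal histogram of the second image
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (pa[x] * pb[y]))
    return mi

# Hypothetical "images": same structure but a different intensity mapping,
# as happens between modalities with different photometric properties.
fixed = [0, 0, 1, 1, 2, 2, 3, 3]
remapped = [7, 7, 5, 5, 9, 9, 1, 1]
shifted = remapped[1:] + remapped[:1]  # misaligned by one sample

mi_aligned = mutual_information(fixed, remapped)
mi_shifted = mutual_information(fixed, shifted)
```

In this toy case the aligned pair scores higher than the shifted pair even though the two sequences share no intensity values, which is why the measure suits multimodal data.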
TL;DR: A domain-independent topic segmentation algorithm for multi-party speech that combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech.
Abstract: We present a domain-independent topic segmentation algorithm for multi-party speech. Our feature-based algorithm combines knowledge about content using a text-based algorithm as a feature and about form using linguistic and acoustic cues about topic shifts extracted from speech. This segmentation algorithm uses automatically induced decision rules to combine the different features. The embedded text-based algorithm builds on lexical cohesion and has performance comparable to state-of-the-art algorithms based on lexical information. A significant error reduction is obtained by combining the two knowledge sources.
TL;DR: By incorporating the atlas information into the Bayesian framework, segmentation results clearly showed improvements over a standard unsupervised segmentation method.
Abstract: There have been significant efforts to build a probabilistic atlas of the brain and to use it for many common applications, such as segmentation and registration. Though the work related to brain atlases can be applied to nonbrain organs, less attention has been paid to actually building an atlas for organs other than the brain. Motivated by the automatic identification of normal organs for applications in radiation therapy treatment planning, we present a method to construct a probabilistic atlas of an abdomen consisting of four organs (i.e., liver, kidneys, and spinal cord). Using 32 noncontrast abdominal computed tomography (CT) scans, 31 were mapped onto one individual scan using thin plate spline as the warping transform and mutual information (MI) as the similarity measure. Except for an initial coarse placement of four control points by the operators, the MI-based registration was automatic. Additionally, the four organs in each of the 32 CT data sets were manually segmented. The manual segmentations were warped onto the "standard" patient space using the same transform computed from their gray scale CT data set and a probabilistic atlas was calculated. Then, the atlas was used to aid the segmentation of low-contrast organs in an additional 20 CT data sets not included in the atlas. By incorporating the atlas information into the Bayesian framework, segmentation results clearly showed improvements over a standard unsupervised segmentation method.
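The effect of adding atlas information to a Bayesian classifier can be sketched for a single voxel. This is a minimal illustration, not the method of the paper: the Gaussian intensity models, the prior values, and the names `gaussian` and `map_label` are all assumptions, and a flat prior stands in for a segmentation that uses no atlas.

```python
import math

def gaussian(x, mean, std):
    """Gaussian probability density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def map_label(intensity, prior, models):
    """Label maximizing prior[label] * likelihood(intensity | label)."""
    return max(models,
               key=lambda k: prior.get(k, 0.0) * gaussian(intensity, *models[k]))

# Hypothetical (mean, std) intensity models for two low-contrast organs.
models = {"liver": (60.0, 10.0), "kidney": (55.0, 10.0)}

flat_prior = {"liver": 0.5, "kidney": 0.5}   # no spatial information
atlas_prior = {"liver": 0.1, "kidney": 0.9}  # assumed atlas value at this voxel

label_without_atlas = map_label(58.0, flat_prior, models)
label_with_atlas = map_label(58.0, atlas_prior, models)
```

When the intensity distributions overlap heavily, the likelihood alone barely separates the classes, so the spatial prior decides the label.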
TL;DR: This work poses the problem of segmenting individual humans in crowded situations from stationary video camera sequences as a "model-based segmentation" problem in which human shape models are used to interpret the foreground in a Bayesian framework using an efficient Markov chain Monte Carlo method.
Abstract: The problem of segmenting individual humans in crowded situations from stationary video camera sequences is exacerbated by object inter-occlusion. We pose this problem as a "model-based segmentation" problem in which human shape models are used to interpret the foreground in a Bayesian framework. The solution is obtained by using an efficient Markov chain Monte Carlo (MCMC) method that uses domain knowledge as proposal probabilities. Knowledge of various aspects including human shape, human height, camera model, and image cues including human head candidates, foreground/background separation are integrated in one theoretically sound framework. We show promising results and evaluations on some challenging data.
TL;DR: This work has shown that tight clustering of nuclei in 3D confocal microscope images is a common source of segmentation error, and that there is a compelling need to minimize these errors for constructing highly automated scoring systems.
TL;DR: A new cost function, cut ratio, for segmenting images using graph-based methods that allows the image perimeter to be segmented, guarantees that the segments produced by bipartitioning are connected, and does not introduce a size, shape, smoothness, or boundary-length bias.
Abstract: This paper proposes a new cost function, cut ratio, for segmenting images using graph-based methods. The cut ratio is defined as the ratio of the corresponding sums of two different weights of edges along the cut boundary and models the mean affinity between the segments separated by the boundary per unit boundary length. This new cost function allows the image perimeter to be segmented, guarantees that the segments produced by bipartitioning are connected, and does not introduce a size, shape, smoothness, or boundary-length bias. The latter allows it to produce segmentations where boundaries are aligned with image edges. Furthermore, the cut-ratio cost function allows efficient iterated region-based segmentation as well as pixel-based segmentation. These properties may be useful for some image-segmentation applications. While the problem of finding a minimum ratio cut in an arbitrary graph is NP-hard, one can find a minimum ratio cut in the connected planar graphs that arise during image segmentation in polynomial time. While the cut ratio, alone, is not sufficient as a baseline method for image segmentation, it forms a good basis for an extended method of image segmentation when combined with a small number of standard techniques. We present an implemented algorithm for finding a minimum ratio cut, prove its correctness, discuss its application to image segmentation, and present the results of segmenting a number of medical and natural images using our techniques.
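The cut-ratio definition above (a ratio of two sums of edge weights along the cut boundary) can be written down directly. The sketch below only evaluates the cost of a given bipartition; it does not implement the minimum-ratio-cut search. The edge tuple layout and the toy graph are assumptions made for illustration.

```python
def cut_ratio(edges, segment):
    """Ratio of summed first weights to summed second weights over cut edges.

    edges: iterable of (u, v, w1, w2), where w1 is assumed to be an affinity
    and w2 a boundary-length contribution; segment: set of nodes on one side.
    """
    num = 0.0
    den = 0.0
    for u, v, w1, w2 in edges:
        if (u in segment) != (v in segment):  # edge crosses the boundary
            num += w1
            den += w2
    if den == 0:
        raise ValueError("the bipartition has no boundary edges")
    return num / den

# Toy 4-node cycle: strong affinity inside {a, b} and inside {c, d}.
edges = [
    ("a", "b", 0.9, 1.0),
    ("b", "c", 0.1, 1.0),
    ("c", "d", 0.8, 1.0),
    ("d", "a", 0.2, 1.0),
]

good_cut = cut_ratio(edges, {"a", "b"})  # cuts the two weak edges
poor_cut = cut_ratio(edges, {"a"})       # cuts one strong and one weak edge
```

Both cuts cross two edges, so the difference in cost comes only from the mean affinity across the boundary, not from its length.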
TL;DR: This work examines a maximum a posteriori decoding strategy for feature-based recognizers and develops a normalization criterion useful for a segment-based speech recognizer.
TL;DR: A hierarchical deformation strategy is employed, in which the model adaptively focuses on the similarity of different Gabor features at different deformation stages using a multiresolution technique, i.e., coarse features first and fine features later.
Abstract: Presents a statistical shape model for the automatic prostate segmentation in transrectal ultrasound images. A Gabor filter bank is first used to characterize the prostate boundaries in ultrasound images in both multiple scales and multiple orientations. The Gabor features are further reconstructed to be invariant to the rotation of the ultrasound probe and incorporated in the prostate model as image attributes for guiding the deformable segmentation. A hierarchical deformation strategy is then employed, in which the model adaptively focuses on the similarity of different Gabor features at different deformation stages using a multiresolution technique, i.e., coarse features first and fine features later. A number of successful experiments validate the algorithm.
TL;DR: The tracking module presented here is divided into the following three procedures: segmentation, matching and prediction, which is important to limit the region of processing, thus reducing the execution time.
TL;DR: Applications of the nonlinear shape statistics in segmentation and tracking of 2D and 3D objects demonstrate that the segmentation process can incorporate knowledge on a large variety of complex real-world shapes.
TL;DR: Experiments on a database of about 4 hours of audio data show that the proposed classifier is very efficient for audio classification and segmentation, and that the accuracy of the SVM-based method is much better than that of methods based on KNN and GMM.
Abstract: Content-based audio classification and segmentation is a basis for further audio/video analysis. In this paper, we present our work on audio segmentation and classification which employs support vector machines (SVMs). Five audio classes are considered in this paper: silence, music, background sound, pure speech, and non-pure speech, which includes speech over music and speech over noise. A sound stream is segmented by classifying each sub-segment into one of these five classes. We have evaluated the performance of SVM on classification of different audio type pairs with testing units of different lengths, and compared the performance of SVM, K-Nearest Neighbor (KNN), and Gaussian Mixture Model (GMM). We also evaluated the effectiveness of some newly proposed features. Experiments on a database of about 4 hours of audio data show that the proposed classifier is very efficient for audio classification and segmentation. They also show that the accuracy of the SVM-based method is much better than that of the methods based on KNN and GMM.
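Segmenting a stream by classifying each sub-segment amounts to merging runs of identical labels. The sketch below shows only that merging step, assuming the per-unit labels have already been produced by some classifier (an SVM in the abstract; not implemented here). The function name and the one-second unit length are hypothetical.

```python
from itertools import groupby

def labels_to_segments(labels, unit_sec=1.0):
    """Merge consecutive identical labels into (start, end, label) segments."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start * unit_sec, (start + length) * unit_sec, label))
        start += length
    return segments

# Assumed classifier output, one label per one-second sub-segment.
labels = ["silence", "music", "music",
          "pure speech", "pure speech", "pure speech", "music"]
segments = labels_to_segments(labels)
```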
TL;DR: A novel grouping method is proposed, which stresses connectedness of image elements via mediating elements rather than favoring high mutual similarity, which yields superior clustering results when objects are distributed on low-dimensional extended manifolds in a feature space, and not as local point clouds.
Abstract: Perceptual grouping organizes image parts in clusters based on psychophysically plausible similarity measures. We propose a novel grouping method in this paper, which stresses connectedness of image elements via mediating elements rather than favoring high mutual similarity. This grouping principle yields superior clustering results when objects are distributed on low-dimensional extended manifolds in a feature space, and not as local point clouds. In addition to extracting connected structures, objects are singled out as outliers when they are too far away from any cluster structure. The objective function for this perceptual organization principle is optimized by a fast agglomerative algorithm. We report on perceptual organization experiments where small edge elements are grouped to smooth curves. The generality of the method is emphasized by results from grouping textured images with texture gradients in an unsupervised fashion.
TL;DR: A novel method for the categorization of unfamiliar objects in difficult real-world scenes is presented, which uses a probabilistic formulation to incorporate knowledge about the recognized category as well as the supporting information in the image to segment the object from the background.
Abstract: Historically, figure-ground segmentation has been seen as an important and even necessary precursor for object recognition. In that context, segmentation is mostly defined as a data-driven, that is, bottom-up, process. As for humans object recognition and segmentation are heavily intertwined processes, it has been argued that top-down knowledge from object recognition can and should be used for guiding the segmentation process. In this paper, we present a method for the categorization of unfamiliar objects in difficult real-world scenes. The method generates object hypotheses without prior segmentation that can be used to obtain a category-specific figure-ground segmentation. In particular, the proposed approach uses a probabilistic formulation to incorporate knowledge about the recognized category as well as the supporting information in the image to segment the object from the background. This segmentation can then be used for hypothesis verification, to further improve recognition performance. Experimental results show the capacity of the approach to categorize and segment object categories as diverse as cars and cows.
TL;DR: An automated system has been developed for brain tumor segmentation that will provide objective, reproducible segmentations that are close to the manual results, although its performance is below that of the semi-automated method.
TL;DR: This work presents a fully automated deformable model technique for myocardium segmentation in 3D MRI and investigates the extension to 4D by incorporating a constraint on the allowed deformation based on a learned example and shows illustrative results for 4D MRI.
Abstract: We present a fully automated deformable model technique for myocardium segmentation in 3D MRI. Loss of signal due to blood flow, partial volume effects and significant variation of surface grey value appearance make this a difficult problem. We integrate various sources of prior knowledge learned from annotated image data into a deformable model. Inter-individual shape variation is represented by a statistical point distribution model, and the spatial relationship of the epi- and endocardium is modeled by adapting two coupled triangular surface meshes. To robustly accommodate variation of grey value appearance around the myocardiac surface, a prior parametric spatially varying feature model is established by classification of grey value surface profiles. Quantitative validation of 60 end-diastolic 3D MRI datasets demonstrates accuracy and robustness, with 1.28±0.81 mm mean deviation from manual segmentation. We investigate the extension to 4D by incorporating a constraint on the allowed deformation based on a learned example and show illustrative results for 4D MRI.
TL;DR: In this article, a video detection and monitoring method and apparatus utilizes an application-specific object based segmentation and recognition system for locating and tracking an object of interest within a number of sequential frames of data collected by a video camera or similar device.
Abstract: A video detection and monitoring method and apparatus utilizes an application-specific object based segmentation and recognition system for locating and tracking an object of interest within a number of sequential frames of data collected by a video camera or similar device. One embodiment includes a background modeling and object segmentation module to isolate from a current frame at least one segment of the current frame containing a possible object of interest, and a classification module adapted to determine whether or not any segment of the output from the background modeling apparatus includes an object of interest and to characterize any such segment as an object segment. An object segment tracking apparatus is adapted to track the location within a current frame of any object segment and to determine a projected location of the object segment in a subsequent frame.
TL;DR: In this paper, three-dimensional position information is used to segment objects in a scene viewed by a 3D camera and at one or more instances of an interval, the head location of the user is determined.
Abstract: Three-dimensional position information is used to segment objects in a scene viewed by a three dimensional camera. At one or more instances of an interval, the head location of the user is determined. Object-based compression schemes are applied on the segmented objects and the detected head.
TL;DR: The performance of this segmentation algorithm is superior to traditional single resolution techniques such as texture spectrum, co-occurrences, local linear transforms, etc.
TL;DR: This paper reviews various segmentation proposals that integrate edge and region information and highlights different strategies and methods for fusing such information.
TL;DR: The proposed technique discriminates well between patterns of obstructive lung disease on the basis of parenchymal texture alone and was tested with a new cohort of subjects with a similar spectrum of diseases.
Abstract: An automated technique for differentiation between a variety of obstructive lung diseases on the basis of textural analysis of thin-section computed tomographic (CT) images is described. From four regions of interest on each image, local texture information was extracted and represented by a 13-dimensional vector that contained statistical moments of the CT attenuation distribution, acquisition-length parameters, and co-occurrence descriptors. A supervised Bayesian classifier was used for texture feature segmentation. The technique was tested with a new cohort of subjects (n = 33, 660 regions of interest) with a similar spectrum of diseases. The proposed technique discriminates well between patterns of obstructive lung disease on the basis of parenchymal texture alone.
TL;DR: A new Bayesian formulation for parametric image segmentation is presented, based on the key idea of using a doubly stochastic prior model for the label field, which allows one to find exact optimal estimators for both this field and the model parameters by the minimization of a differentiable function.
Abstract: Parametric image segmentation consists of finding a label field that defines a partition of an image into a set of nonoverlapping regions and the parameters of the models that describe the variation of some property within each region. A new Bayesian formulation for the solution of this problem is presented, based on the key idea of using a doubly stochastic prior model for the label field, which allows one to find exact optimal estimators for both this field and the model parameters by the minimization of a differentiable function. An efficient minimization algorithm and comparisons with existing methods on synthetic images are presented, as well as examples of realistic applications to the segmentation of Magnetic Resonance volumes and to motion segmentation.
TL;DR: A high-resolution black and white orthophoto and a subscene of a Landsat Thematic Mapper image have been used to obtain an object-oriented classification of the land cover of a study area in northern Italy.
Abstract: Object-oriented classification techniques based on image segmentation are gaining interest as methods for producing output maps directly storable into Geographical Information System (GIS) databases. A limitation in efficiently applying image segmentation is often represented by the spatial resolution of the image. This contribution proposes a method for overcoming this problem, based on the integrated use of images of different resolution. A high-resolution black and white (b/w) orthophoto and a subscene of a Landsat Thematic Mapper (TM) image have been used to obtain an object-oriented classification of the land cover of a study area in northern Italy. The method consists of a sequential application of segmentation and classification techniques. First, the TM image was classified using the maximum likelihood classifier and additional empirical rules. Subsequently, the orthophoto was segmented by applying a region-based segmentation algorithm. Finally, the classification of the segmented images was perfor...
TL;DR: Experimental results show that the determinant of the covariance matrix appears to be a very relevant tool for segmentation of homogeneous color regions for image and video segmentation using active contours.
Abstract: This paper deals with image and video segmentation using active contours. We propose a general form for the energy functional related to region-based active contours. We compute the associated evolution equation using shape derivation tools and accounting for the evolving region-based terms. Then we apply this general framework to compute the evolution equation from functionals that include various statistical measures of homogeneity for the region to be segmented. Experimental results show that the determinant of the covariance matrix appears to be a very relevant tool for segmentation of homogeneous color regions. As an example, it has been successfully applied to face segmentation in real video sequences.
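The homogeneity measure named above, the determinant of the color covariance matrix, can be computed for a region as in the sketch below. This only evaluates the descriptor on made-up pixel lists; the active-contour evolution that would use it as a region term is not shown, and the function name is hypothetical.

```python
def covariance_det(pixels):
    """Determinant of the 3x3 covariance matrix of a list of (r, g, b) pixels."""
    n = len(pixels)
    mean = [sum(p[i] for p in pixels) / n for i in range(3)]
    c = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / n
          for j in range(3)] for i in range(3)]
    # Cofactor expansion along the first row.
    return (c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
            - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
            + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0]))

# Assumed sample regions: nearly uniform color versus widely spread color.
homogeneous = [(100, 50, 50), (101, 50, 50), (100, 51, 50), (100, 50, 51)]
mixed = [(100, 50, 50), (150, 50, 50), (100, 100, 50), (100, 50, 100)]

det_homogeneous = covariance_det(homogeneous)
det_mixed = covariance_det(mixed)
```

A small determinant means the colors occupy a small volume in color space, so a region-based energy that penalizes it favors homogeneous regions.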
TL;DR: The most frequently used approach, based on a modified Hidden Markov Model (HMM) phonetic recognizer, is analyzed; a general framework for the local refinement of boundaries is proposed, and the performance of several pattern classification approaches is compared within this framework.
Abstract: This paper presents the results and conclusions of a thorough study on automatic phonetic segmentation. It starts with a review of the state of the art in this field. Then, it analyzes the most frequently used approach, based on a modified Hidden Markov Model (HMM) phonetic recognizer. For this approach, a statistical correction procedure is proposed to compensate for the systematic errors produced by context-dependent HMMs, and the use of speaker adaptation techniques is considered to increase segmentation precision. Finally, this paper explores the possibility of locally refining the boundaries obtained with the former techniques. A general framework is proposed for the local refinement of boundaries, and the performance of several pattern classification approaches (fuzzy logic, neural networks and Gaussian mixture models) is compared within this framework. The resulting phonetic segmentation scheme was able to increase the performance of a baseline HMM segmentation tool from 27.12%, 79.27%, and 97.75% of automatic boundary marks with errors smaller than 5, 20, and 50 ms, respectively, to 65.86%, 96.01%, and 99.31% in speaker-dependent mode, which is a reasonably good approximation to manual segmentation.
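The evaluation metric quoted above, the percentage of automatic boundary marks whose error is smaller than a given tolerance, can be computed as in this sketch. The boundary times are invented for illustration, and the sketch assumes the automatic and manual marks are already paired one to one.

```python
def within_tolerance(auto_ms, manual_ms, tolerances_ms=(5, 20, 50)):
    """Percentage of boundaries with absolute error below each tolerance."""
    errors = [abs(a - m) for a, m in zip(auto_ms, manual_ms)]
    return {t: 100.0 * sum(e < t for e in errors) / len(errors)
            for t in tolerances_ms}

# Hypothetical boundary positions in milliseconds.
manual = [100, 250, 400, 620]
auto = [103, 262, 430, 700]

scores = within_tolerance(auto, manual)
```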
TL;DR: The CKIP group of Academia Sinica participated in testing on the open and closed tracks of Beijing University and Hong Kong Cityu; the evaluation results show that the segmentation system performs very well in both the HK open and closed tracks and only acceptably in the PK tracks.
Abstract: In this paper, we briefly describe the procedures of our segmentation system, including the methods for resolving segmentation ambiguities and identifying unknown words. The CKIP group of Academia Sinica participated in testing on the open and closed tracks of Beijing University (PK) and Hong Kong Cityu (HK). The evaluation results show that our system performs very well in both the HK open and closed tracks and only acceptably in the PK tracks. Some explanations and analysis are presented in this paper.
TL;DR: A survey of vessel extraction techniques and algorithms, putting the various approaches and techniques in perspective by means of a classification of the existing research, targeting mainly the extraction of blood vessels, neurovascular structure in particular.
Abstract: Vessel segmentation algorithms are critical components of circulatory blood vessel analysis systems. We present a survey of vessel extraction techniques and algorithms, putting the various approaches and techniques in perspective by means of a classification of the existing research. While we target mainly the extraction of blood vessels, neurovascular structure in particular, we also review some of the segmentation methods for tubular objects that show similar characteristics to vessels. We divide vessel segmentation algorithms and techniques into six main categories: (1) pattern recognition techniques, (2) model-based approaches, (3) tracking-based approaches, (4) artificial intelligence-based approaches, (5) neural network-based approaches, and (6) miscellaneous tube-like object detection approaches. Some of these categories are further divided into sub-categories. A table compares the papers against such criteria as dimensionality, input type, preprocessing, user interaction, and result type.