Scispace (Formerly Typeset)
Showing papers on "Pyramid (image processing)" published in 2010
Proceedings Article•10.1145/1869790.1869829•
Bag-of-visual-words and spatial extensions for land-use classification

Yi Yang1, Shawn Newsam1•
University of California, Merced1
2 Nov 2010
TL;DR: This work considers a standard non-spatial representation in which the frequencies but not the locations of quantized image features are used to discriminate between classes analogous to how words are used for text document classification without regard to their order of occurrence, and considers two spatial extensions.
Abstract: We investigate bag-of-visual-words (BOVW) approaches to land-use classification in high-resolution overhead imagery. We consider a standard non-spatial representation in which the frequencies but not the locations of quantized image features are used to discriminate between classes, analogous to how words are used for text document classification without regard to their order of occurrence. We also consider two spatial extensions: the established spatial pyramid match kernel, which considers the absolute spatial arrangement of the image features, and a novel method, which we term the spatial co-occurrence kernel, that considers the relative arrangement. These extensions are motivated by the importance of spatial structure in geographic data. The methods are evaluated using a large ground truth image dataset of 21 land-use classes. In addition to comparisons with standard approaches, we perform extensive evaluation of different configurations such as the size of the visual dictionaries used to derive the BOVW representations and the scale at which the spatial relationships are considered. We show that even though BOVW approaches do not necessarily perform better than the best standard approaches overall, they represent a robust alternative that is more effective for certain land-use classes. We also show that extending the BOVW approach with our proposed spatial co-occurrence kernel consistently improves performance.

2,876 citations

Journal Article•10.1007/S11263-009-0268-3•
Gaussian Processes for Object Categorization

Ashish Kapoor1, Kristen Grauman2, Raquel Urtasun3, Trevor Darrell3•
Microsoft1, University of Texas at Austin2, University of California, Berkeley3
01 Jun 2010-International Journal of Computer Vision
TL;DR: This work shows that with an appropriate combination of kernels a significant boost in classification performance is possible, and indicates the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.
Abstract: Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. Our probabilistic formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal combination of multiple covariance functions. It also offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We show that with an appropriate combination of kernels a significant boost in classification performance is possible. Further, our experiments indicate the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.

226 citations
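
The abstract above names the Pyramid Match Kernel but does not define it. As a rough illustration only, the sketch below follows the commonly described formulation: histogram intersections are computed at increasingly coarse bin widths, and matches that first appear at a coarser level receive a smaller weight. The 1-D integer features, the bin widths `2 ** i`, and the function names are assumptions made for this example, not details taken from the paper, and no normalization is applied.

```python
from collections import Counter


def histogram(points, bin_width):
    """Count 1-D points per bin of the given width."""
    return Counter(int(p // bin_width) for p in points)


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(count, h2[b]) for b, count in h1.items())


def pyramid_match(xs, ys, levels):
    """Unnormalized pyramid match score for two sets of 1-D features."""
    score = 0.0
    previous = 0  # matches already counted at finer levels
    for i in range(levels):
        width = 2 ** i  # bins double in size at each level
        current = intersection(histogram(xs, width), histogram(ys, width))
        # Only matches that are new at this level contribute, weighted by 1 / width.
        score += (current - previous) / width
        previous = current
    return score


print(pyramid_match([0, 5], [1, 5], levels=3))
```

In this toy call, the pair (5, 5) matches at the finest level with weight 1, while (0, 1) only matches once the bins widen to 2, so it contributes with weight 1/2.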

Journal Article•10.1109/TIFS.2009.2038751•
Face Verification Across Age Progression Using Discriminative Methods

Haibin Ling1, Stefano Soatto2, Narayanan Ramanathan, David W. Jacobs3•
Temple University1, University of California, Los Angeles2, University of Maryland, College Park3
01 Mar 2010-IEEE Transactions on Information Forensics and Security
TL;DR: It is found that the added difficulty of verification produced by age gaps becomes saturated after the gap is larger than four years, for gaps of up to ten years, and image quality and eyewear present more of a challenge than facial hair.
Abstract: Face verification in the presence of age progression is an important problem that has not been widely addressed. In this paper, we study the problem by designing and evaluating discriminative approaches. These directly tackle verification tasks without explicit age modeling, which is a hard problem by itself. First, we find that the gradient orientation, after discarding magnitude information, provides a simple but effective representation for this problem. This representation is further improved when hierarchical information is used, which results in the use of the gradient orientation pyramid (GOP). When combined with a support vector machine, GOP demonstrates excellent performance in all our experiments, in comparison with seven different approaches including two commercial systems. Our experiments are conducted on the FGnet dataset and two large passport datasets, one of them being the largest ever reported for recognition tasks. Second, taking advantage of these datasets, we empirically study how age gaps and related issues (including image quality, spectacles, and facial hair) affect recognition algorithms. Surprisingly, we found that the added difficulty of verification produced by age gaps becomes saturated after the gap is larger than four years, for gaps of up to ten years. In addition, we find that image quality and eyewear present more of a challenge than facial hair.

217 citations
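
The gradient orientation pyramid described in the abstract above can be illustrated with a minimal sketch. It assumes central-difference gradients, 2x2 block averaging for downsampling, and unit gradient vectors as the orientation encoding; the abstract does not specify any of these choices, so the helper names and parameters here are hypothetical.

```python
import math


def downsample(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def gradient_orientation(img, eps=1e-9):
    """Unit gradient vectors (gx, gy) per interior pixel; magnitude is discarded."""
    out = []
    for r in range(1, len(img) - 1):
        row = []
        for c in range(1, len(img[0]) - 1):
            gx = (img[r][c + 1] - img[r][c - 1]) / 2.0
            gy = (img[r + 1][c] - img[r - 1][c]) / 2.0
            mag = math.hypot(gx, gy)
            # Flat regions have no defined orientation; encode them as (0, 0).
            row.append((gx / mag, gy / mag) if mag > eps else (0.0, 0.0))
        out.append(row)
    return out


def gradient_orientation_pyramid(img, levels):
    """Orientation maps computed at successively halved resolutions."""
    pyramid = []
    for _ in range(levels):
        pyramid.append(gradient_orientation(img))
        img = downsample(img)
    return pyramid


ramp = [[float(c) for c in range(8)] for _ in range(8)]
brighter = [[3.0 * v for v in row] for row in ramp]
# Scaling the contrast changes gradient magnitude but not orientation.
print(gradient_orientation_pyramid(ramp, 2) == gradient_orientation_pyramid(brighter, 2))
```

Because only the direction of the gradient is kept, the two ramps above yield identical pyramids even though one has three times the contrast of the other.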

Journal Article•10.5120/1357-1832•
Implementation and Comparative Study of Image Fusion Algorithms

Shivsubramani Krishnamoorthy, K. P. Soman
11 Oct 2010-International Journal of Computer Applications
TL;DR: This paper discusses the implementation of three categories of image fusion algorithms – the basic fusion algorithms, the pyramid based algorithms and the basic DWT algorithms, developed as an Image Fusion Toolkit - ImFus, using Visual C++ 6.0.
Abstract: Image Fusion is a process of combining the relevant information from a set of images, into a single image, wherein the resultant fused image will be more informative and complete than any of the input images. This paper discusses the implementation of three categories of image fusion algorithms – the basic fusion algorithms, the pyramid based algorithms and the basic DWT algorithms, developed as an Image Fusion Toolkit - ImFus, using Visual C++ 6.0. The objective of the paper is to assess the wide range of algorithms together, which is not found in the literature. The fused images were assessed using Structural Similarity Image Metric (SSIM) [10], Laplacian Mean Squared Error along with seven other simple image quality metrics that helped us measure the various image features; which were also implemented as part of the toolkit. The readings produced by the image quality metrics, based on the image quality of the fused images, were used to assess the algorithms. We used Pareto Optimization method to figure out the algorithm that consistently had the image quality metrics produce the best readings. An assessment of the quality of the fused images was additionally performed with the help of ten respondents based on their visual perception, to verify the results produced by the metric based assessment. Coincidentally, both the assessment methods matched in their ranking of the algorithms. The Pareto Optimization method picked DWT with Haar fusion method as the one with the best image quality metrics readings. The result here was substantiated by the visual perception based method where it was inferred that fused images produced by DWT with Haar fusion method were marked the best 63.33% of the time, which was far better than any other algorithm. Both the methods also matched in assessing Morphological Pyramid method as producing fused images of inferior quality.

143 citations
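
To make the idea of DWT-based fusion with a Haar wavelet concrete, here is a minimal single-level sketch. The fusion rule (average the approximation coefficients, keep the detail coefficient with the larger absolute value) is a common textbook choice and an assumption here; the abstract does not state which rule the ImFus toolkit uses, and the function names are hypothetical. Images are assumed to have even height and width.

```python
def haar_forward(img):
    """Single-level 2-D Haar transform: one (LL, LH, HL, HH) tuple per 2x2 block."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    coeffs = []
    for r in range(rows):
        line = []
        for c in range(cols):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            p, q = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            line.append(((a + b + p + q) / 2.0, (a - b + p - q) / 2.0,
                         (a + b - p - q) / 2.0, (a - b - p + q) / 2.0))
        coeffs.append(line)
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding each 2x2 block from its four coefficients."""
    img = [[0.0] * (2 * len(coeffs[0])) for _ in range(2 * len(coeffs))]
    for r, line in enumerate(coeffs):
        for c, (ll, lh, hl, hh) in enumerate(line):
            img[2 * r][2 * c] = (ll + lh + hl + hh) / 2.0
            img[2 * r][2 * c + 1] = (ll - lh + hl - hh) / 2.0
            img[2 * r + 1][2 * c] = (ll + lh - hl - hh) / 2.0
            img[2 * r + 1][2 * c + 1] = (ll - lh - hl + hh) / 2.0
    return img


def fuse(img_a, img_b):
    """Fuse two equally sized images in the Haar domain (assumed rule, see lead-in)."""
    fused = []
    for line_a, line_b in zip(haar_forward(img_a), haar_forward(img_b)):
        line = []
        for ka, kb in zip(line_a, line_b):
            ll = (ka[0] + kb[0]) / 2.0  # average the approximation band
            # Keep the stronger detail coefficient from either source.
            details = tuple(x if abs(x) >= abs(y) else y
                            for x, y in zip(ka[1:], kb[1:]))
            line.append((ll,) + details)
        fused.append(line)
    return haar_inverse(fused)


blurred = [[2, 2], [2, 2]]
sharp = [[0, 4], [0, 4]]
print(fuse(blurred, sharp))
```

With these inputs the vertical edge from the sharper image survives fusion, because its detail coefficient dominates the zero detail of the flat image.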

Proceedings Article•10.1109/BTAS.2010.5634507•
On matching sketches with digital face images

Himanshu Bhatt1, Samarth Bharadwaj1, Richa Singh1, Mayank Vatsa1•
Indraprastha Institute of Information Technology1
11 Nov 2010
TL;DR: In this article, a genetic optimization based approach is proposed to find the optimum weights corresponding to each facial region for matching, and the information obtained from different levels of the Laplacian pyramid is combined to improve the identification accuracy.
Abstract: This paper presents an efficient algorithm for matching sketches with digital face images. The algorithm extracts discriminating information present in local facial regions at different levels of granularity. Both sketches and digital images are decomposed into a multi-resolution pyramid to conserve high frequency information which forms the discriminating facial patterns. Extended uniform circular local binary pattern based descriptors use these patterns to form a unique signature of the face image. Further, for matching, a genetic optimization based approach is proposed to find the optimum weights corresponding to each facial region. The information obtained from different levels of the Laplacian pyramid is combined to improve the identification accuracy. Experimental results on sketch-digital image pairs from the CUHK and IIIT-D databases show that the proposed algorithm can provide better identification performance compared to existing algorithms.

103 citations

Journal Article•10.1016/J.IMAVIS.2009.06.012•
Adaptive pyramid mean shift for global real-time visual tracking

Shuxiao Li1, Hongxing Chang1, Chengfei Zhu1•
Chinese Academy of Sciences1
01 Mar 2010-Image and Vision Computing
TL;DR: A novel approach for global target tracking based on mean shift technique is proposed, termed as adaptive pyramid mean shift, because it uses the pyramid analysis technique and can determine the pyramid level adaptively to decrease the number of iterations required to achieve convergence.

64 citations

Journal Article•10.5120/1638-2202•
IRIS Recognition using Texture Features Extracted from Haarlet Pyramid

Dr.H.B. Kekre, Sudeep D. Thepade, Juhi Jain, Naman Agrawal
12 Oct 2010-International Journal of Computer Applications
TL;DR: The paper presents a novel Haarlet Pyramid based iris recognition technique, which is done using the image feature set extracted from Haar Wavelets at various levels of decomposition, and shows that Haarlets level-5 outperforms other Haarlets.
Abstract: Iris recognition has been a fast growing, challenging and interesting area in real-time applications. A large number of iris recognition algorithms have been developed over the decades. The paper presents a novel Haarlet Pyramid based iris recognition technique. Here iris recognition is done using the image feature set extracted from Haar Wavelets at various levels of decomposition. The proposed method was analyzed in terms of the False Acceptance Rate and the Genuine Acceptance Rate. The proposed technique is tested on an iris image database having 384 images. The results show that Haarlets level-5 outperforms other Haarlets, because the higher level Haarlets are giving very fine texture features while the lower level Haarlets are representing very coarse texture features which are less useful for discrimination of images in iris recognition.

53 citations

Journal Article•10.1016/J.IMAVIS.2009.06.011•
Accurate and speedy computation of image Legendre moments for computer vision applications

George A. Papakostas1, E. G. Karakasis1, Dimitrios E. Koulouriotis1•
Democritus University of Thrace1
01 Mar 2010-Image and Vision Computing
TL;DR: A novel algorithm that permits the fast and accurate computation of the Legendre image moments is introduced in this paper, based on the block representation of an image and on a new image representation scheme, the Image Slice Representation (ISR) method.

53 citations

Journal Article•10.1049/IET-IPR.2008.0259•
Multifocus image fusion based on redundant wavelet transform

Xu Li1, Mingyi He1, Michel Roux2•
Northwestern Polytechnical University1, Télécom ParisTech2
29 Jul 2010-IET Image Processing
TL;DR: The experimental results on several pairs of multifocus images show that the proposed method can achieve good results and exhibit clear advantages over the gradient pyramid transform and discrete wavelet transform techniques.
Abstract: Image fusion is a process of integrating complementary information from multiple images of the same scene such that the resultant image contains a more accurate description of the scene than any of the individual source images. A method for fusion of multifocus images is presented. It combines the traditional pixel-level fusion with some aspects of feature-level fusion. First, multifocus images are decomposed using a redundant wavelet transform (RWT). Then the edge features are extracted to guide coefficient combination. Finally, the fused image is reconstructed by performing the inverse RWT. The experimental results on several pairs of multifocus images show that the proposed method can achieve good results and exhibit clear advantages over the gradient pyramid transform and discrete wavelet transform techniques.

45 citations

Patent•
Image segregation system with method for handling textures

Neil Alldrin1, Kshitiz Garg1, Andrew Neil Stein1, Kristin Jean Dana1, Bruce Maxwell1, Casey Arthur Smith1, Youngrock Yoon1, Besma Roui Abidi1 •
Vision-Sciences, Inc.1
11 Jan 2010
TL;DR: In this paper, an automated, computerized method is provided for processing an image, which comprises the steps of converting a color band representation of the image to a homogeneous representation of spectral and spatial characteristics of a texture region in the image.
Abstract: In an exemplary embodiment of the present invention, an automated, computerized method is provided for processing an image. According to a feature of the present invention, the method comprises the steps of converting a color band representation of the image to a homogeneous representation of spectral and spatial characteristics of a texture region in the image and utilizing the homogeneous representation of spectral and spatial characteristics of a texture region in the image to identify homogeneous tokens in the image.

43 citations

Book Chapter•10.1007/978-3-642-15555-0_4•
Active mask hierarchies for object detection

Yuanhao Chen1, Long Zhu2, Alan L. Yuille1•
University of California, Los Angeles1, Massachusetts Institute of Technology2
5 Sep 2010
TL;DR: The resulting system is comparable with the state-of-the-art methods when evaluated on the challenging public PASCAL 2007 and 2009 datasets.
Abstract: This paper presents a new object representation, Active Mask Hierarchies (AMH), for object detection. In this representation, an object is described using a mixture of hierarchical trees where the nodes represent the object and its parts in pyramid form. To account for shape variations at a range of scales, a dictionary of masks with varied shape patterns is attached to the nodes at different layers. The shape masks are "active" in that they enable parts to move with different displacements. The masks in this active hierarchy are associated with histograms of words (HOWs) and oriented gradients (HOGs) to enable rich appearance representation of both structured (e.g., cat face) and textured (e.g., cat body) image regions. Learning the hierarchical model is a latent SVM problem which can be solved by the incremental concave-convex procedure (iCCCP). The resulting system is comparable with the state-of-the-art methods when evaluated on the challenging public PASCAL 2007 and 2009 datasets.
Proceedings Article•10.1109/CVPR.2010.5539915•
Probabilistic models for supervised dictionary learning

Xiaochen Lian1, Zhiwei Li1, Changhu Wang2, Bao-Liang Lu1, Lei Zhang2 •
Shanghai Jiao Tong University1, Microsoft2
13 Jun 2010
TL;DR: A probabilistic model for supervised dictionary learning (SDLM) is proposed which seamlessly combines an unsupervised model (a Gaussian Mixture Model) and a supervised model (a logistic regression model) in a probabilistic framework and is extended to incorporate spatial information during the dictionary learning process in a spatial pyramid matching like manner.
Abstract: Dictionary generation is a core technique of the bag-of-visual-words (BOV) models when applied to image categorization. Most previous approaches generate dictionaries by unsupervised clustering techniques, e.g. k-means. However, the features obtained by such dictionaries may not be optimal for image classification. In this paper, we propose a probabilistic model for supervised dictionary learning (SDLM) which seamlessly combines an unsupervised model (a Gaussian Mixture Model) and a supervised model (a logistic regression model) in a probabilistic framework. In the model, image category information directly affects the generation of a dictionary. A dictionary obtained by this approach is a trade-off between minimization of distortions of clusters and maximization of discriminative power of image-wise representations, i.e. histogram representations of images. We further extend the model to incorporate spatial information during the dictionary learning process in a spatial pyramid matching like manner. We extensively evaluated the two models on various benchmark datasets and obtained promising results.
Book Chapter•10.1007/978-3-642-15567-3_21•
Recursive coarse-to-fine localization for fast object detection

Marco Pedersoli, Jordi Gonzàlez, Andrew D. Bagdanov, Juan J. Villanueva
5 Sep 2010
TL;DR: Results show that the Recursive Coarse-to-Fine Localization (RCFL) achieves a 12x speed-up compared to standard sliding windows, and compared with a cascade of multiple resolutions approach the method has slightly better performance in speed and Average-Precision.
Abstract: Cascading techniques are commonly used to speed up the scan of an image for object detection. However, cascades of detectors are slow to train due to the high number of detectors and corresponding thresholds to learn. Furthermore, they do not use any prior knowledge about the scene structure to decide where to focus the search. To handle these problems, we propose a new way to scan an image, where we couple a recursive coarse-to-fine refinement together with spatial constraints of the object location. To do so, we split an image into a set of uniformly distributed neighborhood regions, and for each of these we apply a local greedy search over feature resolutions. The neighborhood is defined as a scanning region that only one object can occupy. Therefore the best hypothesis is obtained as the location with maximum score and no thresholds are needed. We present an implementation of our method using a pyramid of HOG features and we evaluate it on two standard databases, VOC2007 and INRIA dataset. Results show that the Recursive Coarse-to-Fine Localization (RCFL) achieves a 12x speed-up compared to standard sliding windows. Compared with a cascade of multiple resolutions approach our method has slightly better performance in speed and Average-Precision. Furthermore, in contrast to the cascading approach, the speed-up is independent of image conditions, the number of detected objects and clutter.
Patent•
Locality-constrained linear coding systems and methods for image classification

Jinjun Wang1, Fengjun Lv1, Kai Yu1•
Princeton University1
24 Jun 2010
TL;DR: In this paper, the authors describe methods for classifying an input image by detecting one or more feature points on the input image; extracting one or more descriptors from each feature point; applying a codebook to quantize each descriptor and generate code from each descriptor; applying spatial pyramid matching to generate histograms; and concatenating histograms from all sub-regions to generate a final representation of the image for classification.
Abstract: Systems and methods are disclosed for classifying an input image by detecting one or more feature points on the input image; extracting one or more descriptors from each feature point; applying a codebook to quantize each descriptor and generate code from each descriptor; applying spatial pyramid matching to generate histograms; and concatenating histograms from all sub-regions to generate a final representation of the image for classification.
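
The pipeline listed in this abstract (quantize descriptors with a codebook, build histograms over spatial pyramid sub-regions, concatenate) can be sketched as follows. This is a simplified illustration, not the patented method: it assumes hard nearest-codeword assignment in place of locality-constrained linear coding, plain counts as the pooled statistic, and feature point coordinates normalized to [0, 1). All function and parameter names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((d - w) ** 2
                                 for d, w in zip(descriptor, codebook[k])))


def spatial_pyramid_histogram(points, descriptors, codebook, levels):
    """Concatenate per-cell codeword counts over a pyramid of 1x1, 2x2, 4x4, ... grids.

    points: (x, y) locations normalized to [0, 1), one per descriptor.
    """
    words = [nearest_codeword(d, codebook) for d in descriptors]
    feature = []
    for level in range(levels):
        cells = 2 ** level  # grid is cells x cells at this level
        hist = [[0] * len(codebook) for _ in range(cells * cells)]
        for (x, y), word in zip(points, words):
            cell = int(y * cells) * cells + int(x * cells)
            hist[cell][word] += 1
        for cell_hist in hist:
            feature.extend(cell_hist)
    return feature


codebook = [(0.0, 0.0), (1.0, 1.0)]
points = [(0.1, 0.1), (0.9, 0.9)]
descriptors = [(0.1, 0.0), (0.9, 1.0)]
print(spatial_pyramid_histogram(points, descriptors, codebook, levels=2))
```

The resulting vector has `len(codebook) * (1 + 4 + ... + 4 ** (levels - 1))` entries, so two levels with a two-word codebook give ten values.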
Proceedings Article•10.1109/ICIP.2010.5651862•
Fast scene text localization by learning-based filtering and verification

Yi-Feng Pan1, Cheng-Lin Liu1, Xinwen Hou1•
Chinese Academy of Sciences1
3 Dec 2010
TL;DR: Experimental results show that the proposed method for fast text localization in natural scene images provides competitive localization performance at high speed.
Abstract: This paper proposes a new method for fast text localization in natural scene images by combining learning-based region filtering and verification in a coarse-to-fine strategy. In each pyramid layer, a boosted region filter is used to extract candidate text regions, which are segmented into candidate text lines by multi-orientation projection analysis. A polynomial classifier with combined features is used to verify patches of candidate text lines for removing non-texts. The remaining text patches over all pyramid layers are grouped into text lines based on their spatial relationships. The text lines are further refined and partitioned into words by connected component analysis. Experimental results show that the proposed method provides competitive localization performance at high speed.
Patent•
Method and system for representing image patches

Mark A. Ruzon, Raghavan Manmatha, Donald Tanguay
13 Jan 2010
TL;DR: In this paper, a method, system and computer program product for representing an image are provided. The image is represented in the form of a Gaussian pyramid, which is a scale-space representation of the image and includes several pyramid images.
Abstract: A method, system and computer program product for representing an image is provided. The image that needs to be represented is represented in the form of a Gaussian pyramid which is a scale-space representation of the image and includes several pyramid images. The feature points in the pyramid images are identified and a specified number of feature points are selected. The orientations of the selected feature points are obtained by using a set of orientation calculating algorithms. A patch is extracted around the feature point in the pyramid images based on the orientations of the feature point and the sampling factor of the pyramid image. The boundary patches in the pyramid images are extracted by padding the pyramid images with extra pixels. The feature vectors of the extracted patches are defined. These feature vectors are normalized so that the components in the feature vectors are less than a threshold.
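
A Gaussian pyramid of the kind mentioned in this abstract is usually built by repeatedly smoothing an image and subsampling it. The sketch below is a minimal version under stated assumptions: the separable binomial kernel [1, 2, 1] / 4 stands in for a Gaussian, borders are handled by replicating edge samples, and each level halves the resolution. The patent abstract does not specify the kernel or the sampling factor, so these are illustrative choices only.

```python
def blur_1d(values):
    """Smooth with the binomial kernel [1, 2, 1] / 4, replicating border samples."""
    n = len(values)
    return [(values[max(i - 1, 0)] + 2 * values[i] + values[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]


def blur(img):
    """Separable blur: filter every row, then every column."""
    rows = [blur_1d(row) for row in img]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def gaussian_pyramid(img, levels):
    """Return [img, level 1, ...], each level blurred and subsampled by 2."""
    pyramid = [img]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])
        pyramid.append([row[::2] for row in smoothed[::2]])
    return pyramid


image = [[8.0] * 4 for _ in range(4)]
for level in gaussian_pyramid(image, 3):
    print(len(level), "x", len(level[0]))
```

Smoothing before subsampling is what distinguishes this from simply dropping pixels: it limits aliasing so that each coarser image remains a faithful low-resolution view of the one below it.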
Proceedings Article•10.1109/ICPR.2010.879•
Scene Classification Using Spatial Pyramid of Latent Topics

Emrah Ergul1, Nafiz Arica1•
Naval Academy1
23 Aug 2010
TL;DR: A scene classification method is proposed that combines two popular methods in the literature, Spatial Pyramid Matching (SPM) and probabilistic Latent Semantic Analysis (pLSA) modeling, and it is seen that the proposed method slightly outperforms the others on the 15-class dataset.
Abstract: We propose a scene classification method, which combines two popular methods in the literature: Spatial Pyramid Matching (SPM) and probabilistic Latent Semantic Analysis (pLSA) modeling. The proposed scheme called Cascaded pLSA performs pLSA in a hierarchical sense after the soft-weighted BoW representation based on dense local features is extracted. We associate spatial layout information by dividing each image into overlapping regions iteratively at different resolution levels and implementing a pLSA model for each region individually. Finally, an image is represented by concatenated topic distributions of each region. In performance evaluation, we compare the proposed method with the most successful methods in the literature, using the popular 15-class-dataset. In the experiments, it is seen that our method slightly outperforms the others in that particular dataset.
Journal Article•10.1109/TCE.2010.5606295•
Low-memory requirement and efficient face recognition system based on DCT pyramid

Randa Atta1, Mohammad Ghanbari2•
Suez Canal University1, University of Essex2
01 Aug 2010-IEEE Transactions on Consumer Electronics
TL;DR: A face recognition system with low-memory requirement and accurate recognition is presented, based on extraction of features with the DCT pyramid, in contrast to the conventional method of wavelet decomposition.
Abstract: Face recognition (FR) is a challenging issue due to variations in pose, illumination, and expression. In this paper, a face recognition system with low-memory requirement and accurate recognition is presented. It is based on extraction of features with the DCT pyramid, in contrast to the conventional method of wavelet decomposition. The DCT pyramid performed on each face image decomposes it into an approximation subband and the reversed L-shape blocks containing the high frequency coefficients of the DCT pyramid. A set of simple block-based statistical measures is calculated from the extracted DCT pyramid subbands. This set of statistical measures is an efficient way of reducing the dimensionality of the feature vectors. Experimental results on the standard ORL and FERET databases show that the proposed method achieves more accurate face recognition than the wavelet-based methods. Moreover, it outperforms the other well known methods such as PCA and the block-based DCT with the zigzag scanning in terms of accuracy and memory requirement.
Journal Article•10.1155/2010/914921•
Fuzzy morphological polynomial image representation

Chin-Pan Huang1, Luis F. Chaparro2•
Ming Chuan University1, University of Pittsburgh2
01 Feb 2010-EURASIP Journal on Advances in Signal Processing
TL;DR: A novel signal representation using fuzzy mathematical morphology is developed which provides results analogous to those given by the polynomial transform and is illustrated in data compression and fractal dimension estimation of temporal signals and images.
Abstract: A novel signal representation using fuzzy mathematical morphology is developed. We take advantage of the optimum fuzzy fitting and the efficient implementation of morphological operators to extract geometric information from signals. The new representation provides results analogous to those given by the polynomial transform. Geometrical decomposition of a signal is achieved by windowing and applying sequentially fuzzy morphological opening with structuring functions. The resulting representation is made to resemble an orthogonal expansion by constraining the results of opening to equate adapted structuring functions. Properties of the geometric decomposition are considered and used to calculate the adaptation parameters. Our procedure provides an efficient and flexible representation which can be efficiently implemented in parallel. The application of the representation is illustrated in data compression and fractal dimension estimation of temporal signals and images.
Journal Article•10.1109/LSP.2009.2032452•
A Novel Template Matching Scheme for Fast Full-Search Boosted by an Integral Image

Jik-Han Jung1, Hwalsuk Lee1, Je Hee Lee1, Dong-Jo Park1•
KAIST1
01 Jan 2010-IEEE Signal Processing Letters
TL;DR: A new template matching method accelerated by an integral image is proposed that needs less memory than the conventional approach to maintain block sums of candidates and can be easily extended to nonsquare (rectangular) template matching.
Abstract: A new template matching method accelerated by an integral image is proposed. In contrast to the conventional winner-update template matching algorithm, the proposed scheme uses an integral image instead of a block sum pyramid to represent the search area. When an integral image is used, block sums on the lowest level are evaluated very fast. As a result, the speed with which nonbest candidates are rejected is nearly double that of the conventional scheme. Moreover, the proposed scheme needs less memory than the conventional approach to maintain block sums of candidates and can be easily extended to nonsquare (rectangular) template matching.
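
The key property the abstract relies on is that an integral image (summed-area table) yields the sum of any rectangular block from four table lookups. The sketch below shows only that building block, not the winner-update matching scheme itself; the function names and the zero-padded table layout are choices made for this example.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            # Sum of everything above plus the running sum of this row.
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def block_sum(table, top, left, height, width):
    """Sum of the height x width block whose top-left pixel is (top, left)."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
print(block_sum(table, 1, 1, 2, 2))  # square block
print(block_sum(table, 0, 1, 2, 1))  # non-square block
```

Since `height` and `width` are independent arguments, rectangular blocks cost the same four lookups as square ones, which is consistent with the abstract's remark that the scheme extends easily to nonsquare templates.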
Journal Article•10.1117/1.3483906•
Two-dimensional images in frequency-time representation: direction images and resolution map

Artyom M. Grigoryan1, Nan Du1•
University of Texas at San Antonio1
01 Jul 2010-Journal of Electronic Imaging
TL;DR: The concept of the direction image multiresolution is discussed, which is derived as a property of the 2-D discrete Fourier transform, when it splits by 1-D transforms, and the resolution map is introduced, as a result of uniting all direction images into log2 N series.
Abstract: We discuss the concept of the direction image multiresolution, which is derived as a property of the 2-D discrete Fourier transform, when it splits by 1-D transforms. The N×N image, where N is a power of 2, is considered as a unique set of splitting-signals in paired representation, which is the unitary 2-D frequency and 1-D time representation. The number of splitting-signals is 3N−2, and they have different durations, carry the spectral information of the image in disjoint subsets of frequency points, and can be calculated from the projection data along one of 3N/2 angles. The paired representation leads to the image composition by a set of 3N−2 direction images, which defines the directed multiresolution and contains periodic components of the image. We also introduce the concept of the resolution map, as a result of uniting all direction images into log2 N series. In the resolution map, all different periodic components (or structures) of the image are packed into an N×N matrix, which can be used for image processing in enhancement, filtration, and compression.
Content based medical image retrieval based on pyramid structure wavelet

Aliaa A. A. Youssif, Ashraf Darwish, R. A. Mohamed
1 Jan 2010
TL;DR: A new approach to image retrieval based on color, texture, and shape by using pyramid structure wavelet is presented and the receiver operating characteristic (ROC) curve is generated to assess the results.
Abstract: As technology continues to increase the various formats in which medical images are created, transmitted, and analyzed, it has become more necessary to restrict the different ways in which this data is stored and formatted between the conflicting modalities. There is a significant increase in the use of medical images in clinical medicine, disease research, and education. While the literature lists several successful systems for content-based image retrieval and image management methods, they have been unable to make significant inroads in routine medical informatics. This paper presents a new approach to image retrieval based on color, texture, and shape by using pyramid structure wavelet. The major advantage of such an approach is that little human intervention is required. However, most of these systems only allow a user to query using a complete image with multiple regions and are unable to retrieve similar looking images based on a single region. Experimental results of the query system on different test image databases are given. This paper introduces a comparative study between color, texture, shape and the pyramid structure wavelet technique and generates the receiver operating characteristic (ROC) curve to assess the results. The area under the curve is 0.58 when using color, 0.68 when using shape, 0.74 when using texture, and 0.8 when using the wavelet technique.
Journal Article•10.11873/J.ISSN.1004-0323.2007.5.622•
An Inshore Ship Detection Method Based on Contour Matching

[...]

Lei Lin, Su Yi
03 Sep 2010-Remote Sensing Technology and Application
TL;DR: An image contour matching method based on a partial Hausdorff distance measure is proposed, which is suited to inshore ship search and location.
Abstract: Inshore ship detection has significant practical meaning, especially for target change detection. However, it is difficult to realize inshore ship detection with the traditional area-based method because of the complex background. An image contour matching method based on a partial Hausdorff distance measure is proposed, which is suited to inshore ship search and location. The main characteristics of the proposed method are: 1) a fast distance transform and pyramid decomposition are used to speed up the Hausdorff distance matching; 2) a pyramid is constructed from the original image to avoid over-sampling of the contour. Experiments with satellite images are carried out to validate and analyze the proposed method.
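To make the partial Hausdorff distance concrete, the sketch below computes the directed version between two small point sets: each model point is matched to its nearest image point, and the distance at a chosen rank is returned instead of the maximum, so a few outliers do not dominate. This is a brute-force illustration under assumed names (`directed_partial_hausdorff`, `fraction`); it does not use the distance transform or pyramid speed-ups described in the abstract.

```python
import math


def directed_partial_hausdorff(model, image, fraction=0.8):
    """Directed partial Hausdorff distance from `model` to `image`.

    model, image: lists of (x, y) contour points.
    fraction: share of model points that must be matched; 1.0 gives the
    classical directed Hausdorff distance. (Hypothetical parameter name.)
    """
    if not model or not image:
        raise ValueError("both point sets must be non-empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # Nearest-neighbour distance for every model point (brute force).
    nearest = sorted(min(math.dist(p, q) for q in image) for p in model)

    # Take the K-th smallest value rather than the largest.
    k = max(1, math.ceil(fraction * len(nearest)))
    return nearest[k - 1]


# Three model points lie on the image contour; one is an outlier.
model_pts = [(0, 0), (1, 0), (2, 0), (10, 10)]
image_pts = [(0, 0), (1, 0), (2, 0)]
print(directed_partial_hausdorff(model_pts, image_pts, fraction=0.75))
print(directed_partial_hausdorff(model_pts, image_pts, fraction=1.0))
```

With `fraction=0.75` the outlier is ignored, while `fraction=1.0` lets it determine the result, which is why the partial variant tends to be preferred against cluttered backgrounds.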
Journal Article•10.5565/REV/ELCVIA.350•
A Performance Evaluation of Exact and Approximate Match Kernels for Object Recognition

[...]

Barbara Caputo1, Luo Jie•
Idiap Research Institute1
03 Feb 2010-Electronic Letters on Computer Vision and Image Analysis
TL;DR: A thorough experimental evaluation of the two methods for solving the correspondence problem via the definition of a kernel function that makes it possible to use local features as input to a support vector machine shows that the exact method performs consistently better than the approximate one.
Abstract: Local features have repeatedly shown their effectiveness for object recognition during the last years, and they have consequently become the preferred descriptor for this type of problem. The solution of the correspondence problem is traditionally approached with exact or approximate techniques. In this paper we are interested in methods that solve the correspondence problem via the definition of a kernel function that makes it possible to use local features as input to a support vector machine. We single out the match kernel, an exact approach, and the pyramid match kernel, which instead uses an approximate strategy. We present a thorough experimental evaluation of the two methods on three different databases. Results show that the exact method performs consistently better than the approximate one, especially for the object identification task, when training on a decreasing number of images. Based on these findings and on the computational cost of each approach, we suggest some criteria for choosing between the two kernels given the application at hand.
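The approximate strategy of a pyramid match can be sketched for one-dimensional integer features: histograms are built with bin widths that double at each level, matches are counted by histogram intersection, and matches first found at coarser levels receive smaller weights. This is one common unnormalized formulation, shown under simplifying assumptions (1-D non-negative integer features, hypothetical function name `pyramid_match`); it is not the exact configuration evaluated in the paper.

```python
from collections import Counter


def pyramid_match(xs, ys, levels=4):
    """Unnormalized pyramid match score between two feature sets.

    xs, ys: lists of non-negative integer features (1-D for simplicity).
    levels: number of pyramid levels; level i uses bins of width 2**i.
    """
    score = 0.0
    previous = 0  # matches already counted at finer levels
    for i in range(levels):
        width = 2 ** i
        hist_x = Counter(x // width for x in xs)
        hist_y = Counter(y // width for y in ys)
        # Histogram intersection: implicit matches at this resolution.
        intersection = sum(min(count, hist_y[b]) for b, count in hist_x.items())
        # Only newly formed matches count, weighted by 1 / bin width.
        score += (intersection - previous) / width
        previous = intersection
    return score


# 5 matches 5 exactly (weight 1); 0 and 1 only share a bin at width 2 (weight 1/2).
print(pyramid_match([0, 5], [1, 5], levels=3))
```

The score is computed from histograms alone, without searching for an explicit point-to-point correspondence, which is what makes the approach approximate but cheap.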
Proceedings Article•10.1145/1866158.1866193•
Optimizing continuity in multiscale imagery

[...]

Charles Han1, Hugues Hoppe2•
Columbia University1, Microsoft2
15 Dec 2010
TL;DR: This work presents a scheme that creates a visually smooth mipmap pyramid from stitched imagery at several scales by using a nonlinear operator to inject detail from the fine image into the coarse one while retaining color consistency.
Abstract: Multiscale imagery often combines several sources with differing appearance. For instance, Internet-based maps contain satellite and aerial photography. Zooming within these maps may reveal jarring transitions. We present a scheme that creates a visually smooth mipmap pyramid from stitched imagery at several scales. The scheme involves two new techniques. The first, structure transfer, is a nonlinear operator that combines the detail of one image with the local appearance of another. We use this operator to inject detail from the fine image into the coarse one while retaining color consistency. The improved structural similarity greatly reduces inter-level ghosting artifacts. The second, clipped Laplacian blending, is an efficient construction to minimize blur when creating intermediate levels. It considers the sum of all inter-level image differences within the pyramid. We demonstrate continuous zooming of map imagery from space to ground level.
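For readers unfamiliar with mipmap pyramids, the sketch below builds a plain one by repeated 2×2 box averaging of a grayscale image stored as nested lists. It shows only the baseline pyramid structure; the structure transfer operator and clipped Laplacian blending from the abstract are not implemented, and the function names are hypothetical.

```python
def downsample(img):
    """Halve a grayscale image by averaging each 2x2 block."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def mipmap(img):
    """Return [finest, ..., coarsest] levels.

    Assumes a square image whose side is a power of two.
    """
    levels = [img]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


# 4x4 ramp image with values 0..15.
image = [[4 * r + c for c in range(4)] for r in range(4)]
for level in mipmap(image):
    print(level)
```

Each coarser level here depends only on the level below it, so differing source imagery at different scales would produce visible jumps between levels, which is the problem the paper targets.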
Journal Article•10.5120/1006-41•
Query by Image Content using Color-Texture Features Extracted from Haar Wavelet Pyramid

[...]

Dr.H.B. Kekre, Sudeep D. Thepade, Akshay Maloo
20 Aug 2010-International Journal of Computer Applications
TL;DR: The results show that the precision and recall of Haar Wavelets are better than those of complete Haar transform based CBIR, which indicates that Haar Wavelets give better discrimination capability in image retrieval at higher query execution speed, per higher level Haar Wavelets.
Abstract: The paper presents Wavelet Pyramid based image retrieval techniques [1] using the Haar transform. Content based image retrieval (CBIR) is done using the image feature set extracted from Haar Wavelets applied to the image at various levels of decomposition. The database image features are extracted by applying Haar Wavelets to the gray plane (average of red, green and blue) and to the color planes (red, green and blue components). The Gray-Haar Wavelets and Color-Haar Wavelets techniques are tested on an image database of 1000 images in 11 categories. A total of 55 queries are run against the database. The results show that the precision and recall of Haar Wavelets are better than those of complete Haar transform based CBIR, which indicates that Haar Wavelets give better discrimination capability in image retrieval at higher query execution speed, per higher level Haar Wavelets. Color-Haar Wavelets based CBIR has greater precision and recall than Gray-Haar Wavelets based CBIR. Haar Wavelets at level 5 outperform the other levels, because the higher level Haar Wavelets give very coarse color-texture features while the lower levels represent very fine color-texture features, which are less useful for differentiating images in image retrieval.
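A Haar wavelet pyramid can be sketched by repeatedly splitting an image into one approximation band and three detail bands, then decomposing the approximation again. The version below uses simple averages and differences on a grayscale image stored as nested lists; the scaling convention, the band names, and the function names are assumptions made for illustration, and the paper's actual feature vector construction is not reproduced.

```python
def haar_step(img):
    """One 2-D Haar level on an image with even height and width.

    Returns (approx, horiz, vert, diag), each half the input size.
    Uses unnormalized averages/differences (divide by 4).
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    approx = [[0.0] * cols for _ in range(rows)]
    horiz = [[0.0] * cols for _ in range(rows)]
    vert = [[0.0] * cols for _ in range(rows)]
    diag = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            d, e = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            approx[r][c] = (a + b + d + e) / 4  # local mean
            horiz[r][c] = (a - b + d - e) / 4   # left minus right
            vert[r][c] = (a + b - d - e) / 4    # top minus bottom
            diag[r][c] = (a - b - d + e) / 4    # diagonal difference
    return approx, horiz, vert, diag


def haar_pyramid(img, levels):
    """Decompose the approximation band repeatedly, up to `levels` times."""
    pyramid = []
    approx = img
    for _ in range(levels):
        if len(approx) < 2 or len(approx[0]) < 2:
            break
        bands = haar_step(approx)
        pyramid.append(bands)
        approx = bands[0]
    return pyramid


print(haar_step([[1, 3], [5, 7]]))
```

Higher levels yield smaller approximation bands, so a feature vector taken from a higher level is shorter and cheaper to compare, at the cost of coarser color-texture detail.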
Proceedings Article•10.1109/DICTA.2010.46•
Enhanced Spatial Pyramid Matching Using Log-Polar-Based Image Subdivision and Representation

[...]

Edmond Zhang1, Michael Mayo1•
University of Waikato1
1 Dec 2010
TL;DR: This paper proposes a new method to exploit spatial relationships between image features, based on binned log-polar grids, and shows that this approach improves the results on three diverse datasets over the SPM technique.
Abstract: This paper presents a new model for capturing spatial information for object categorization with bag-of-words (BOW). BOW models have recently become popular for the task of object recognition, owing to their good performance and simplicity. Much work has been proposed over the years to improve the BOW model, of which the Spatial Pyramid Matching (SPM) technique is the most notable. We propose a new method to exploit spatial relationships between image features, based on binned log-polar grids. Our model works by partitioning the image into grids of different scales and orientations and computing a histogram of local features within each grid. Experimental results show that our approach improves the results on three diverse datasets over the SPM technique.
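The idea of a binned log-polar grid can be illustrated by assigning each local feature to a (ring, sector) cell around a chosen centre, with ring boundaries that double in radius, and then counting visual words per cell. The sketch below is a generic log-polar binning under assumed parameters (`r_max`, `n_rings`, `n_sectors`) and hypothetical function names; the paper's specific grid layout, scales, and orientations are not given in the abstract.

```python
import math
from collections import Counter


def log_polar_bin(x, y, cx, cy, r_max, n_rings=3, n_sectors=8):
    """Map a point to a (ring, sector) cell of a log-polar grid.

    The outermost ring covers radii in [r_max/2, r_max], the next one
    [r_max/4, r_max/2), and so on; the innermost ring absorbs the rest.
    Points beyond r_max are clamped into the outermost ring.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r <= 0.0:
        ring = 0
    else:
        ring = math.floor(n_rings + math.log2(r / r_max))
        ring = max(0, min(n_rings - 1, ring))
    angle = math.atan2(dy, dx) % (2.0 * math.pi)
    sector = min(n_sectors - 1, int(angle / (2.0 * math.pi / n_sectors)))
    return ring, sector


def log_polar_histogram(features, cx, cy, r_max, n_rings=3, n_sectors=8):
    """Count visual words per cell; features are (x, y, word) tuples."""
    hist = Counter()
    for x, y, word in features:
        ring, sector = log_polar_bin(x, y, cx, cy, r_max, n_rings, n_sectors)
        hist[(ring, sector, word)] += 1
    return hist


feats = [(1, 0.5, "a"), (1, 0.5, "a"), (-3, 1, "b"), (1, -6, "a")]
print(log_polar_histogram(feats, cx=0, cy=0, r_max=8, n_rings=3, n_sectors=4))
```

Compared with the rectangular cells of a spatial pyramid, cells in this layout are small near the centre and large towards the border.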
Proceedings Article•10.1109/AVSS.2010.79•
Learning Dense Optical-Flow Trajectory Patterns for Video Object Extraction

[...]

Wang-Chou Lu1, Yu-Chiang Frank Wang1, Chu-Song Chen1•
Center for Information Technology1
29 Aug 2010
TL;DR: The proposed unsupervised method to address video object extraction (VOE) in uncontrolled videos, i.e. videos captured by low-resolution and freely moving cameras, advocates the use of dense optical-flow trajectories (DOTs), which are obtained by propagating the optical flow information at the pixel level.
Abstract: We propose an unsupervised method to address video object extraction (VOE) in uncontrolled videos, i.e. videos captured by low-resolution and freely moving cameras. We advocate the use of dense optical-flow trajectories (DOTs), which are obtained by propagating the optical flow information at the pixel level. Therefore, no interest point extraction is required in our framework. To integrate color and shape information of moving objects, we group the DOTs at the super-pixel level to extract co-motion regions, and use the associated pyramid histogram of oriented gradients (PHOG) descriptors to extract objects of interest across video frames. Our approach for VOE is easy to implement, and the use of DOTs for both motion segmentation and object tracking is more robust than existing trajectory-based methods. Experiments on several video sequences exhibit the feasibility of our proposed VOE framework.
Proceedings Article•10.1109/ICARCV.2010.5707326•
Ensembles of novel visual keywords descriptors for image categorization

[...]

Azizi Abdullah1, Remco C. Veltkamp2, Marco A. Wiering3•
National University of Malaysia1, Utrecht University2, University of Groningen3
1 Dec 2010
TL;DR: This paper introduces several novel bag of visual keywords methods and compares them with the currently dominating hard bag-of-features (HBOF) approach that uses a hard assignment scheme to compute cluster frequencies.
Abstract: Object recognition systems need effective image descriptors to obtain good performance levels. Currently, the most widely used image descriptor is the SIFT descriptor that computes histograms of orientation gradients around points in an image. A possible problem of this approach is that the number of features becomes very large when a dense grid is used where the histograms are computed and combined for many different points. The current dominating solution to this problem is to use a clustering method to create a visual codebook that is exploited by an appearance based descriptor to create a histogram of visual keywords present in an image. In this paper we introduce several novel bag of visual keywords methods and compare them with the currently dominating hard bag-of-features (HBOF) approach that uses a hard assignment scheme to compute cluster frequencies. Furthermore, we combine all descriptors with a spatial pyramid and two ensemble classifiers. Experimental results on 10 and 101 classes of the Caltech-101 object database show that our novel methods significantly outperform the traditional HBOF approach and that our ensemble methods obtain state-of-the-art performance levels.
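The hard assignment scheme used as the baseline here can be sketched in a few lines: every local descriptor votes for its single nearest codeword, and the image is described by the resulting count histogram. The codebook below is a hand-written toy stand-in for one learned by clustering, and the function name is hypothetical; the paper's novel descriptors, spatial pyramid, and ensemble classifiers are not shown.

```python
import math


def hard_bof_histogram(descriptors, codebook):
    """Hard bag-of-features histogram.

    descriptors: list of equal-length numeric tuples (local features).
    codebook: list of codewords (cluster centres) of the same length.
    Returns a list with one count per codeword.
    """
    hist = [0] * len(codebook)
    for d in descriptors:
        # Hard assignment: exactly one vote, for the nearest codeword.
        nearest = min(range(len(codebook)),
                      key=lambda k: math.dist(d, codebook[k]))
        hist[nearest] += 1
    return hist


toy_codebook = [(0.0, 0.0), (10.0, 10.0)]
toy_descriptors = [(1.0, 1.0), (9.0, 9.0), (0.0, 2.0), (8.0, 10.0)]
print(hard_bof_histogram(toy_descriptors, toy_codebook))
```

Because each descriptor contributes to one bin only, a descriptor lying almost midway between two codewords is treated the same as one lying exactly on a codeword, a loss of information inherent to hard assignment.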
Patent•
Image processing device, image data generation device, image processing method, image data generation method, and data structure of image file

[...]

Hiroyuki Segawa1, Noriaki Shinoyama1, Akio Ohba1, Tetsugo Inada1•
Sony Computer Entertainment1
5 Nov 2010
TL;DR: A parallax representation unit in a displayed image processing unit uses a height map containing information on a height of an object for each pixel to represent different views caused by the height of the object.
Abstract: A parallax representation unit in a displayed image processing unit uses a height map containing information on a height of an object for each pixel to represent different views caused by the height of the object. A color representation unit uses, for example, texture coordinate values derived by the parallax representation unit to render the image, shifting the pixel defined in the color map. The color representation unit uses the normal map that maintains normals to the surface of the object for each pixel to change the way that light impinges on the surface and represent the roughness accordingly. A shadow representation unit uses a horizon map, which maintains information for each pixel to indicate whether a shadow is cast depending on the angle relative to the light source, so as to shadow the image rendered by the color representation unit.
...
