Scispace (Formerly Typeset)
Showing papers on "Pyramid (image processing)" published in 2011
Journal Article•10.1016/J.FIRESAF.2011.01.001•
Video-based smoke detection with histogram sequence of LBP and LBPV pyramids

Feiniu Yuan1•
Jiangxi University of Finance and Economics1
01 Apr 2011-Fire Safety Journal
TL;DR: This paper proposes a video-based smoke detection method that uses a histogram sequence of LBP and LBPV pyramids. The method involves four steps; in the first, a 3-level image pyramid is constructed through multi-scale analysis.

237 citations

Journal Article•10.1016/J.PATCOG.2011.03.029•
PLBP: An effective local binary patterns texture descriptor with pyramid representation

Xueming Qian1, Xian-Sheng Hua2, Ping Chen1, Liangjun Ke1•
Xi'an Jiaotong University1, Microsoft2
01 Oct 2011-Pattern Recognition
TL;DR: The conventional local binary pattern (LBP) is extended to the pyramid transform domain (PLBP) by cascading the LBP information of hierarchical spatial pyramids. PLBP descriptors take texture resolution variations into account and are shown to be effective for texture representation.

218 citations
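The idea of cascading LBP information across pyramid levels can be illustrated with a minimal sketch. It is not the authors' PLBP implementation: it assumes the basic 8-neighbour LBP operator, a pyramid built by plain 2x2 averaging, and hypothetical function names (`lbp_codes`, `downsample`, `pyramid_lbp`).

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour LBP code (0..255) for every interior pixel."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes += (neighbour >= center) * (1 << bit)
    return codes

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (assumed pyramid filter)."""
    h, w = img.shape
    h, w = h - h % 2, w - w % 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_lbp(img, levels=3):
    """Concatenate the normalized LBP histograms of every pyramid level."""
    img = np.asarray(img, dtype=float)
    histograms = []
    for _ in range(levels):
        hist = np.bincount(lbp_codes(img).ravel(), minlength=256).astype(float)
        histograms.append(hist / hist.sum())
        img = downsample(img)
    return np.concatenate(histograms)
```

With three levels the descriptor has 3 x 256 = 768 entries, one normalized 256-bin histogram per resolution.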

Proceedings Article•
Learning person-object interactions for action recognition in still images

Vincent Delaitre1, Josef Sivic1, Ivan Laptev1•
École Normale Supérieure1
12 Dec 2011
TL;DR: A discriminatively trained model of person-object interactions for recognizing common human actions in still images that bypasses the difficult problem of estimating the complete human body pose configuration is investigated.
Abstract: We investigate a discriminatively trained model of person-object interactions for recognizing common human actions in still images. We build on the locally order-less spatial pyramid bag-of-features model, which was shown to perform extremely well on a range of object, scene and human action recognition tasks. We introduce three principal contributions. First, we replace the standard quantized local HOG/SIFT features with stronger discriminatively trained body part and object detectors. Second, we introduce new person-object interaction features based on spatial co-occurrences of individual body parts and objects. Third, we address the combinatorial problem of a large number of possible interaction pairs and propose a discriminative selection procedure using a linear support vector machine (SVM) with a sparsity inducing regularizer. Learning of action-specific body part and object interactions bypasses the difficult problem of estimating the complete human body pose configuration. Benefits of the proposed model are shown on human action recognition in consumer photographs, outperforming the strong bag-of-features baseline.

190 citations

Proceedings Article•10.1109/WISP.2011.6051718•
A Multi-Channel Representation for images on quantum computers using the RGBα color space

Bo Sun1, Phuc Q. Le1, Abdullah M. Iliyasu1, Fei Yan1, J. Adrian Garcia1, Fangyan Dong1, Kaoru Hirota1 •
Tokyo Institute of Technology1
20 Oct 2011
TL;DR: Simulation results on a classical computer show that the RGB (color) and α (transparency) channel information can be carried easily on a quantum computer by employing three qubits to represent the color space, and indicate that MCRQI is flexible enough to realize some classic-like operations.
Abstract: A Multi-Channel Representation for Quantum Image (MCRQI) is proposed to facilitate further image processing tasks based on the Flexible Representation for Quantum Image (FRQI). A Channel Swapping Operation and a One Channel Operation are proposed as basic image processing operations on the MCRQI representation. Simulation results on a classical computer show that the RGB (color) and α (transparency) channel information can be carried easily on a quantum computer by employing three qubits to represent the color space, and also indicate that MCRQI is flexible enough to realize some classic-like operations. MCRQI provides a foundation not only for expressing images in the RGBα color space, but also for exploring theoretical and practical aspects of image processing on quantum computers.

130 citations

Proceedings Article•10.1109/CVPR.2011.5995691•
Discriminative spatial pyramid

Tatsuya Harada1, Yoshitaka Ushiku1, Yuya Yamashita1, Yasuo Kuniyoshi1•
University of Tokyo1
20 Jun 2011
TL;DR: Discriminative SPR is a new representation that forms the image feature as a weighted sum of semi-local features over all pyramid levels, which is compact and preserves high discriminative power, even in low dimension.
Abstract: Spatial Pyramid Representation (SPR) is a widely used method for embedding both global and local spatial information into a feature, and it shows good performance in terms of generic image recognition. In SPR, the image is divided into a sequence of increasingly fine grids, one per pyramid level. Features are extracted from all of the grid cells and are concatenated to form one huge feature vector. As a result, expensive computational costs are required for both learning and testing. Moreover, because the strategy for partitioning the image at each pyramid level is designed by hand, there is weak theoretical evidence of the appropriate partitioning strategy for good categorization. In this paper, we propose discriminative SPR, which is a new representation that forms the image feature as a weighted sum of semi-local features over all pyramid levels. The weights are automatically selected to maximize discriminative power. The resulting feature is compact and preserves high discriminative power, even in low dimensions. Furthermore, the discriminative SPR can suggest the distinctive cells and the pyramid levels simultaneously by observing the optimal weights generated from the fine grid cells.

101 citations
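The difference between concatenating cell features (standard SPR) and forming a weighted sum of them can be made concrete with a small sketch. It assumes the image has already been reduced to a map of visual-word indices and that the cell weights are given; in the paper the weights are learned, which is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def cell_histograms(words, vocab_size, levels=3):
    """One visual-word histogram per grid cell, over all pyramid levels.

    Level l splits the image into a 2^l x 2^l grid, so three levels
    yield 1 + 4 + 16 = 21 cells.
    """
    h, w = words.shape
    cells = []
    for level in range(levels):
        n = 2 ** level
        ys = np.linspace(0, h, n + 1).astype(int)
        xs = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                patch = words[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                cells.append(np.bincount(patch.ravel(), minlength=vocab_size))
    return np.array(cells, dtype=float)

def standard_spr(cells):
    """Standard SPR: concatenate all cell histograms into one long vector."""
    return cells.ravel()

def weighted_sum_spr(cells, weights):
    """Weighted-sum variant: one weight per cell, output stays vocab-sized."""
    return weights @ cells
```

With a 5-word vocabulary the concatenated feature has 21 x 5 = 105 dimensions, while the weighted sum keeps 5, which is the compactness the abstract refers to.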

Journal Article•10.1007/S11263-010-0342-X•
Fast PRISM: Branch and Bound Hough Transform for Object Class Detection

Alain Lehmann1, Bastian Leibe2, Luc Van Gool1•
ETH Zurich1, RWTH Aachen University2
01 Sep 2011-International Journal of Computer Vision
TL;DR: This paper addresses efficient object class detection by means of the Hough transform and demonstrates PRISM's flexibility through two complementary implementations: a generatively trained Gaussian Mixture Model and a discriminatively trained histogram approach.
Abstract: This paper addresses the task of efficient object class detection by means of the Hough transform. This approach has been made popular by the Implicit Shape Model (ISM) and has been adopted many times. Although ISM exhibits robust detection performance, its probabilistic formulation is unsatisfactory. The PRincipled Implicit Shape Model (PRISM) overcomes these problems by interpreting Hough voting as a dual implementation of linear sliding-window detection. It thereby gives a sound justification to the voting procedure and imposes minimal constraints. We demonstrate PRISM's flexibility by two complementary implementations: a generatively trained Gaussian Mixture Model as well as a discriminatively trained histogram approach. Both systems achieve state-of-the-art performance. Detections are found by gradient-based or branch and bound search, respectively. The latter greatly benefits from PRISM's feature-centric view. It thereby avoids the unfavourable memory trade-off and any on-line pre-processing of the original Efficient Subwindow Search (ESS). Moreover, our approach takes account of the features' scale value while ESS does not. Finally, we show how to avoid soft-matching and spatial pyramid descriptors during detection without losing their positive effect. This makes algorithms simpler and faster. Both are possible if the object model is properly regularised and we discuss a modification of SVMs which allows for doing so.

80 citations

Pyramid Methods in GPU-Based Image Processing

Magnus Strengert, Martin Kraus, and Thomas Ertl
1 Jan 2011
TL;DR: This work shows that modern GPUs allow for pyramid methods based on bilinear texture interpolation for high-performance image processing and presents three examples: zooming with biquadratic B-spline filtering, efficient image blurring of arbitrary blur width, and smooth interpolation of scattered pixel data.
Abstract: There are numerous applications and variants of pyramid methods in digital image processing. Many of them feature a linear time complexity in the number of pixels; thus, they are particularly well suited for real-time image processing. In this work, we show that modern GPUs allow us to implement pyramid methods based on bilinear texture interpolation for high-performance image processing and present three examples: zooming with biquadratic B-spline filtering, efficient image blurring of arbitrary blur width, and smooth interpolation of scattered pixel data. Compared to published filtering techniques for GPU-based image processing, we achieve considerable performance improvements, and compared to bilinear interpolation, we achieve improvements in image quality.

63 citations

Patent•
Method of evaluating the horizontal speed of a drone, in particular a drone capable of performing hovering flight under autopilot

Thomas Derbanne
8 Jun 2011
TL;DR: An iterative algorithm of the optical-flow type is applied to a multiresolution, image-pyramid representation that models a given picked-up image of the scene at different, successively decreasing resolutions.
Abstract: The method operates by estimating the differential movement of the scene picked up by a vertically-oriented camera. Estimation includes periodically and continuously updating a multiresolution representation of the image-pyramid type, which models a given picked-up image of the scene at different, successively decreasing resolutions. For each new picked-up image, an iterative algorithm of the optical-flow type is applied to this representation. The method also uses the data produced by the optical-flow algorithm to obtain at least one texturing parameter representative of the level of microcontrasts in the picked-up scene, and obtains an approximation of the speed; a battery of predetermined criteria is then applied to these parameters. If the battery of criteria is satisfied, the system switches from the optical-flow algorithm to an algorithm of the corner-detector type.

48 citations

Journal Article•10.1016/J.PATREC.2011.06.002•
Similarity-based multimodality image fusion with shiftable complex directional pyramid

Qiang Zhang1, Long Wang2, Huijuan Li1, Zhaokun Ma1•
Xidian University1, Peking University2
01 Oct 2011-Pattern Recognition Letters
TL;DR: A novel fusion algorithm based on the shiftable complex directional pyramid transform (SCDPT) and the structural similarity (SSIM) index is proposed, which can better deal with 'redundant' and 'complementary' information of source images.

42 citations

Proceedings Article•10.1145/1980022.1980038•
Iris recognition using texture features extracted from Walshlet pyramid

H. B. Kekre1, Sudeep D. Thepade1, J. Jain1, N. Agrawal1•
Narsee Monjee Institute of Management Studies1
25 Feb 2011
TL;DR: The results show that the level-5 Walshlet outperforms the other Walshlets, because the higher-level Walshlets give very fine texture features while the lower-level Walshlets represent very coarse texture features, which are less useful for discriminating images in iris recognition.
Abstract: Iris recognition has been a fast-growing, challenging and interesting area in real-time applications. A large number of iris recognition algorithms have been developed over the decades. The paper presents a novel Walshlet Pyramid based iris recognition technique. Here iris recognition is done using the image feature set extracted from Walsh Wavelets at various levels of decomposition. The proposed method was analyzed in terms of the False Acceptance Rate and the Genuine Acceptance Rate. The proposed technique is tested on an iris image database of 384 images. The results show that the level-5 Walshlet outperforms the other Walshlets, because the higher-level Walshlets give very fine texture features while the lower-level Walshlets represent very coarse texture features, which are less useful for discriminating images in iris recognition.

40 citations

Proceedings Article•10.1145/2024676.2024686•
Image and video abstraction by multi-scale anisotropic Kuwahara filtering

Jan Eric Kyprianidis1•
Hasso Plattner Institute1
5 Aug 2011
TL;DR: Two limitations of the anisotropic Kuwahara filter are addressed and it is shown that by adding thresholding to the weighting term computation of the sectors, artifacts are avoided and smooth results in noise-corrupted regions are achieved.
Abstract: The anisotropic Kuwahara filter is an edge-preserving filter that is especially useful for creating stylized abstractions from images or videos. It is based on a generalization of the Kuwahara filter that is adapted to the local structure of image features. In this work, two limitations of the anisotropic Kuwahara filter are addressed. First, it is shown that by adding thresholding to the weighting term computation of the sectors, artifacts are avoided and smooth results in noise-corrupted regions are achieved. Second, a multi-scale computation scheme is proposed that simultaneously propagates local orientation estimates and filtering results up a low-pass filtered pyramid. This allows for a strong abstraction effect and avoids artifacts in large low-contrast regions. The propagation is controlled by the local variances and anisotropies that are derived during the computation without extra overhead, resulting in a highly efficient scheme that is particularly suitable for real-time processing on a GPU.
Proceedings Article•10.1109/ICSMC.2011.6084045•
Human action recognition based on Pyramid Histogram of Oriented Gradients

Jin Wang1, Ping Liu2, Mary F.H. She1, Abbas Z. Kouzani1, Saeid Nahavandi1 •
Deakin University1, University of Portsmouth2
1 Jan 2011
TL;DR: This paper employs the Pyramid Histogram of Oriented Gradients (PHOG) to characterize human figures for action recognition and demonstrates the effectiveness and robustness of the method with respect to various unconstrained conditions and viewpoints.
Abstract: Human action recognition has attracted a lot of interest from computer vision researchers due to its various promising applications. In this paper, we employ the Pyramid Histogram of Oriented Gradients (PHOG) to characterize human figures for action recognition. Compared to silhouette-based features, the PHOG descriptor does not require extraction of human silhouettes or contours. Two state-space models, i.e., the Hidden Markov Model (HMM) and the Conditional Random Field (CRF), are adopted to model the dynamic human movement. The proposed PHOG descriptor and the state-space models are tested with different parameters on a standard dataset. We also test the robustness of the method with respect to various unconstrained conditions and viewpoints. Promising experimental results demonstrate the effectiveness and robustness of our proposed method.
Patent•
Method for assessing the horizontal speed of a drone, particularly of a drone capable of hovering on automatic pilot

Thomas Derbanne
8 Jun 2011
TL;DR: A multiresolution, pyramid-type image representation that models the scene at different, successively decreasing resolutions is used to estimate the differential movement of the scene from one image to the next.
Abstract: The method involves updating a multiresolution, pyramid-type image representation that models the scene at different, successively decreasing resolutions. A texturing parameter representative of the level of microcontrasts in the picked-up scene is obtained. An approximation of the horizontal translation speed of the drone is obtained. A battery of predetermined criteria is applied to the texturing parameter and to the speed approximation. The optical-flow algorithm is switched to an algorithm of the corner-detector type so as to estimate the differential movement of the scene from one image to the next.
Patent•
Automated macular pathology diagnosis in three-dimensional (3D) spectral domain optical coherence tomography (SD-OCT) images

Hiroshi Ishikawa1, Gadi Wollstein1, Joel S. Schuman1, Yu-Ying Liu1, James M. Rehg1, Mei Chen1 •
University of Pittsburgh1
11 Nov 2011-Investigative Ophthalmology & Visual Science
TL;DR: In this article, a 2-dimensional slice of the image can be aligned to produce an approximately horizontal image of the retina and an edge map based at least in part on the aligned slice, and at least one global representation can be determined based on a multi-scale spatial division on the slice and/or edge map.
Abstract: Systems and methods of analyzing an optical coherence tomography image of a retina are discussed. A 2-dimensional slice of the image can be aligned to produce an approximately horizontal image of the retina and an edge map based at least in part on the aligned slice. Also, at least one global representation can be determined based on a (multi-scale) spatial division, such as multi-scale spatial pyramid, on the slice and/or edge map. Creating the local features is based on the specified cell structure of the global representation. The local features can be constructed based on local binary pattern (LBP)-based features. Additionally, a slice can be categorized into one or more categories via one or more classifiers (e.g., support vector machines). Each category can be associated with at least one ocular pathology, and classifying can be based on the constructed global descriptors, which can include the LBP-based local descriptors.
Journal Article•10.1016/J.COMPELECENG.2010.09.004•
Image denoising using bilateral filter and Gaussian scale mixtures in shiftable complex directional pyramid domain

Hong-Ying Yang1, Xiang-Yang Wang1, Tian-Xiang Qu1, Zhong-Kai Fu1•
Liaoning Normal University1
01 Sep 2011-Computers & Electrical Engineering
TL;DR: The proposed method for removing noise from digital images, based on bilateral filter and Gaussian scale mixtures in shiftable complex directional pyramid domain, can preserve edges very well while removing noise.
Proceedings Article•10.1109/ICME.2011.6011971•
Improved combination of LBP and sparse representation based classification (SRC) for face recognition

Rui Min1, Jean-Luc Dugelay1•
Institut Eurécom1
11 Jul 2011
TL;DR: A novel face recognition algorithm combining LBP with SRC is proposed, in which the dimensionality problem is resolved by divide-and-conquer and the discriminative power is strengthened via its pyramid architecture.
Abstract: Recently, local binary patterns (LBP) based descriptors and sparse representation based classification (SRC) have both become prominent techniques in face recognition. Preliminary techniques for combining LBP and SRC have been proposed in the literature. However, the state-of-the-art method suffers from the “curse of dimensionality” in real-world scenarios. In this paper, a novel face recognition algorithm combining LBP with SRC is proposed, in which the dimensionality problem is resolved by divide-and-conquer and the discriminative power is strengthened via its pyramid architecture. The proposed face recognition method is evaluated on the AR Face Database and yields very impressive results.
Proceedings Article•10.1145/2072545.2072547•
Hierarchical spatial matching for medical image retrieval

Yang Song1, Weidong Cai1, Dagan Feng1•
University of Sydney1
29 Nov 2011
TL;DR: This work aims to effectively extract and represent the spatial context of pathological tissues, and design a novel hierarchical spatial matching (HSM) method based on the spatial pyramid matching.
Abstract: Content-based medical image retrieval is becoming an important tool for providing valuable information that assists physicians in making critical diagnostic decisions. While most existing works perform the retrieval based on low-level visual features, the pathological spatial context, which is critical for analysis of the disease characteristics, has been less studied. We thus aim to effectively extract and represent the spatial context of pathological tissues, and design a novel hierarchical spatial matching (HSM) method based on spatial pyramid matching. Our method is able to (1) handle the translation variations of the main pathological object; (2) describe the spatial information surrounding the pathological object at an adaptive scale; and (3) compute image similarities with an optimally weighted distance function. The proposed method shows better retrieval performance compared to other widely used techniques.
Patent•
Alignment of digital images and local motion detection for high dynamic range (HDR) imaging

Binu K. Mathew1•
Apple Inc.1
12 Jan 2011
TL;DR: In this paper, image pyramids are generated using reference and source images and the alignment vector for each level can be aggregated to determine a final alignment vector which can be used to shift the source image.
Abstract: Image alignment operations particularly suited for High Dynamic Range (HDR) image generation are described. Image pyramids may be generated using reference and source images. Difference bitmaps, based on a number of pixel shift combinations in the x and y directions, can be divided into tiles and analyzed and, for each pyramid level, an optimal shift direction determined. The tiles can then be pruned using a threshold such that only those tiles contributing up to the threshold are projected to a subsequent pyramid level. The alignment vector for each level can be aggregated to determine a final alignment vector which can be used to shift the source image. This process may be repeated for another source image, and the two source images and reference image, once aligned, may be merged to generate an HDR image.
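The coarse-to-fine aggregation of per-level shifts described in this abstract can be sketched as follows. This is a simplified illustration, not the patented procedure: it omits the difference bitmaps, tiling and threshold pruning, scores each candidate shift with a plain sum of squared differences, and uses wrap-around shifts via `np.roll`. The function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (even dimensions assumed)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels):
    """Return [full resolution, half resolution, quarter resolution, ...]."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def align(reference, source, levels=3):
    """Estimate (dy, dx) such that rolling `source` by it matches `reference`."""
    ref_pyr = build_pyramid(reference, levels)
    src_pyr = build_pyramid(source, levels)
    dy, dx = 0, 0
    # Start at the coarsest level and refine towards full resolution.
    for ref, src in zip(reversed(ref_pyr), reversed(src_pyr)):
        # A shift found at the coarser level doubles at this level.
        dy, dx = 2 * dy, 2 * dx
        best = None
        for cy in (-1, 0, 1):
            for cx in (-1, 0, 1):
                shifted = np.roll(src, (dy + cy, dx + cx), axis=(0, 1))
                err = np.sum((shifted - ref) ** 2)
                if best is None or err < best[0]:
                    best = (err, dy + cy, dx + cx)
        _, dy, dx = best
    return dy, dx
```

Each level only tests nine one-pixel corrections, so a pyramid with L levels can recover shifts of up to roughly 2^L - 1 pixels at a small fixed cost per level.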
Proceedings Article•10.1109/ICECTECH.2011.5941861•
Content based image retrieval using textural features based on pyramid-structure wavelet transform

Lidiya Xavier1, I. Thusnavis Bella Mary1, W Newton David Raj1•
Karunya University1
8 Apr 2011
TL;DR: A content-based image retrieval method based on texture features is used as a diagnosis aid in medical fields; the precision rate obtained is about 60% for DRD images.
Abstract: An image retrieval system is an effective and efficient tool for managing large image databases. A content-based image retrieval system allows the user to present a query image in order to retrieve images stored in the database according to their similarity to the query image. In this paper, a content-based image retrieval method is used as a diagnosis aid in medical fields. The main objective of this paper is to evaluate the retrieval system based on texture features. The texture features are extracted using a pyramidal wavelet transform. The major advantage of such an approach is that little human intervention is required. The method is evaluated on the Diabetic Retinopathy Database (DRD). The precision rate obtained is about 60% for DRD images.
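A pyramid-structured wavelet transform decomposes only the low-pass band again at each level, and subband energies are a common choice of texture feature. The sketch below illustrates that structure under stated assumptions: the abstract does not name the wavelet or the exact features, so the Haar wavelet, the mean-energy features and the function names are all illustrative choices.

```python
import numpy as np

def haar_step(img):
    """One level of an orthonormal 2D Haar transform.

    Returns the low-pass approximation followed by three detail subbands.
    Image dimensions are assumed to be even.
    """
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    detail_x = (a - b + c - d) / 2.0   # differences between columns
    detail_y = (a + b - c - d) / 2.0   # differences between rows
    detail_xy = (a - b - c + d) / 2.0  # diagonal differences
    return approx, detail_x, detail_y, detail_xy

def wavelet_texture_features(img, levels=3):
    """Mean energy of each detail subband per level, plus the final approximation."""
    approx = np.asarray(img, dtype=float)
    features = []
    for _ in range(levels):
        # Pyramid structure: only the approximation band is decomposed further.
        approx, *details = haar_step(approx)
        features.extend(np.mean(band ** 2) for band in details)
    features.append(np.mean(approx ** 2))
    return np.array(features)
```

Three levels give 3 x 3 + 1 = 10 features per image; a retrieval system could then rank database images by the distance between such feature vectors.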
Journal Article•10.1109/TIP.2010.2090534•
Size-Controllable Region-of-Interest in Scalable Image Representation

Chee Sun Won1, Shahram Shirani2•
Dongguk University1, McMaster University2
01 May 2011-IEEE Transactions on Image Processing
TL;DR: A scalable image representation with the ROI functionality in the spatial domain is proposed, which allows us to generate a hierarchy of images with arbitrary sizes.
Abstract: Differentiating the region-of-interest (ROI) from the non-ROI in an image, in terms of relative size as well as fidelity, is becoming an important functionality for future visual communication environments with a variety of display devices. In this paper, we propose a scalable image representation with the ROI functionality in the spatial domain, which allows us to generate a hierarchy of images with arbitrary sizes. The ROI functionality of our scalable representation is a result of a nonuniform grid transformation in the spatial domain, where only the center of the ROI and an expansion parameter need to be known. Our grid transformation guarantees no loss of information within the area of the ROI.
Book Chapter•10.1007/978-3-642-23535-1_27•
Actions in still web images: visualization, detection and retrieval

Piji Li1, Jun Ma1, Shuai Gao1•
Shandong University1
14 Sep 2011
TL;DR: A framework for human action retrieval in still web images by verb queries (for instance, "phoning") is described. It builds a group of visual discriminative instances for each action class, called "Exemplarlets", and employs Multiple Kernel Learning to learn an optimal combination of histogram intersection kernels.
Abstract: We describe a framework for human action retrieval in still web images by verb queries, for instance "phoning". Firstly, we build a group of visual discriminative instances for each action class, called "Exemplarlets". Thereafter we employ Multiple Kernel Learning (MKL) to learn an optimal combination of histogram intersection kernels, each of which captures a state-of-the-art feature channel. Our features include the distribution of edges, dense visual words and feature descriptors at different levels of a spatial pyramid. For a new image we can detect the hot-region using a sliding-window detector learnt via MKL. The hot-region can imply latent actions in the image. After the hot-region has been detected, we build an inverted index in the visual search path, which we call the Visual Inverted Index (VII). Finally, by fusing the visual search path and the text search path, we obtain accurate results relevant to either the text or the visual information. We show both the detection and retrieval results on our newly collected dataset of six actions and demonstrate improved performance over existing methods.
Journal Article•10.1016/J.PATCOG.2010.10.025•
Multi-scale 2D tracking of articulated objects using hierarchical spring systems

Nicole M. Artner1, Adrian Ion1, Walter G. Kropatsch1•
Vienna University of Technology1
01 Apr 2011-Pattern Recognition
TL;DR: This paper presents a flexible framework to build a target-specific, part-based representation for arbitrary articulated or rigid objects by employing a hierarchical, iterative optimization process on the proposed representation of structure and appearance.
Proceedings Article•10.1109/ICCVW.2011.6130377•
Fast and accurate environment modeling using three-dimensional occupancy grids

Katrin Pirker, Matthias Rüther, Horst Bischof, Gerald Schweighofer1•
Joanneum Research1
1 Nov 2011
TL;DR: This work presents an approach which reconstructs 3D environments using a probabilistic occupancy grid in real-time; operating on depth image pyramids reduces computation time, while a weighted interpolation scheme between neighboring pyramid layers boosts accuracy.
Abstract: Building a dense and accurate environment model out of range image data faces problems like sensor noise, extensive memory consumption or computation time. We present an approach which reconstructs 3D environments using a probabilistic occupancy grid in real-time. Operating on depth image pyramids reduces computation time, whereas a weighted interpolation scheme between neighboring pyramid layers boosts accuracy. In our experiments we compare our method with a state-of-the-art mapping procedure. Our results demonstrate that our method performs better. Finally, we demonstrate its viability by mapping a large indoor environment.
Patent•
Robust method for extracting two-dimensional code area in image

Junfeng Wang, Gao Lin, Chen Yi, Tang Peng, Tao Du 
23 Nov 2011
TL;DR: A robust method for extracting a two-dimensional code area in an image is disclosed, which comprises the following steps: first, building a multi-scale Gaussian image pyramid from an original image; second, partitioning each layer of the image pyramid; third, binarizing all the image blocks to obtain the binarization result of each layer; fourth, fusing the binary images across the scales to obtain an image partition result.
Abstract: A robust method for extracting a two-dimensional code area in an image is disclosed, which comprises the following steps: first, building a multi-scale Gaussian image pyramid from an original image; second, partitioning each layer of the image pyramid; third, binarizing all the image blocks to obtain the binarization result of each layer; fourth, fusing the binary images across the scales to obtain an image partition result; and finally, searching for the two-dimensional code area in units of image blocks and extracting the two-dimensional code area in the image by analyzing and calculating the convex hull of a characteristic point set through a connecting body. The disclosed method can handle common complex cases such as uneven illumination and background interference, and has good robustness. Because a coarse-to-fine search policy over multiple scales is used, the method is simple and quick, and it balances real-time performance against accuracy.
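The first four steps listed in this abstract (pyramid, partition, block-wise binarization, multi-scale fusion) can be sketched as below. The abstract does not specify the smoothing kernel, the threshold rule or the fusion rule, so the [1, 2, 1]/4 kernel, the per-block mean threshold and the majority vote used here are assumptions made only for illustration, as are the function names. The final search and convex-hull steps are not covered.

```python
import numpy as np

def blur_and_halve(img):
    """Smooth with a separable [1, 2, 1]/4 kernel, then keep every other pixel."""
    padded = np.pad(img, 1, mode="edge")
    rows = (padded[:-2, :] + 2 * padded[1:-1, :] + padded[2:, :]) / 4.0
    both = (rows[:, :-2] + 2 * rows[:, 1:-1] + rows[:, 2:]) / 4.0
    return both[::2, ::2]

def binarize_blocks(img, block):
    """Threshold each block x block tile at its own mean (True = darker pixel)."""
    out = np.zeros(img.shape, dtype=bool)
    for y in range(0, img.shape[0], block):
        for x in range(0, img.shape[1], block):
            tile = img[y:y + block, x:x + block]
            out[y:y + block, x:x + block] = tile < tile.mean()
    return out

def fuse_scales(img, levels=3, block=8):
    """Binarize every pyramid layer block-wise and fuse them by majority vote."""
    layer = np.asarray(img, dtype=float)
    h, w = layer.shape
    votes = np.zeros((h, w))
    for level in range(levels):
        mask = binarize_blocks(layer, block)
        scale = 2 ** level
        # Bring the layer's binary map back to full resolution before voting.
        up = np.repeat(np.repeat(mask, scale, axis=0), scale, axis=1)[:h, :w]
        votes += up
        layer = blur_and_halve(layer)
    return votes > levels / 2
```

Thresholding each block at its own mean is what gives some tolerance to uneven illumination in this sketch, since a global brightness gradient changes the threshold from block to block.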
Proceedings Article•10.1109/RADAR.2011.5960546•
SAR target classification using sparse representations and spatial pyramids

Peter Knee1, Jayaraman J. Thiagarajan1, Karthikeyan Natesan Ramamurthy1, Andreas Spanias1•
Arizona State University1
23 May 2011
TL;DR: Results using a linear SVM for classification along with SIFT, FFT-magnitude and DCT-based local feature descriptors indicate that the use of a single element from the dictionary to describe the local features is sufficient for accurate target classification.
Abstract: We consider the problem of automatically classifying targets in synthetic aperture radar (SAR) imagery using image partitioning and sparse representation based feature vector generation. Specifically, we extend the spatial pyramid approach, in which the image is partitioned into increasingly fine sub-regions, by using a sparse representation to describe the local features in each sub-region. These feature descriptors are generated by identifying those dictionary elements, created via k-means clustering, that best approximate the local features for each sub-region. By systematically combining the results at each pyramid level, classification ability is facilitated by approximate geometric matching. Results using a linear SVM for classification along with SIFT, FFT-magnitude and DCT-based local feature descriptors indicate that the use of a single element from the dictionary to describe the local features is sufficient for accurate target classification. Continuing work both in feature extraction and classification will be discussed, with emphasis placed on the need for classification amid heavy target occlusion.
Book Chapter•10.1007/978-3-642-24031-7_60•
Evaluating feature combination in object classification

Jian Hou1, Bo-Ping Zhang1, Nai-Ming Qi2, Yong Yang2•
Xuchang University1, Harbin Institute of Technology2
26 Sep 2011
TL;DR: This paper studied the combination of various popular descriptors, kernels and spatial pyramid levels through extensive experiments on four datasets of diverse object types to provide some empirical guidelines on designing experimental setups and combination algorithms in object classification.
Abstract: Feature combination is used in object classification to combine the strength of multiple complementary features and yield a more powerful feature. While some work can be found in literature to calculate the weights of features, the selection of features used in combination is rarely touched. Different researchers usually use different sets of features in combination and obtain different results. It's not clear to which degree the superior combination results should be attributed to the combination methods and not the carefully selected feature sets. In this paper we evaluate the impact of various feature-related factors on feature combination performance. Specifically, we studied the combination of various popular descriptors, kernels and spatial pyramid levels through extensive experiments on four datasets of diverse object types. As a result, we provide some empirical guidelines on designing experimental setups and combination algorithms in object classification.
Patent•
Two-dimensional image sequence based three-dimensional reconstruction method of target

Ligang Wu, Xutao Li, Chenghu Yang, Hongyan Zhao
25 May 2011
TL;DR: This patent describes a method for three-dimensional reconstruction of a target from a two-dimensional image sequence, based on the scale-invariant feature transform (SIFT) algorithm.
Abstract: The invention discloses a method for three-dimensional reconstruction of a target from a two-dimensional image sequence. It addresses the problem that traditional image-based three-dimensional reconstruction methods have low reconstruction precision because many points need to be reconstructed and the computational load is large. The method comprises the following steps: using a camera to obtain a two-dimensional image sequence of the target, computing and matching features for each image with the scale-invariant feature transform (SIFT) algorithm, and calculating the geometric relationship between images; performing corner detection on each image in the Gaussian scale pyramid generated while running the SIFT algorithm, to obtain multi-scale corner features of the images; taking each SIFT matching point as a center, searching for the corresponding corners in each image within a limited, distance-constrained range, and matching the corners found across images to obtain matched corners; and reconstructing the matched corners in three dimensions according to the projection matrix of the camera, thereby reconstructing the target. The method is applied to the three-dimensional reconstruction of targets.
Proceedings Article•10.1109/CMSP.2011.37•
Pyramid-Based Multi-scale LBP Features for Face Recognition

Wei Wang1, Weimin Chen1, Dongxia Xu1•
Kunming University1
14 May 2011
TL;DR: Experimental results on ORL and FERET face databases show that the proposed LBP representation is highly efficient with good performance in face recognition and is robust to illumination, facial expression and position variation.
Abstract: To efficiently extract local and global features in face description and recognition, a pyramid-based multi-scale LBP approach is proposed. Firstly, the face image pyramid is constructed through multi-scale analysis. Then the LBP operator is applied to each level of the image pyramid to extract facial features under various scales. Finally, all the extracted features are concatenated into an enhanced feature vector which is used as the face descriptor. Experimental results on ORL and FERET face databases show that the proposed LBP representation is highly efficient with good performance in face recognition and is robust to illumination, facial expression and position variation.
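The three steps in this abstract (build a pyramid, apply LBP at each level, concatenate) can be sketched as below. This is an illustration under stated assumptions rather than the paper's implementation: the pyramid is built by 2x2 averaging, the LBP operator is the basic 8-neighbour form with radius 1, and each level contributes one normalised 256-bin histogram. The abstract does not specify any of these choices, and all function names are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve resolution by averaging 2x2 blocks (assumed pyramid step)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def lbp8(img):
    """Basic 8-neighbour LBP code for every interior pixel."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbours in clockwise order starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes

def pyramid_lbp_descriptor(img, levels=3):
    """Concatenate normalised LBP histograms from each pyramid level."""
    img = np.asarray(img, dtype=float)
    parts = []
    for _ in range(levels):
        hist = np.bincount(lbp8(img).ravel(), minlength=256).astype(float)
        parts.append(hist / max(hist.sum(), 1.0))
        img = downsample2(img)
    return np.concatenate(parts)
```

For a `levels=3` pyramid the descriptor has 768 entries. Face recognition pipelines often compute LBP histograms per image block rather than per whole image; that refinement is left out of this sketch.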
Patent•
POS auxiliary aviation image matching method

Shunping Ji, Xiuxiao Yuan, Zhenli Wu
24 Aug 2011
TL;DR: This patent proposes a POS-aided method for matching aerial images, which uses the exterior orientation elements obtained from the POS to construct a corresponding epipolar line constraint equation and predict the initial parallax of an image.
Abstract: The invention discloses a POS-aided method for matching aerial images, which comprises the following steps: first, using the exterior orientation elements obtained from the POS to construct a corresponding epipolar line constraint equation and predict the initial parallax of the images; then, building an image pyramid based on the initial parallax and performing approximately one-dimensional, epipolar-constrained image correlation layer by layer through the pyramid; and finally, applying least-squares image matching to confirm corresponding image points and reject mismatched points, thereby obtaining the corresponding image points of the images to be matched. Using POS-aided image matching to measure image points automatically has several advantages: the application potential of the POS can be fully exploited; the matching rate and efficiency of automatic tie-point transfer are improved; and it solves the problem that tie points are sometimes so difficult to match that they must be measured manually and interactively, for example when the rotation angles of certain images are too large, the image texture is weak, or the terrain relief is large.
Patent•
Chinese painting image identifying method based on local semantic concept

Hong Bao, Songhe Feng, Nan Zhang, Haitao Lou, Difei Wang, Weiguo Pan 
11 May 2011
TL;DR: This patent proposes a Chinese painting image identification method based on a local semantic concept, whose first step is to collect the image of the Chinese painting work to be identified with a scanning device and store it on a computer.
Abstract: The invention relates to a Chinese painting image identification method based on a local semantic concept, which comprises the following steps: 1) collecting images of the Chinese painting works to be identified with a scanning device and storing them on a computer; 2) dividing the collected images into a training sample set and a testing sample set by random sampling; 3) extracting a salient region image from each painting image in the training and testing sample sets using a visual attention model; 4) building a bag-of-words model for the painting images and their corresponding salient region images in the training sample set; 5) generating two corresponding spatial pyramid feature histograms from the bag-of-words model and a spatial pyramid model; 6) fusing the two spatial pyramid feature histograms generated in step 5) using a serial fusion method; and 7) identifying the Chinese painting images in the testing sample set using one or more classification methods among clustering, K nearest neighbor, neural network and support vector machine methods, and outputting the identification result as a recognition accuracy rate and a confusion matrix.
...
