Scispace (Formerly Typeset)
Showing papers on "Pyramid (image processing)" published in 2001
Proceedings Article•10.1109/CVPR.2001.990645•
Compact representation of bidirectional texture functions

Oana G. Cula1, Kristin J. Dana1•
Rutgers University1
1 Dec 2001
TL;DR: A representation is constructed that captures the underlying statistical distribution of features in the image texture, as well as the variations in this distribution with viewing and illumination direction. The result is a compact representation and a recognition method in which a single novel image of unknown viewing and illumination direction can be classified efficiently.
Abstract: A bidirectional texture function (BTF) describes image texture as it varies with viewing and illumination direction. Many real world surfaces such as skin, fur, gravel, etc. exhibit fine-scale geometric surface detail. Accordingly, variations in appearance with viewing and illumination direction may be quite complex due to local foreshortening, masking and shadowing. Representations of surface texture that support robust recognition must account for these effects. We construct a representation which captures the underlying statistical distribution of features in the image texture as well as the variations in this distribution with viewing and illumination direction. The representation combines clustering to learn characteristic image features and principal components analysis to reduce the space of feature histograms. This representation is based on a core image set as determined by a quantitative evaluation of importance of individual images in the overall representation. The result is a compact representation and a recognition method where a single novel image of unknown viewing and illumination direction can be classified efficiently. The CUReT (Columbia-Utrecht reflectance and texture) database is used as a test set for evaluation of these methods.

227 citations

Proceedings Article•10.1109/ICIP.2001.958075•
Pyramidal directional filter banks and curvelets

Minh N. Do, Martin Vetterli1•
École Polytechnique Fédérale de Lausanne1
7 Oct 2001
TL;DR: A flexible multiscale and directional representation for images is proposed that combines directional filter banks with the Laplacian pyramid to provide a sparse representation for two-dimensional piecewise smooth signals resembling images.
Abstract: A flexible multiscale and directional representation for images is proposed. The scheme combines directional filter banks with the Laplacian pyramid to provide a sparse representation for two-dimensional piecewise smooth signals resembling images. The underlying expansion is a frame and can be designed to be a tight frame. Pyramidal directional filter banks provide an effective method to implement the digital curvelet transform. The regularity issue of the iterated filters in the directional filter bank is examined.

185 citations

Book•10.1007/978-1-4615-1529-6•
Foundations of Image Understanding

Larry S. Davis
1 Oct 2001
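TL;DR: A collection of sixteen chapters by contributing authors on the foundations of image understanding, with topics including digital geometry and topology, texture classification and segmentation, relaxation labeling, a pyramid framework for real-time computer vision, and volumetric scene reconstruction from multiple views.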
Abstract: Preface. Contributing Authors. 1. Summation A. Rosenfeld. 2. Digital Geometry - The Birth of a New Discipline R. Klette. 3. Digital Topology T.Y. Kong. 4. Fuzzy Mathematics J.N. Mordeson. 5. Picture Languages A. Nakamura. 6. Parallel Image Processing A.Y. Wu. 7. Object Representations H. Samet. 8. Texture Classification and Segmentation R. Chellappa, B.S. Manjunath. 9. Edge Measures Using Similarity Regions M.K. Singh, N. Ahuja. 10. Relaxation Labeling: 25 Years and Still Iterating S.W. Zucker. 11. From a Robust Hierarchy to a Hierarchy of Robustness P. Meer. 12. A Pyramid Framework for Real-Time Computer Vision P.J. Burt. 13. On the Computational Modeling of Human Vision J. Beck. 14. Statistics Explains Geometrical Optical Illusions C. Fermuller, Y. Aloimonos. 15. Optics for OmniStereo Imaging Y. Pritch, et al. 16. Volumetric Scene Reconstruction from Multiple Views C.R. Dyer. Index.

96 citations

Journal Article•10.1016/S0030-4018(01)01462-6•
Image representation and compression with the fractional Fourier transform

I Şamil Yetik1, M. Alper Kutay2, Haldun M. Ozaktas3•
University of Illinois at Chicago1, Scientific and Technological Research Council of Turkey2, Bilkent University3
01 Jan 2001-Optics Communications
TL;DR: The results presented correspond to the basic method without any refinement or combination with other techniques, suggesting that the approach may hold promise for future development.

47 citations

Journal Article•10.1016/S0164-1212(00)00095-9•
A fast content-based indexing and retrieval technique by the shape information in large image database

Dong-Ho Lee1, Hyoung-Joo Kim1•
Seoul National University1
01 Mar 2001-Journal of Systems and Software
TL;DR: An efficient content-based image retrieval (CBIR) system is presented that employs the shape information of images to facilitate retrieval; its image indexing method is shown to support faster retrieval than other multi-dimensional indexing methods such as the R*-tree.

32 citations

Journal Article•10.1142/S1469026801000342•
Learning iterative image reconstruction in the neural abstraction pyramid

Sven Behnke1•
Free University of Berlin1
01 Dec 2001-International Journal of Computational Intelligence and Applications
TL;DR: This work proposes to use recurrent neural networks for both analysis and synthesis of image reconstruction, which makes it possible to use partial results as context information to resolve ambiguities.
Abstract: Successful image reconstruction requires the recognition of a scene and the generation of a clean image of that scene. We propose to use recurrent neural networks for both analysis and synthesis. The networks have a hierarchical architecture that represents images in multiple scales with different degrees of abstraction. The mapping between these representations is mediated by a local connection structure. We supply the networks with degraded images and train them to reconstruct the originals iteratively. This iterative reconstruction makes it possible to use partial results as context information to resolve ambiguities. We demonstrate the power of the approach using three examples: superresolution, fill-in of occluded parts, and noise removal/contrast enhancement. We also reconstruct images from sequences of degraded images.

28 citations

Patent•
Apparatus and method for generating a three-dimensional representation from a two-dimensional image

John D. Ives, Timothy Ridgewood Parr
8 Mar 2001
TL;DR: An apparatus and method for generating or obtaining a three-dimensional representation, and in particular a three-dimensional image, from a two-dimensional image.
Abstract: The present invention pertains to an apparatus and method for generating and/or for obtaining a three-dimensional representation from a two-dimensional image and, in particular, to an apparatus and method for generating a three-dimensional image from the two-dimensional image.

27 citations

Proceedings Article•10.1109/ICIP.2001.958258•
A fast algorithm for accurate content-adaptive mesh generation

Yongyi Yang1, Miles N. Wernick, Jovan G. Brankov•
Illinois Institute of Technology1
7 Oct 2001
TL;DR: A theoretical basis for a computationally efficient approach to content-adaptive mesh generation used for image representation is provided, which leads to an improved version of the algorithm.
Abstract: Previously, we proposed a computationally efficient approach to content-adaptive mesh generation used for image representation (see Lee, J. et al., IEEE Int. Conf. Image Proc., 2000). We now provide a theoretical basis for that method, which leads to an improved version of the algorithm. An error bound is derived for a mesh representation of an image based on the theory of function interpolation. From this result, a more accurate scheme is proposed for placement of mesh elements in the image domain according to the image content. Experimental results, compared to other methods, show that a highly accurate image representation can be obtained at extremely low computational cost by the proposed technique.

24 citations

Journal Article•10.1016/S0167-8655(00)00063-5•
Soft image segmentation by weighted linked pyramid

David Prewer1, Les Kitchen1•
University of Melbourne1
01 Feb 2001-Pattern Recognition Letters
TL;DR: It is proposed that soft segmentation is a more natural way to segment digital image data than crisp segmentation, and one method of deriving a soft segmentation from a weighted linked pyramid algorithm is shown.

24 citations

Proceedings Article•10.1109/DFUA.2001.985909•
Quality assessment of decision-driven pyramid-based fusion of high resolution multispectral with panchromatic image data

Bruno Aiazzi, Luciano Alparone, Stefano Baronti, Ivan Pippi
8 Nov 2001
TL;DR: In this article, the generalized Laplacian pyramid is used to fuse multispectral data with high-resolution panchromatic images, and a decision based on thresholding the local CC is utilized to check the physical congruence of fusion, while the ratio of local RMSs between the two images provides a space-varying gain factor by which the injected highpass contribution is equalized.
Abstract: This work presents a general and formal solution to the problem of fusion of multispectral data with high-resolution panchromatic images. The method relies on the generalized Laplacian pyramid, which is an oversampled structure obtained by subtracting from an image its lowpass version, and selectively performs spatial-frequencies spectrum substitution from one image to another. The novelty of the present work is that a decision based on thresholding the local CC is utilized to check the physical congruence of fusion, while the ratio of local RMSs between the two images provides a space-varying gain factor by which the injected highpass contribution is equalized. Since the pyramid decomposition is not critically-subsampled, possible impairments in the fused images, due to missing cancellation of aliasing terms, are avoided. Quantitative results are presented and discussed on simulated SPOT 5 data of an urban area (2.5 m P, 10 m XS) obtained from the MIVIS airborne imaging spectrometer.

23 citations
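
The Laplacian pyramid named in the abstracts above (each level stores an image minus its low-pass version) can be sketched on a 1D signal. This is a minimal sketch of a simple decimated construction, not the generalized pyramid of the fusion paper or the directional filter bank scheme; the function names, the [1, 2, 1]/4 kernel and the linear-interpolation expand step are assumptions chosen for illustration.

```python
def reduce_(signal):
    """Low-pass with a [1, 2, 1]/4 kernel (edges clamped), then keep every second sample."""
    n = len(signal)
    smooth = [
        (signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]
    return smooth[::2]


def expand(coarse, size):
    """Upsample a coarse level back to `size` samples by linear interpolation."""
    out = []
    for i in range(size):
        left = coarse[i // 2]
        right = coarse[min(i // 2 + 1, len(coarse) - 1)]
        out.append(left if i % 2 == 0 else (left + right) / 2)
    return out


def build_laplacian_pyramid(signal, levels):
    """Return `levels` detail bands (finest first) followed by the low-pass residual."""
    pyramid = []
    current = list(signal)
    for _ in range(levels):
        coarse = reduce_(current)
        predicted = expand(coarse, len(current))
        # Detail band = current level minus its low-pass prediction.
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition by adding each detail band back, coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        predicted = expand(current, len(detail))
        current = [d + p for d, p in zip(detail, predicted)]
    return current
```

Because every detail band is stored at the resolution of the level it came from, the pyramid holds more samples than the input (it is oversampled), and adding the bands back in reverse order recovers the signal.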

Patent•
Encoding method for compression of video sequence

Felts B, Pesquet-Popescu B, Bottreau
14 Nov 2001
TL;DR: An encoding method for video compression in which each group of frames is decomposed by a 3D wavelet transform and coded with a SPIHT-based algorithm; the set significance level of each subtree is computed and stored during initialization, so that the sorting step compares stored levels instead of recomputing the significance of a tree.
Abstract: The invention relates to an encoding method for the compression of a video sequence including successive frames organized in groups of frames. Each frame is decomposed by means of a three-dimensional (3D) wavelet transform leading to a given number of successive resolution levels. This method is based on the SPIHT algorithm, which transforms the original set of picture elements (pixels) of each group of frames into transform coefficients constituting a hierarchical pyramid. A spatio-temporal orientation tree defines the spatio-temporal relationship within this pyramid: its roots are formed with the pixels of the approximation subband resulting from the 3D wavelet transform, and the offspring of each of these pixels is formed with the pixels of the higher subbands corresponding to the image volume defined by these root pixels. According to the invention, a full exploration of the subbands is performed during the initialization step of the process, and the set significance level of each subtree in the root pixels is calculated and stored. In the sorting step of the process, a comparison between said set significance level and the current significance level n replaces the call to the function that computes the significance of a tree relative to n.
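
The stored-significance idea in this abstract can be illustrated with a small sketch. It assumes the usual SPIHT convention that a set is significant with respect to n when some coefficient in it has magnitude at least 2^n; the `Node` class, the function names and the integer coefficients are hypothetical, and nothing here reproduces the patented encoder itself.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    coeff: int                      # quantized wavelet coefficient (assumed integer)
    children: list = field(default_factory=list)
    set_level: int = -1             # filled in by precompute_levels


def magnitude_level(value):
    """Largest n with |value| >= 2**n, or -1 when value is 0."""
    return abs(value).bit_length() - 1


def precompute_levels(node):
    """One bottom-up pass that stores, per node, the level of its largest descendant."""
    level = -1
    for child in node.children:
        below_child = precompute_levels(child)
        level = max(level, magnitude_level(child.coeff), below_child)
    node.set_level = level
    return level


def descendants_significant(node, n):
    """Single comparison against the stored level."""
    return node.set_level >= n


def descendants_significant_by_scan(node, n):
    """Reference version that rescans every descendant on each call."""
    stack = list(node.children)
    while stack:
        current = stack.pop()
        if abs(current.coeff) >= 2 ** n:
            return True
        stack.extend(current.children)
    return False
```

After one call to `precompute_levels` on each root, every significance query during sorting reduces to an integer comparison, at the cost of one stored level per node.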
Journal Article•10.1007/BF03190354•
The skeleton structure--an improved compression algorithm with perfect reconstruction.

Dragos Nicolae Vizireanu1, C. Pirnog1, V. Lazarescu1, A. Vizireanu1•
Politehnica University of Bucharest1
01 Jun 2001-Journal of Digital Imaging
TL;DR: An improved morphological image representation, usable for image compression with very high compression rates, is presented.
Abstract: This article presents an improved morphological image representation that can be used for image compression, obtaining very high compression rates. The new image representation described in this work is called skeleton structure and is a natural extension of the morphologic structure. This article will present its theoretical background, introduce the new representation, and show some application examples.
Proceedings Article•10.1109/IJCNN.2001.938470•
A multilayer RBF network and its supervised learning

Jinhui Chao1, M. Hoshino1, T. Kitamura1, T. Masuda1•
Chuo University1
15 Jul 2001
TL;DR: Simulations show higher representation and generalization capability of the proposed networks compared with RBF networks and multilayer networks with sigmoid activation functions.
Abstract: A general form of multilayer RBF networks is introduced. Complete supervised training rules for parameters are also presented. To achieve global convergence we apply a global optimization algorithm called the magic-brush method. This network can be naturally extended into a pyramid topology. Simulations show higher representation and generalization capability of the proposed networks compared with RBF networks and multilayer networks with sigmoid activation functions.
Journal Article•10.1016/S0167-8655(01)00106-4•
New progressive image transmission based on quadtree and shading approach with resolution control

Kuo-Liang Chung1, Shou-Yi Tseng1•
National Taiwan University of Science and Technology1
01 Dec 2001-Pattern Recognition Letters
TL;DR: Experimental results reveal that, at similar peak signal-to-noise ratio (PSNR) and bits per pixel (bpp), the proposed PIT scheme has a better feature-preserving capability than the reduced-difference pyramid PIT scheme.
Journal Article•10.1016/S0167-8655(00)00136-7•
An improved search algorithm for vector quantization using mean pyramid structure

Su-Juan Lin1, Kuo-Liang Chung1, Lung-Chun Chang1•
National Taiwan University of Science and Technology1
01 Mar 2001-Pattern Recognition Letters
TL;DR: An improved search algorithm for vector quantization using mean pyramid structure and the range search approach is presented, which reduces search times and improves the previous result by Lee and Chen.
Book Chapter•10.1007/3-540-47778-0_37•
Robust Multi-scale Non-rigid Registration of 3D Ultrasound Images

Ioannis Pratikakis, Christian Barillot, Pierre Hellier
07 Jul 2001-Lecture Notes in Computer Science
TL;DR: The minimization scheme of an automatic 3D non-rigid registration method is embedded in a multi-scale framework, and a coarse-to-fine focusing strategy is introduced that improves the accuracy of the registration process.
Abstract: In this paper, we embed the minimization scheme of an automatic 3D non-rigid registration method in a multi-scale framework. The initial model formulation was expressed as a robust multiresolution and multigrid minimization scheme. At the finest level of the multiresolution pyramid, we introduce a focusing strategy from coarse-to-fine scales which leads to an improvement of the accuracy in the registration process. A focusing strategy has been tested for a linear and a non-linear scale-space. Results on 3D Ultrasound images are discussed.
Proceedings Article•10.1117/12.421097•
Optimizing multiresolution pixel-level image fusion

Vladimir Petrovic, Costas Xydeas1•
Lancaster University1
22 Mar 2001-Proceedings of SPIE
TL;DR: Fusion systems based on derivatives of Gaussian low-pass pyramid and the Discrete Wavelet transform are examined and their performances versus decomposition/selection parameters are defined and compared.
Abstract: A number of pixel level image fusion schemes have been proposed in the past which combine registered input sensor images into a single fused output image. The two general objectives that underpin the operations of these schemes are a) the transfer of all visually important information from input images into a fused image and b) the minimization of undesirable distortions and artifacts which may be generated in the fused image. Fusion is usually achieved by i) the decomposition of input images into representations of their spectral bands and ii) a selection process which transfers information from input bands to yield the required representation of a single fused output image. Furthermore, decomposition is often based on multi-resolution pyramidal representations and the selection process operates on corresponding input image pyramidal levels using selection templates which focus on local spectral characteristics. The performance of such a multi-resolution pixel level image fusion system depends primarily on the actual decomposition and selection algorithms used. Thus for a given decomposition selection arrangement, fusion performance is dependent on the pyramid size (i.e. number of levels) and template size. Pyramid and template sizes on the other hand greatly influence the system's computational complexity. This paper is concerned with the performance optimization/characterization of several multi-resolution image fusion schemes in general, and with performance/complexity trade-offs in particular. Performance is measured using a subjectively meaningful, objective fusion metric which has been proposed recently by the authors and which is based on the preservation of image edge information. Thus fusion systems based on derivatives of Gaussian low-pass pyramid and the Discrete Wavelet transform are examined and their performances versus decomposition/selection parameters are defined and compared. The performance/algorithmic complexity results presented for these multi-resolution fusion systems highlight clearly their strengths and weaknesses.
Journal Article•10.1016/S0167-8655(01)00002-2•
Shape and topology preserving multi-valued image pyramids for multi-resolution skeletonization

Gunilla Borgefors, Giuliana Ramella1, Gabriella Sanniti di Baja1•
ARCO1
01 May 2001-Pattern Recognition Letters
TL;DR: Starting from a binary digital image, a multi-valued pyramid is built and suitably treated, so that shape and topology properties of the pattern are preserved satisfactorily at all resolution levels.
Book Chapter•10.1007/3-540-45453-5_133•
Adaptive Processing of Tree-Structure Image Representation

Zhiyong Wang1, Zheru Chi1, Dagan Feng1, Siu-Yeung Cho1•
Hong Kong Polytechnic University1
24 Oct 2001
TL;DR: In this paper, a segmentation-free tree-structure image representation is presented and a back-propagation through structure (BPTS) algorithm is adopted in order to learn the structure representation.
Abstract: Much research on image analysis and processing has been carried out for the last few decades. However, it is still challenging to represent the image contents effectively and satisfactorily. In this paper, a segmentation-free tree-structure image representation is presented. In order to learn the structure representation, a back-propagation through structure (BPTS) algorithm is adopted. Experiments on plant image classification and retrieval refining using only six visual features were conducted on a plant image database and a natural scene image database, respectively. Encouraging results have been achieved.
Journal Article•10.1016/S0031-3203(00)00008-X•
An adaptive algorithm for conversion from quadtree to chain codes

Frank Y. Shih1, Wai Tak Wong1•
New Jersey Institute of Technology1
01 Mar 2001-Pattern Recognition
TL;DR: An adaptive algorithm is presented for converting the quadtree representation of a binary image to its chain code representation; the chain codes of the quadtree that results from a Boolean operation on two quadtrees are constructed by re-using the original chain codes.
Book Chapter•10.1201/B14024-7•
System Integration and Signal/Image Processing

Joseph A. Izatt, Andrew M. Rollins, Rujchai Ung-arunyawee, Siavash Yazdanfar, Manish D. Kulkarni 
2 Nov 2001
Proceedings Article•10.1145/500141.500189•
An image watermarking technique using pyramid transform

Qiang Cheng1, Thomas S. Huang1•
University of Illinois at Urbana–Champaign1
1 Oct 2001
TL;DR: An image watermarking technique based on pyramid transforms that has high imperceptibility, good robustness, and accurate detection and can be applied to copyright notification, enforcement, and fingerprinting is proposed.
Abstract: An image watermarking technique based on pyramid transforms is proposed. An arbitrary binary pattern is formed into an effective hypothesized pattern and transmitted as a watermark. Multiresolution pyramid transforms are applied to host images, whose characteristics are exploited to embed the watermark. The detector is designed to be effective to a wide range of original signal sources and noise sources. The scheme is designed to achieve efficient trade-offs between perceptual invisibility, robustness and trustworthy detection. The experiments demonstrate that the proposed technique has high imperceptibility, good robustness, and accurate detection. It can be applied to copyright notification, enforcement, and fingerprinting.
Proceedings Article•10.1109/ICIP.2001.958945•
Multispectral image retrieval using vector quantization

Toshio Uchiyama, Masahiro Yamaguchi, Nagaaki Ohyama, Naoki Mukawa, H. Kaneko 
7 Oct 2001
TL;DR: An efficient feature representation based on vector quantization and a novel method for retrieving images by quantizing each image adaptively are presented.
Abstract: A novel method for multispectral image retrieval is presented. This method uses a representation of image features based on vector quantization. Feature representation is important for image retrieval, but there are difficulties in applying conventional histogram-based representations to multispectral images. We developed an efficient feature representation and a novel method for the retrieval of images by quantizing each image adaptively.
Journal Article•10.1006/RTIM.2001.0268•
A Pyramid Approach to Motion Tracking

Jason Z. Zhang1, Q. M. Jonathan Wu1•
National Research Council1
01 Dec 2001-Real-time Imaging
TL;DR: To demonstrate the superiority of the multiresolution tracking algorithm in connection with parallel computation, a scheme for mapping the tracking algorithm into a Transputer-based pyramidal parallel computing structure is proposed in the paper.
Abstract: This paper presents a multiresolution approach to visual motion tracking. In the approach, the foveation mechanism of the human visual system is used to model the multiresolution information perception algorithms of a Transputer-based pyramid visual tracking system. The video images of a moving target are transformed by a Gaussian pyramid generation algorithm into pyramidal data structures, each consisting of multiple image layers with different resolutions. The tracking of a moving target over an image sequence is accomplished by performing a foveal search that is based on an iterative intensity pattern correlation along the multiple resolution levels of the Gaussian pyramids of two successive images. Analyses are given as to the efficiency and accuracy of our tracking algorithm, showing that the algorithm is over 160 times faster than conventional mono-resolution tracking methods, with the tracking error within one pixel. To demonstrate the superiority of the multiresolution tracking algorithm in connection with parallel computation, a scheme for mapping the tracking algorithm into a Transputer-based pyramidal parallel computing structure is proposed in the paper. Experimental results demonstrate good performance of the proposed approach.
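
The coarse-to-fine search along pyramid levels described in this abstract can be sketched in one dimension. This is an illustrative sketch only: it matches with a sum of absolute differences rather than the paper's intensity pattern correlation, and the function names, the [1, 2, 1]/4 smoothing kernel and the `radius` parameter are assumptions.

```python
def downsample(signal):
    """One pyramid step: [1, 2, 1]/4 smoothing (edges clamped), then drop every second sample."""
    n = len(signal)
    smooth = [
        (signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]
    return smooth[::2]


def gaussian_pyramid(signal, levels):
    """Return `levels` versions of the signal, finest first."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def sad(signal, template, offset):
    """Sum of absolute differences between the template and the window at `offset`."""
    return sum(abs(signal[offset + k] - t) for k, t in enumerate(template))


def best_offset(signal, template, candidates):
    """Pick the in-range candidate offset with the smallest mismatch."""
    valid = [c for c in candidates if 0 <= c <= len(signal) - len(template)]
    return min(valid, key=lambda c: sad(signal, template, c))


def coarse_to_fine_search(signal, template, levels=3, radius=1):
    """Search exhaustively at the coarsest level, then refine within `radius` at each finer level."""
    sig_pyr = gaussian_pyramid(signal, levels)
    tpl_pyr = gaussian_pyramid(template, levels)
    top_sig, top_tpl = sig_pyr[-1], tpl_pyr[-1]
    offset = best_offset(top_sig, top_tpl, range(len(top_sig) - len(top_tpl) + 1))
    for level in range(levels - 2, -1, -1):
        center = 2 * offset  # one coarse sample spans two finer samples
        offset = best_offset(
            sig_pyr[level], tpl_pyr[level], range(center - radius, center + radius + 1)
        )
    return offset
```

Only the coarsest level is searched exhaustively; each finer level evaluates at most `2 * radius + 1` candidates. A sketch this small does not substantiate the speed-up reported in the abstract.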
Patent•
Method, system, and software for signal processing using pyramidal decomposition

Albert D. Edgar1•
Eastman Kodak Company1
5 Feb 2001
TL;DR: A base signal is recursively decomposed and modified for a desired number of pyramid levels; at each level, the decomposed signal from the previous level is modified to improve one or more signal components or characteristics.
Abstract: A method, system, and software are disclosed for improving the quality of a signal. A base signal is recursively decomposed and modified for a desired number of pyramid levels. At each level, the decomposed signal from the previous level is modified to improve one or more signal components or characteristics. The modified signal from a given level is then decomposed for the next level of the pyramidal decomposition for each pyramid level. Starting at the second to last level of the pyramidal decomposition, the improved signal of the last pyramid level is recomposed and then combined with one or more signals from the current pyramid level, resulting in an improved signal for the current level. The recomposition and combination of the improved signal of the previous level occurs for each level until the top, or level 0, of the pyramidal decomposition is reached. The improved base signal may or may not be combined with the original base signal, depending on the desired outcome. The present invention finds particular application in photography and digital film processing, whereby the illustrated method may be used to improve image quality.
Patent•
Multiplierless pyramid filter

Tinku Acharya1•
Intel1
27 Dec 2001
TL;DR: A multiplierless pyramid filter is described comprising a sequence of scalable cascaded units, each comprising a delay unit and three adders coupled to produce a higher order pyramidally filtered output signal sample stream and state variable sample stream.
Abstract: A multiplierless pyramid filter is described comprising a sequence of scalable cascaded units, each of said units comprising a delay unit and three adders, said delay unit and adders being coupled to produce a higher order pyramidally filtered output signal sample stream and state variable sample stream from an input signal sample stream and a lower order pyramidally filtered output signal sample stream and state variable signal stream.
Proceedings Article•10.1109/ICIP.2001.958466•
Multiscale image processing using normal triangulated meshes

Maarten Jansen1, Hyeokho Choi1, S. Lavu1, Richard G. Baraniuk2•
Rice University1, Houston Methodist Hospital2
7 Oct 2001
TL;DR: This work proposes an image representation and processing framework using a multiscale triangulation of the grayscale function, and demonstrates the approximation performance of the normal mesh representation through mathematical analyses for simple functions and simulations for real images.
Abstract: Multiresolution triangulation meshes are widely used in computer graphics for 3D modeling of shapes. We propose an image representation and processing framework using a multiscale triangulation of the grayscale function. Triangles have the potential of approximating edges better than the blocky structures of tensor-product wavelets. Among the many possible triangulation schemes, normal meshes are natural for efficiently representing singularities in image data, thanks to their adaptivity to the smoothness of the modeled image. Our non-linear, multiscale image decomposition algorithm, based on this subdivision scheme, takes edges into account in a way that is closely related to wedgelets and curvelets. The highly adaptive property of the normal mesh construction provides a very efficient representation of images, which potentially outperforms standard wavelet transforms. We demonstrate the approximation performance of the normal mesh representation through mathematical analyses for simple functions and simulations for real images.
Proceedings Article•10.1109/ICIP.2001.958645•
Multiresolution Gaussian mixture models for visual motion estimation

Roland Wilson1, Andrew Calway•
University of Warwick1
7 Oct 2001
TL;DR: A new generalisation of scale-space and pyramids is introduced, the multiresolution Gaussian mixture model (MGMM), which combines statistical modelling with a spatial representation by applying the familiar concept of multiple resolutions to a Gaussian mixture representation of the image.
Abstract: This paper introduces a new generalisation of scale-space and pyramids, which combines statistical modelling with a spatial representation. The representation uses the familiar concept of multiple resolutions, but applied to a Gaussian mixture representation of the image - hence the title MGMM. It is shown that MGMM can approximate any probability density and can adapt to smooth motions. After a presentation of the theory, it is shown how MGMM can be applied to the estimation of visual motion.
Patent•
Vector quantization of images

William Paul Cockshott1•
University of Glasgow1
6 Mar 2001
TL;DR: A method of compressing an image is described in which digital data signals in a 2D image are formed into an image data pyramid with a number of layers, and each layer is processed to give a compressed encoding in an ordered list.
Abstract: A method of compressing an image is described in which digital data signals in a 2-dimensional image are formed into an image data pyramid with a number of layers, and each layer is processed to give a compressed encoding in an ordered list. The encoding with the largest quality gain factor is selected first and added to a compressed representation of the data array. This is repeated for the next largest gain factor and so on until a predetermined maximum is reached. Because the layers of the image data pyramid correspond to different frequency bands, the vector quantizations of these layers will only minimally interfere with one another. This allows a simple ordering of all possible gain contributions made by the compressed encodings to the compressed representation. This in turn allows a straightforward selection of the compressed encodings having the largest quality gain factors for compiling the compressed representation of the image.
Proceedings Article•10.1109/ISCAS.2001.921051•
Content adaptive motion estimation for mobile video encoders

A. Ahmed1, S.K. Nandy, P. Sathya•
Indian Institute of Science1
6 May 2001
TL;DR: A block matching motion estimation algorithm whose computations are content complexity adaptive, made macroblock adaptive by dynamically varying the number of candidate motion vectors passed to lower levels, depending on the frequency characteristics of the macroblock being matched and the complexity in the sequence for such characteristics.
Abstract: Power consumption has emerged as an important constraint in the design of mobile video encoders. As motion estimation accounts for the majority of the total computations involved in video encoding, the algorithm and architecture used affect the quality and power levels of the final solution. In this paper, we present a block matching motion estimation algorithm whose computations are content complexity adaptive. The basic framework used is the multi-resolution mean pyramid technique. The algorithm is made macroblock adaptive by dynamically varying the number of candidate motion vectors passed to lower levels, depending on the frequency characteristics of the macroblock being matched and the complexity in the sequence for such characteristics. We use the concept of a deviation pyramid in order to estimate the macroblock frequency characteristics. Simulation results show that for typical videophony sequences, the algorithm reduces computational complexity by a factor ranging from 15.5 to 74.0, while maintaining PSNR values close to that obtained by using the full-search block matching algorithm. Simple operations are used in the algorithm to ensure applicability of the proposed algorithm for hardware implementation.
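
The mean pyramid that this abstract uses as its basic framework can be sketched as repeated 2x2 block averaging. The sketch below is an assumption-laden illustration: the function names are hypothetical, image dimensions are assumed to be divisible by two at every level, and the candidate motion vector selection and deviation pyramid from the paper are not shown.

```python
def mean_reduce(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def mean_pyramid(image, levels):
    """Return `levels` images, full resolution first, each a 2x2 mean of the previous one."""
    pyramid = [[list(row) for row in image]]
    for _ in range(levels - 1):
        pyramid.append(mean_reduce(pyramid[-1]))
    return pyramid
```

In a block-matching setting, a search would typically start at the smallest level and pass its best candidates down to the larger ones, which is the step the abstract makes adaptive.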
