Top 62 papers published in the topic of Pyramid (image processing) in 2001

Showing papers on "Pyramid (image processing)" published in 2001

Proceedings Article•10.1109/CVPR.2001.990645•

Compact representation of bidirectional texture functions

Oana G. Cula¹, Kristin J. Dana¹•Institutions (1)

1 Dec 2001

TL;DR: A representation is constructed that captures the underlying statistical distribution of features in the image texture as well as the variations in this distribution with viewing and illumination direction. The result is a compact representation and a recognition method in which a single novel image of unknown viewing and illumination direction can be classified efficiently.

Abstract: A bidirectional texture function (BTF) describes image texture as it varies with viewing and illumination direction. Many real world surfaces such as skin, fur, gravel, etc. exhibit fine-scale geometric surface detail. Accordingly, variations in appearance with viewing and illumination direction may be quite complex due to local foreshortening, masking and shadowing. Representations of surface texture that support robust recognition must account for these effects. We construct a representation which captures the underlying statistical distribution of features in the image texture as well as the variations in this distribution with viewing and illumination direction. The representation combines clustering to learn characteristic image features and principal components analysis to reduce the space of feature histograms. This representation is based on a core image set as determined by a quantitative evaluation of importance of individual images in the overall representation. The result is a compact representation and a recognition method where a single novel image of unknown viewing and illumination direction can be classified efficiently. The CUReT (Columbia-Utrecht reflectance and texture) database is used as a test set for evaluation of these methods.

227 citations

Proceedings Article•10.1109/ICIP.2001.958075•

Pyramidal directional filter banks and curvelets

Minh N. Do, Martin Vetterli¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

7 Oct 2001

TL;DR: A flexible multiscale and directional representation for images is proposed that combines directional filter banks with the Laplacian pyramid to provide a sparse representation for two-dimensional piecewise smooth signals resembling images.

Abstract: A flexible multiscale and directional representation for images is proposed. The scheme combines directional filter banks with the Laplacian pyramid to provide a sparse representation for two-dimensional piecewise smooth signals resembling images. The underlying expansion is a frame and can be designed to be a tight frame. Pyramidal directional filter banks provide an effective method to implement the digital curvelet transform. The regularity issue of the iterated filters in the directional filter bank is examined.

185 citations
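
The Laplacian pyramid half of the scheme described in the abstract above can be illustrated in one dimension. The following is a minimal sketch, not the authors' filter bank: the pair-averaging `reduce_step`, the sample-repeating `expand`, and all function names are assumptions chosen for brevity, and the directional filter bank stage is omitted entirely. It also assumes the signal length is divisible by 2**levels.

```python
def reduce_step(signal):
    """Halve the length by averaging neighbouring pairs (crude lowpass + downsample)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def expand(signal, length):
    """Upsample to `length` samples by repeating each coarse sample twice."""
    out = []
    for value in signal:
        out.extend([value, value])
    return out[:length]


def build_laplacian_pyramid(signal, levels):
    """Return (bands, residual): one detail band per level plus the coarsest lowpass."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        coarse = reduce_step(current)
        predicted = expand(coarse, len(current))
        # The band stores what the coarse level fails to predict.
        bands.append([a - b for a, b in zip(current, predicted)])
        current = coarse
    return bands, current


def reconstruct(bands, residual):
    """Invert build_laplacian_pyramid by adding each band back, coarsest first."""
    current = list(residual)
    for band in reversed(bands):
        predicted = expand(current, len(band))
        current = [p + d for p, d in zip(predicted, band)]
    return current
```

Because each band stores exactly the prediction error of the next coarser level, reconstruction is exact. For an 8-sample signal and two levels the pyramid holds 8 + 4 + 2 = 14 coefficients, so the representation is redundant, which is consistent with the abstract describing the expansion as a frame.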

Book•10.1007/978-1-4615-1529-6•

Foundations of Image Understanding

Larry S. Davis

1 Oct 2001

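TL;DR: An edited volume whose chapters cover digital geometry and topology, fuzzy mathematics, picture languages, parallel image processing, object representations, texture classification and segmentation, relaxation labeling, a pyramid framework for real-time computer vision, computational models of human vision, and volumetric scene reconstruction.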

Abstract: Preface. Contributing Authors. 1. Summation A. Rosenfeld. 2. Digital Geometry - The Birth of a New Discipline R. Klette. 3. Digital Topology T.Y. Kong. 4. Fuzzy Mathematics J.N. Mordeson. 5. Picture Languages A. Nakamura. 6. Parallel Image Processing A.Y. Wu. 7. Object Representations H. Samet. 8. Texture Classification and Segmentation R. Chellappa, B.S. Manjunath. 9. Edge Measures Using Similarity Regions M.K. Singh, N. Ahuja. 10. Relaxation Labeling: 25 Years and Still Iterating S.W. Zucker. 11. From a Robust Hierarchy to a Hierarchy of Robustness P. Meer. 12. A Pyramid Framework for Real-Time Computer Vision P.J. Burt. 13. On the Computational Modeling of Human Vision J. Beck. 14. Statistics Explains Geometrical Optical Illusions C. Fermuller, Y. Aloimonos. 15. Optics for OmniStereo Imaging Y. Pritch, et al. 16. Volumetric Scene Reconstruction from Multiple Views C.R. Dyer. Index.

96 citations

Journal Article•10.1016/S0030-4018(01)01462-6•

Image representation and compression with the fractional Fourier transform

I Şamil Yetik¹, M. Alper Kutay², Haldun M. Ozaktas³•Institutions (3)

University of Illinois at Chicago¹, Scientific and Technological Research Council of Turkey², Bilkent University³

01 Jan 2001-Optics Communications

TL;DR: The results presented correspond to the basic method without any refinement or combination with other techniques, suggesting that the approach may hold promise for future development.

47 citations

Journal Article•10.1016/S0164-1212(00)00095-9•

A fast content-based indexing and retrieval technique by the shape information in large image database

Dong-Ho Lee¹, Hyoung-Joo Kim¹•Institutions (1)

Seoul National University¹

01 Mar 2001-Journal of Systems and Software

TL;DR: An efficient content-based image retrieval (CBIR) system which employs the shape information of images to facilitate the retrieval process and it is shown that the image indexing method supports faster retrieval than other multi-dimensional indexing methods such as the R*-tree.

32 citations

Journal Article•10.1142/S1469026801000342•

Learning iterative image reconstruction in the neural abstraction pyramid

Sven Behnke¹•Institutions (1)

Free University of Berlin¹

01 Dec 2001-International Journal of Computational Intelligence and Applications

TL;DR: This work proposes to use recurrent neural networks for both analysis and synthesis of image reconstruction, which makes it possible to use partial results as context information to resolve ambiguities.

Abstract: Successful image reconstruction requires the recognition of a scene and the generation of a clean image of that scene. We propose to use recurrent neural networks for both analysis and synthesis. The networks have a hierarchical architecture that represents images in multiple scales with different degrees of abstraction. The mapping between these representations is mediated by a local connection structure. We supply the networks with degraded images and train them to reconstruct the originals iteratively. This iterative reconstruction makes it possible to use partial results as context information to resolve ambiguities. We demonstrate the power of the approach using three examples: superresolution, fill-in of occluded parts, and noise removal/contrast enhancement. We also reconstruct images from sequences of degraded images.

28 citations

Patent•

Apparatus and method for generating a three-dimensional representation from a two-dimensional image

John D. Ives, Timothy Ridgewood Parr

8 Mar 2001

TL;DR: An apparatus and method are described for generating or obtaining a three-dimensional representation, and in particular a three-dimensional image, from a two-dimensional image.

Abstract: The present invention pertains to an apparatus and method for generating and/or for obtaining a three-dimensional representation from a two-dimensional image and, in particular, to an apparatus and method for generating a three-dimensional image from the two-dimensional image.

27 citations

Proceedings Article•10.1109/ICIP.2001.958258•

A fast algorithm for accurate content-adaptive mesh generation

Yongyi Yang¹, Miles N. Wernick, Jovan G. Brankov•Institutions (1)

Illinois Institute of Technology¹

7 Oct 2001

TL;DR: A theoretical basis for a computationally efficient approach to content-adaptive mesh generation used for image representation is provided, which leads to an improved version of the algorithm.

Abstract: Previously, we proposed a computationally efficient approach to content-adaptive mesh generation used for image representation (see Lee, J. et al., IEEE Int. Conf. Image Proc., 2000). We now provide a theoretical basis for that method, which leads to an improved version of the algorithm. An error bound is derived for a mesh representation of an image based on the theory of function interpolation. From this result, a more accurate scheme is proposed for placement of mesh elements in the image domain according to the image content. Experimental results, compared to other methods, show that a highly accurate image representation can be obtained at extremely low computational cost by the proposed technique.

24 citations

Journal Article•10.1016/S0167-8655(00)00063-5•

Soft image segmentation by weighted linked pyramid

David Prewer¹, Les Kitchen¹•Institutions (1)

University of Melbourne¹

01 Feb 2001-Pattern Recognition Letters

TL;DR: It is proposed that soft segmentation is a more natural way to segment digital image data than crisp segmentation, and one method of deriving a soft segmentation from a weighted linked pyramid algorithm is shown.

24 citations

Proceedings Article•10.1109/DFUA.2001.985909•

Quality assessment of decision-driven pyramid-based fusion of high resolution multispectral with panchromatic image data

Bruno Aiazzi, Luciano Alparone, Stefano Baronti, Ivan Pippi

8 Nov 2001

TL;DR: In this article, the generalized Laplacian pyramid is used to fuse multispectral data with high-resolution panchromatic images, and a decision based on thresholding the local CC is utilized to check the physical congruence of fusion, while the ratio of local RMSs between the two images provides a space-varying gain factor by which the injected highpass contribution is equalized.

Abstract: This work presents a general and formal solution to the problem of fusion of multispectral data with high-resolution panchromatic images. The method relies on the generalized Laplacian pyramid, which is an oversampled structure obtained by subtracting from an image its lowpass version, and selectively performs spatial-frequency spectrum substitution from one image to another. The novelty of the present work is that a decision based on thresholding the local CC is utilized to check the physical congruence of fusion, while the ratio of local RMSs between the two images provides a space-varying gain factor by which the injected highpass contribution is equalized. Since the pyramid decomposition is not critically subsampled, possible impairments in the fused images, due to missing cancellation of aliasing terms, are avoided. Quantitative results are presented and discussed on simulated SPOT 5 data of an urban area (2.5 m P, 10 m XS) obtained from the MIVIS airborne imaging spectrometer.

23 citations
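
The injection rule summarized in the abstract above (a thresholded local CC that gates the fusion, and a ratio of local RMSs that scales the injected highpass detail) can be sketched in one dimension. This is an illustration under stated assumptions, not the authors' implementation: CC is read here as the Pearson correlation coefficient, RMS as the root mean square of the samples in a sliding window, and the window radius, the 0.5 threshold, and every function and parameter name are hypothetical. The sketch also assumes `ms` has already been resampled to the panchromatic grid, `pan_low` is a lowpass version of the panchromatic signal, and `pan_detail` is the panchromatic signal minus `pan_low`.

```python
import math


def window(values, center, radius):
    """Samples within `radius` of `center`, clipped at the signal borders."""
    return values[max(0, center - radius): center + radius + 1]


def correlation(a, b):
    """Pearson correlation of two equal-length windows (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def rms(a):
    """Root mean square of the samples in a window."""
    return math.sqrt(sum(x * x for x in a) / len(a))


def inject_detail(ms, pan_low, pan_detail, radius=2, cc_threshold=0.5):
    """Add gain-equalized panchromatic detail where the two signals agree locally."""
    fused = []
    for i in range(len(ms)):
        w_ms = window(ms, i, radius)
        w_pan = window(pan_low, i, radius)
        gain = 0.0  # no injection unless the local correlation test passes
        if correlation(w_ms, w_pan) >= cc_threshold and rms(w_pan) > 0:
            gain = rms(w_ms) / rms(w_pan)  # space-varying gain factor
        fused.append(ms[i] + gain * pan_detail[i])
    return fused
```

Where the two signals are locally anticorrelated the gain stays at zero and the multispectral sample passes through unchanged, which is the role the abstract assigns to the thresholding decision.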

Patent•

Encoding method for compression of video sequence

Felts B, Pesquet-Popescu B, Bottreau

14 Nov 2001

TL;DR: An encoding method based on the SPIHT algorithm transforms the picture elements (pixels) of each group of frames into transform coefficients constituting a hierarchical pyramid, in which a spatio-temporal orientation tree is rooted at the pixels of the approximation subband resulting from the 3D wavelet transform.

Abstract: The invention relates to an encoding method for the compression of a video sequence including successive frames organized in groups of frames. Each frame is decomposed by means of a three-dimensional (3D) wavelet transform leading to a given number of successive resolution levels. This method is based on the SPIHT algorithm that transforms the original set of picture elements (pixels) of each group of frames into transform coefficients constituting a hierarchical pyramid in which a spatio-temporal orientation tree (whose roots are formed with the pixels of the approximation subband resulting from the 3D wavelet transform, and in which the offspring of each of these pixels is formed with the pixels of the higher subbands corresponding to the image volume defined by these root pixels) defines the spatio-temporal relationship. According to the invention, a full exploration of the subbands is performed during the initialization step of the process, and the set significance level of each subtree in the root pixels is calculated and stored. In the sorting step of the process, a comparison between said set significance level and the current significance level n replaces the call to the function that computes the significance of a tree relative to n.

Journal Article•10.1007/BF03190354•

The skeleton structure--an improved compression algorithm with perfect reconstruction.

Dragos Nicolae Vizireanu¹, C. Pirnog¹, V. Lazarescu¹, A. Vizireanu¹•Institutions (1)

Politehnica University of Bucharest¹

01 Jun 2001-Journal of Digital Imaging

TL;DR: An improved morphological image representation that can be used for image compression, obtaining very high compression rates is presented.

Abstract: This article presents an improved morphological image representation that can be used for image compression, obtaining very high compression rates. The new image representation described in this work is called skeleton structure and is a natural extension of the morphologic structure. This article will present its theoretical background, introduce the new representation, and show some application examples.

Proceedings Article•10.1109/IJCNN.2001.938470•

A multilayer RBF network and its supervised learning

Jinhui Chao¹, M. Hoshino¹, T. Kitamura¹, T. Masuda¹•Institutions (1)

Chuo University¹

15 Jul 2001

TL;DR: Simulations show higher representation and generalization capability of the proposed networks compared with RBF networks and multilayer networks with sigmoid activation functions.

Abstract: A general form of multilayer RBF networks is introduced. Complete supervised training rules for parameters are also presented. To achieve global convergence we apply a global optimization algorithm called the magic-brush method. This network can be naturally extended into a pyramid topology. Simulations show higher representation and generalization capability of the proposed networks compared with RBF networks and multilayer networks with sigmoid activation functions.

Journal Article•10.1016/S0167-8655(01)00106-4•

New progressive image transmission based on quadtree and shading approach with resolution control

Kuo-Liang Chung¹, Shou-Yi Tseng¹•Institutions (1)

National Taiwan University of Science and Technology¹

01 Dec 2001-Pattern Recognition Letters

TL;DR: Experimental results reveal that, at similar peak signal-to-noise ratio (PSNR) and bits per pixel (bpp), the proposed PIT scheme has a better feature-preserving capability than the reduced-difference pyramid PIT scheme.

Journal Article•10.1016/S0167-8655(00)00136-7•

An improved search algorithm for vector quantization using mean pyramid structure

Su-Juan Lin¹, Kuo-Liang Chung¹, Lung-Chun Chang¹•Institutions (1)

National Taiwan University of Science and Technology¹

01 Mar 2001-Pattern Recognition Letters

TL;DR: An improved search algorithm for vector quantization using mean pyramid structure and the range search approach is presented, which reduces search times and improves the previous result by Lee and Chen.
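
One common way a mean pyramid speeds up codeword search is to reject a codeword from its block means before computing the full distance. The sketch below illustrates only that generic idea and is not the algorithm of this paper or of Lee and Chen: the function names are hypothetical, the range search step is omitted, and vector lengths are assumed to be powers of two. It relies on the inequality that, for blocks of m samples, m times the sum of squared differences of block means never exceeds the squared Euclidean distance.

```python
def mean_pyramid(vector):
    """Levels from the single overall mean (coarsest) down to the vector itself."""
    levels = [list(vector)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([(prev[i] + prev[i + 1]) / 2.0 for i in range(0, len(prev), 2)])
    return levels[::-1]


def nearest_codeword(query, codebook):
    """Return (index, squared distance, number of full distance evaluations)."""
    k = len(query)
    q_pyr = mean_pyramid(query)
    pyramids = [mean_pyramid(c) for c in codebook]
    best_index, best_dist = -1, float("inf")
    full_evaluations = 0
    for index, c_pyr in enumerate(pyramids):
        rejected = False
        # Test the coarse levels first; the finest level is the vector itself.
        for q_level, c_level in zip(q_pyr[:-1], c_pyr[:-1]):
            block = k // len(q_level)  # samples averaged into each entry
            bound = block * sum((a - b) ** 2 for a, b in zip(q_level, c_level))
            if bound >= best_dist:  # lower bound already too large
                rejected = True
                break
        if rejected:
            continue
        full_evaluations += 1
        dist = sum((a - b) ** 2 for a, b in zip(query, codebook[index]))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist, full_evaluations
```

The third return value counts how many codewords needed a full distance computation, which gives a rough measure of how much work the coarse-level tests save on a given codebook.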

Book Chapter•10.1007/3-540-47778-0_37•

Robust Multi-scale Non-rigid Registration of 3D Ultrasound Images

Ioannis Pratikakis, Christian Barillot, Pierre Hellier

07 Jul 2001-Lecture Notes in Computer Science

TL;DR: The minimization scheme of an automatic 3D non-rigid registration method is embedded in a multi-scale framework, and a coarse-to-fine focusing strategy is introduced that improves the accuracy of the registration process.

Abstract: In this paper, we embed the minimization scheme of an automatic 3D non-rigid registration method in a multi-scale framework. The initial model formulation was expressed as a robust multiresolution and multigrid minimization scheme. At the finest level of the multiresolution pyramid, we introduce a focusing strategy from coarse-to-fine scales which leads to an improvement of the accuracy in the registration process. A focusing strategy has been tested for a linear and a non-linear scale-space. Results on 3D Ultrasound images are discussed.

Proceedings Article•10.1117/12.421097•

Optimizing multiresolution pixel-level image fusion

Vladimir Petrovic, Costas Xydeas¹•Institutions (1)

Lancaster University¹

22 Mar 2001-Proceedings of SPIE

TL;DR: Fusion systems based on derivatives of Gaussian low-pass pyramid and the Discrete Wavelet transform are examined and their performances versus decomposition/selection parameters are defined and compared.

Abstract: A number of pixel-level image fusion schemes have been proposed in the past which combine registered input sensor images into a single fused output image. The two general objectives that underpin the operations of these schemes are a) the transfer of all visually important information from input images into a fused image and b) the minimization of undesirable distortions and artifacts which may be generated in the fused image. Fusion is usually achieved by i) the decomposition of input images into representations of their spectral bands and ii) a selection process which transfers information from input bands to yield the required representation of a single fused output image. Furthermore, decomposition is often based on multi-resolution pyramidal representations and the selection process operates on corresponding input image pyramidal levels using selection templates which focus on local spectral characteristics. The performance of such a multi-resolution pixel-level image fusion system depends primarily on the actual decomposition and selection algorithms used. Thus for a given decomposition/selection arrangement, fusion performance is dependent on the pyramid size (i.e. number of levels) and template size. Pyramid and template sizes on the other hand greatly influence the system's computational complexity. This paper is concerned with the performance optimization/characterization of several multi-resolution image fusion schemes in general, and with performance/complexity trade-offs in particular. Performance is measured using a subjectively meaningful, objective fusion metric which has been proposed recently by the authors and which is based on the preservation of image edge information. Thus fusion systems based on derivatives of the Gaussian low-pass pyramid and the discrete wavelet transform are examined and their performances versus decomposition/selection parameters are defined and compared. The performance/algorithmic complexity results presented for these multi-resolution fusion systems highlight clearly their strengths and weaknesses.

Journal Article•10.1016/S0167-8655(01)00002-2•

Shape and topology preserving multi-valued image pyramids for multi-resolution skeletonization

Gunilla Borgefors, Giuliana Ramella¹, Gabriella Sanniti di Baja¹•Institutions (1)

ARCO¹

01 May 2001-Pattern Recognition Letters

TL;DR: Starting from a binary digital image, a multi-valued pyramid is built and suitably treated, so that shape and topology properties of the pattern are preserved satisfactorily at all resolution levels.

Book Chapter•10.1007/3-540-45453-5_133•

Adaptive Processing of Tree-Structure Image Representation

Zhiyong Wang¹, Zheru Chi¹, Dagan Feng¹, Siu-Yeung Cho¹•Institutions (1)

Hong Kong Polytechnic University¹

24 Oct 2001

TL;DR: In this paper, a segmentation-free tree-structure image representation is presented and a back-propagation through structure (BPTS) algorithm is adopted in order to learn the structure representation.

Abstract: Much research on image analysis and processing has been carried out for the last few decades. However, it is still challenging to represent the image contents effectively and satisfactorily. In this paper, a segmentation-free tree-structure image representation is presented. In order to learn the structure representation, a back-propagation through structure (BPTS) algorithm is adopted. Experiments on plant image classification and retrieval refining using only six visual features were conducted on a plant image database and a natural scene image database, respectively. Encouraging results have been achieved.

Journal Article•10.1016/S0031-3203(00)00008-X•

An adaptive algorithm for conversion from quadtree to chain codes

Frank Y. Shih¹, Wai Tak Wong¹•Institutions (1)

New Jersey Institute of Technology¹

01 Mar 2001-Pattern Recognition

TL;DR: An adaptive algorithm is presented for converting the quadtree representation of a binary image to its chain code representation by constructing the chain codes of the resulting quadtree of the Boolean operation of two quadtrees by re-using the original chain codes.

Book Chapter•10.1201/B14024-7•

System Integration and Signal/Image Processing

Joseph A. Izatt, Andrew M. Rollins, Rujchai Ung-arunyawee, Siavash Yazdanfar, Manish D. Kulkarni, and 1 more

2 Nov 2001

Proceedings Article•10.1145/500141.500189•

An image watermarking technique using pyramid transform

Qiang Cheng¹, Thomas S. Huang¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

1 Oct 2001

TL;DR: An image watermarking technique based on pyramid transforms that has high imperceptibility, good robustness, and accurate detection and can be applied to copyright notification, enforcement, and fingerprinting is proposed.

Abstract: An image watermarking technique based on pyramid transforms is proposed. An arbitrary binary pattern is formed into an effective hypothesized pattern and transmitted as a watermark. Multiresolution pyramid transforms are applied to host images, whose characteristics are exploited to embed the watermark. The detector is designed to be effective to a wide range of original signal sources and noise sources. The scheme is designed to achieve efficient trade-offs between perceptual invisibility, robustness and trustworthy detection. The experiments demonstrate that the proposed technique has high imperceptibility, good robustness, and accurate detection. It can be applied to copyright notification, enforcement, and fingerprinting.

Proceedings Article•10.1109/ICIP.2001.958945•

Multispectral image retrieval using vector quantization

Toshio Uchiyama, Masahiro Yamaguchi, Nagaaki Ohyama, Naoki Mukawa, H. Kaneko, and 1 more

7 Oct 2001

TL;DR: An efficient feature representation and a novel method for the retrieval of images by quantizing each image adaptively, based on vector quantization are presented.

Abstract: A novel method for multispectral image retrieval is presented. This method uses a representation of image features based on vector quantization. Feature representation is important for image retrieval, but there are difficulties in applying conventional histogram-based representations to multispectral images. We developed an efficient feature representation and a novel method for the retrieval of images by quantizing each image adaptively.

Journal Article•10.1006/RTIM.2001.0268•

A Pyramid Approach to Motion Tracking

Jason Z. Zhang¹, Q. M. Jonathan Wu¹•Institutions (1)

National Research Council¹

01 Dec 2001-Real-time Imaging

TL;DR: To demonstrate the superiority of the multiresolution tracking algorithm in connection with parallel computation, the paper proposes a scheme for mapping the tracking algorithm onto a Transputer-based pyramidal parallel computing structure.

Abstract: This paper presents a multiresolution approach to visual motion tracking. In the approach, the foveation mechanism of the human visual system is used to model the multiresolution information perception algorithms of a Transputer-based pyramid visual tracking system. The video images of a moving target are transformed by a Gaussian pyramid generation algorithm into pyramidal data structures, each of which consists of multiple image layers with different resolutions. The tracking of a moving target over an image sequence is accomplished by performing a foveal search that is based on an iterative intensity pattern correlation along the multiple resolution levels of the Gaussian pyramids of two successive images. Analyses are given as to the efficiency and accuracy of our tracking algorithm, showing that the algorithm is over 160 times faster than conventional mono-resolution tracking methods, with the tracking error within one pixel. To demonstrate the superiority of the multiresolution tracking algorithm in connection with parallel computation, a scheme for mapping the tracking algorithm onto a Transputer-based pyramidal parallel computing structure is proposed in the paper. Experimental results demonstrate good performance of the proposed approach.
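
The coarse-to-fine search over the Gaussian pyramids of two successive images can be sketched in one dimension. This is a minimal illustration under several assumptions, not the paper's tracker: the [1, 2, 1]/4 blur kernel, the three-level default, the coarse search range, and all function names are hypothetical, and a mean absolute difference stands in for the intensity pattern correlation mentioned in the abstract.

```python
def smooth_and_halve(row):
    """One pyramid step: [1, 2, 1]/4 blur with edge clamping, then keep every 2nd sample."""
    n = len(row)
    blurred = [
        (row[max(i - 1, 0)] + 2 * row[i] + row[min(i + 1, n - 1)]) / 4.0
        for i in range(n)
    ]
    return blurred[::2]


def gaussian_pyramid(row, levels):
    """Return `levels` versions of `row`; index 0 is the finest resolution."""
    pyramid = [list(row)]
    for _ in range(levels - 1):
        pyramid.append(smooth_and_halve(pyramid[-1]))
    return pyramid


def mismatch(prev, curr, shift):
    """Mean absolute difference between prev[i] and curr[i + shift] over the overlap."""
    total, count = 0.0, 0
    for i in range(len(prev)):
        j = i + shift
        if 0 <= j < len(curr):
            total += abs(prev[i] - curr[j])
            count += 1
    return total / count if count else float("inf")


def coarse_to_fine_shift(prev, curr, levels=3, coarse_range=2):
    """Estimate the displacement of `curr` relative to `prev`."""
    pyr_prev = gaussian_pyramid(prev, levels)
    pyr_curr = gaussian_pyramid(curr, levels)
    # Exhaustive search only at the coarsest level, where shifts are small.
    shift = min(
        range(-coarse_range, coarse_range + 1),
        key=lambda s: mismatch(pyr_prev[-1], pyr_curr[-1], s),
    )
    # At each finer level, double the estimate and refine it by one sample.
    for level in range(levels - 2, -1, -1):
        shift *= 2
        shift = min(
            (shift - 1, shift, shift + 1),
            key=lambda s: mismatch(pyr_prev[level], pyr_curr[level], s),
        )
    return shift
```

With three levels and a coarse range of 2, the search evaluates 5 + 3 + 3 = 11 candidate shifts while covering roughly the same displacements as a full-resolution search over shifts from -11 to 11, which hints at where the speed-up of a multiresolution search comes from.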

Patent•

Method, system, and software for signal processing using pyramidal decomposition

Albert D. Edgar¹•Institutions (1)

Eastman Kodak Company¹

5 Feb 2001

TL;DR: A base signal is recursively decomposed and modified for a desired number of pyramid levels; at each level, the decomposed signal from the previous level is modified to improve one or more signal components or characteristics.

Abstract: A method, system, and software are disclosed for improving the quality of a signal. A base signal is recursively decomposed and modified for a desired number of pyramid levels. At each level, the decomposed signal from the previous level is modified to improve one or more signal components or characteristics. The modified signal from a given level is then decomposed for the next level of the pyramidal decomposition for each pyramid level. Starting at the second to last level of the pyramidal decomposition, the improved signal of the last pyramid level is recomposed and then combined with one or more signals from the current pyramid level, resulting in an improved signal for the current level. The recomposition and combination of the improved signal of the previous level occurs for each level until the top, or level 0, of the pyramidal decomposition is reached. The improved base signal may or may not be combined with the original base signal, depending on the desired outcome. The present invention finds particular application in photography and digital film processing, whereby the illustrated method may be used to improve image quality.

Patent•

Multiplierless pyramid filter

Tinku Acharya¹•Institutions (1)

Intel¹

27 Dec 2001

TL;DR: A multiplierless pyramid filter is described comprising a sequence of scalable cascaded units, each comprising a delay unit and three adders coupled to produce a higher order pyramidally filtered output signal sample stream and state variable sample stream.

Abstract: A multiplierless pyramid filter is described comprising a sequence of scalable cascaded units, each of said units comprising a delay unit and three adders, said delay unit and adders being coupled to produce a higher order pyramidally filtered output signal sample stream and state variable sample stream from an input signal sample stream and a lower order pyramidally filtered output signal sample stream and state variable signal stream.

Proceedings Article•10.1109/ICIP.2001.958466•

Multiscale image processing using normal triangulated meshes

Maarten Jansen¹, Hyeokho Choi¹, S. Lavu¹, Richard G. Baraniuk²•Institutions (2)

Rice University¹, Houston Methodist Hospital²

7 Oct 2001

TL;DR: This work proposes an image representation and processing framework using a multiscale triangulation of the grayscale function, and demonstrates the approximation performance of the normal mesh representation through mathematical analyses for simple functions and simulations for real images.

Abstract: Multiresolution triangulation meshes are widely used in computer graphics for 3D modeling of shapes. We propose an image representation and processing framework using a multiscale triangulation of the grayscale function. Triangles have the potential of approximating edges better than the blocky structures of tensor-product wavelets. Among the many possible triangulation schemes, normal meshes are natural for efficiently representing singularities in image data, thanks to their adaptivity to the smoothness of the modeled image. Our non-linear, multiscale image decomposition algorithm, based on this subdivision scheme, takes edges into account in a way that is closely related to wedgelets and curvelets. The highly adaptive property of the normal mesh construction provides a very efficient representation of images, which potentially outperforms standard wavelet transforms. We demonstrate the approximation performance of the normal mesh representation through mathematical analyses for simple functions and simulations for real images.

Proceedings Article•10.1109/ICIP.2001.958645•

Multiresolution Gaussian mixture models for visual motion estimation

Roland Wilson¹, Andrew Calway•Institutions (1)

University of Warwick¹

7 Oct 2001

TL;DR: A new generalisation of scale-space and pyramids, which combines statistical modelling with a spatial representation using the familiar concept of multiple resolutions, but applied to a Gaussian mixture representation of the image - hence the title MGMM.

Abstract: This paper introduces a new generalisation of scale-space and pyramids, which combines statistical modelling with a spatial representation. The representation uses the familiar concept of multiple resolutions, but applied to a Gaussian mixture representation of the image - hence the title MGMM. It is shown that MGMM can approximate any probability density and can adapt to smooth motions. After a presentation of the theory, it is shown how MGMM can be applied to the estimation of visual motion.

Patent•

Vector quantization of images

William Paul Cockshott¹•Institutions (1)

University of Glasgow¹

6 Mar 2001

TL;DR: A method of compressing an image is described in which digital data signals in a 2D image are formed into an image data pyramid with a number of layers, and each layer is processed to give a compressed encoding in an ordered list.

Abstract: A method of compressing an image is described in which digital data signals in a 2-dimensional image are formed into an image data pyramid with a number of layers, and each layer is processed to give a compressed encoding in an ordered list. The encoding with the largest quality gain factor is selected first and added to a compressed representation of the data array. This is repeated for the next largest gain factor and so on until a predetermined maximum is reached. Because each layer of the image data pyramid corresponds to a different frequency band, the vector quantizations of these layers will only minimally interfere with one another. This allows a simple ordering of all possible gain contributions made by the compressed encodings to the compressed representation. This in turn allows a straightforward selection of the compressed encodings having the largest quality gain factors, for compiling the compressed representation of the image.

Proceedings Article•10.1109/ISCAS.2001.921051•

Content adaptive motion estimation for mobile video encoders

A. Ahmed¹, S.K. Nandy, P. Sathya•Institutions (1)

Indian Institute of Science¹

6 May 2001

TL;DR: A block matching motion estimation algorithm whose computations are content complexity adaptive, made macroblock adaptive by dynamically varying the number of candidate motion vectors passed to lower levels, depending on the frequency characteristics of the macroblock being matched and the complexity in the sequence for such characteristics.

Abstract: Power consumption has emerged as an important constraint in the design of mobile video encoders. As motion estimation accounts for the majority of the total computations involved in video encoding, the algorithm and architecture used affect the quality and power levels of the final solution. In this paper, we present a block matching motion estimation algorithm whose computations are content complexity adaptive. The basic framework used is the multi-resolution mean pyramid technique. The algorithm is made macroblock adaptive by dynamically varying the number of candidate motion vectors passed to lower levels, depending on the frequency characteristics of the macroblock being matched and the complexity in the sequence for such characteristics. We use the concept of a deviation pyramid in order to estimate the macroblock frequency characteristics. Simulation results show that for typical videophony sequences, the algorithm reduces computational complexity by a factor ranging from 15.5 to 74.0, while maintaining PSNR values close to that obtained by using the full-search block matching algorithm. Simple operations are used in the algorithm to ensure applicability of the proposed algorithm for hardware implementation.
