TL;DR: A VLSI chip, called PYR, is developed to perform the standard filter and resampling operations required in pyramid and inverse pyramid transforms for these applications.
Abstract: Advanced techniques in image processing and computer vision increasingly require that image data be represented at multiple resolutions and at multiple sample rates. Application areas for such pyramid image representations include image compression, image enhancement, motion analysis, and object recognition. We have developed a VLSI chip, called PYR, to perform the standard filter and resampling operations required in pyramid and inverse pyramid transforms for these applications. The PYR chip processes image samples sequentially, in raster scan format, so it is suited for pipeline architectures. The user can choose from a set of standard filters, through software control, to construct Gaussian, Laplacian, Subband, and related pyramid structures. A unique feature of the design is that it includes timing signals that are passed with the image data. These signals coordinate successive processing steps in a pipeline system as image sizes and sample rates change. The chip also includes circuits for edge extension and image addition, and it can be run in “spread tap” mode to provide twice the standard sample density. The PYR chip is implemented in standard cell technology. At a clock rate of 15 MHz, a single chip can simultaneously construct a Gaussian and a Laplacian pyramid from a 512 by 480 image in 22.7 msec (44 frames/second).
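The filter-and-resample operations named in this abstract can be illustrated in software. The following is a minimal sketch, not the chip's implementation: the 5-tap binomial kernel, the reflected border handling, and all function names are assumptions chosen for illustration, since the abstract does not list the filters PYR provides.

```python
# 5-tap binomial low-pass kernel (an assumed choice; the chip's actual
# filter set is not listed in the abstract).
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def reflect(i, n):
    """Mirror an out-of-range index back into [0, n): simple edge extension."""
    if n == 1:
        return 0
    while i < 0 or i >= n:
        i = -i if i < 0 else 2 * (n - 1) - i
    return i


def blur(img):
    """Separable 5-tap low-pass filter: rows first, then columns."""
    h, w = len(img), len(img[0])
    rows = [[sum(KERNEL[k] * img[y][reflect(x + k - 2, w)] for k in range(5))
             for x in range(w)] for y in range(h)]
    return [[sum(KERNEL[k] * rows[reflect(y + k - 2, h)][x] for k in range(5))
             for x in range(w)] for y in range(h)]


def reduce_level(img):
    """Low-pass filter, then keep every second sample in each direction."""
    return [row[::2] for row in blur(img)[::2]]


def expand_level(img, h, w):
    """Upsample to (h, w) by pixel replication, then smooth."""
    up = [[img[y // 2][x // 2] for x in range(w)] for y in range(h)]
    return blur(up)


def build_pyramids(img, levels):
    """Return (gaussian, laplacian) pyramids, finest level first."""
    gaussian = [img]
    for _ in range(levels - 1):
        gaussian.append(reduce_level(gaussian[-1]))
    laplacian = []
    for fine, coarse in zip(gaussian, gaussian[1:]):
        pred = expand_level(coarse, len(fine), len(fine[0]))
        laplacian.append([[f - p for f, p in zip(fr, pr)]
                          for fr, pr in zip(fine, pred)])
    laplacian.append(gaussian[-1])  # coarsest level is stored unchanged
    return gaussian, laplacian


def reconstruct(laplacian):
    """Inverse pyramid transform: expand and add, coarsest to finest."""
    img = laplacian[-1]
    for band in reversed(laplacian[:-1]):
        pred = expand_level(img, len(band), len(band[0]))
        img = [[b + p for b, p in zip(br, pr)] for br, pr in zip(band, pred)]
    return img
```

Because each Laplacian level stores the difference between a Gaussian level and the expanded next-coarser level, `reconstruct` recovers the input up to floating-point rounding, whichever smoothing kernel is assumed.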
TL;DR: The design of filter kernels having specified radial and angular frequency responses based on combined optimization and frequency sampling is used to generate small-radius, low-pass, and edge-detection kernels for multiresolution pyramids.
Abstract: This paper deals with the design of filter kernels having specified radial and angular frequency responses based on combined optimization and frequency sampling. This is used to generate small-radius, low-pass, and edge-detection kernels for multiresolution pyramids. The performance of the new kernels in estimating orientation is shown to be significantly better than that of other commonly used pyramid kernels.
TL;DR: A new 'reduced-mean' pyramid data structure is proposed, which gives more accurate motion vectors than conventional techniques without the transmission of extra data, and is also more efficient in the presence of large amounts of motion.
Abstract: The two current mainstream motion compensation techniques, pel-recursive and block matching algorithms, are first reviewed, and experimental results and comments are given. Estimation and motion compensation techniques using image pyramids are then introduced, and a new ‘reduced-mean’ pyramid data structure is proposed, which gives more accurate motion vectors than conventional techniques without the transmission of extra data. Results obtained by the use of our pyramid algorithms show smaller motion compensated frame differences than those of pel-recursive and block matching algorithms, and the pyramid algorithm is also more efficient in the presence of large amounts of motion.
TL;DR: An algorithm for correspondence matching, which is one of the crucial steps in automatic terrain modeling, is introduced and uses well known pyramid-based data structures, but is novel in its direct application of methods from statistical pattern recognition.
Abstract: Navigation and imagery in the orbit, descent, and landing phases during an interplanetary mission require methods that are able to derive the elevation map of a planetary body using remote sensing tools. The authors propose stereovision techniques for this task. An algorithm for correspondence matching, which is one of the crucial steps in automatic terrain modeling, is introduced. It uses well known pyramid-based data structures, but is novel in its direct application of methods from statistical pattern recognition. Feature vectors for correspondence matching and feature selection techniques are used to find optimal features. These include grey-level statistics (mean, variance) as well as more sophisticated features derived from operators like local frequency and edge gradient or, as an extension, Moravec-, Gabor- or Fourier-features. The applicability of the algorithm in the remote sensing scenario of interplanetary missions is verified using a mockup simulation of the Martian surface.
TL;DR: In this article, a pyramid of normal-vector distributions is constructed from the distribution given by the underlying bump map, and the distributions are represented as sums of a small number of Phong-like spreads of normal vectors.
Abstract: "Bump" mapping is a variant of texture mapping where the texture information is used to alter the surface normal. Current techniques to pre-filter textures all rely on the fact that the texture information can be linearly "factored out" of the shading equation, and therefore can be pre-averaged in some way. This is not the case with bump maps, and those techniques fail to filter them correctly. We propose here a technique to pre-filter bump maps by building a pyramid where each level stores distributions of normal vectors reconstructed from the distribution given by the underlying bump map. The distributions are represented as sums of a small number of Phong-like spreads of normal vectors. The technique, besides allowing an effective and smooth transition between a bump map and a single surface description, gives rise to the concept of a multiple surface, where each point of the single surface is characterized by more than one normal vector. This allows the description of visually complex surfaces by a trivial modification of current local illumination models. When a surface has an underlying microstructure, masking and self-shadowing are important factors in its appearance. Along with the filtering of normals we include the filtering of the masking and self-shadowing information. This is accomplished by computing the limiting angles of visibility and their variance along the two texture axes for the reconstructed distribution of normals. These techniques allow the modeling of any surface whose microstructure we can model geometrically. This includes complex but familiar surfaces such as anisotropic surfaces, many woven cloths, and stochastic surfaces.
TL;DR: A new contour extraction method is based on organization of the image data using perceptual grouping rules and is therefore largely domain independent, allowing the filtering of noisy edges and line segments regardless of their strength.
TL;DR: The algorithm is based on convolutions with simple separable filters and pixel-wise non-linear arithmetic operations, which allow highly parallel implementation, for example on a pyramid machine, yielding real time applications.
TL;DR: This paper describes the design and operation of a new simulation model that models the physical structure, the signal processing, and the visual perception of static displays, to allow optimization of display design parameters through image quality measures.
Abstract: This paper describes the design and operation of a new simulation model for color matrix display development. It models the physical structure, the signal processing, and the visual perception of static displays, to allow optimization of display design parameters through image quality measures. The model is simple, implemented in the Mathematica computer language, and highly modular. Signal processing modules operate on the original image. The hardware modules describe backlights and filters, the pixel shape, and the tiling of the pixels over the display. Small regions of the displayed image can be visualized on a CRT. Visual perception modules assume static foveal images. The image is converted into cone catches and then into luminance, red-green, and blue-yellow images. A Haar transform pyramid separates the three images into spatial frequency and direction-specific channels. The channels are scaled by weights taken from human contrast sensitivity measurements of chromatic and luminance mechanisms at similar frequencies and orientations. Each channel provides a detectability measure. These measures allow the comparison of images displayed on prospective devices and, by that, the optimization of display designs.
TL;DR: An original approach for segmenting still images that is initially decomposed in several levels of different resolution using a Gaussian pyramid and the segmentation is obtained by using a maximum a posteriori criterion.
Abstract: The authors discuss an original approach for segmenting still images. In this approach, the image is initially decomposed in several levels of different resolution. The decomposition that has been chosen is a Gaussian pyramid. At each level of the pyramid, the image is modeled by a compound Gauss-Markov random field and the segmentation is obtained by using a maximum a posteriori criterion. The segmentation is carried out first at the top level of the pyramid. Once a level (l) has been segmented, this segmentation is projected onto the following level below it (l-1). The process is iterated until the segmentation at the bottom level (0) is performed.
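The coarse-to-fine control flow described here (segment the top level, project onto level l-1, repeat down to level 0) can be sketched as follows. This is only an illustration under stated assumptions: each level is taken to have twice the resolution of the one above it, projection is done by copying each label to its children, and `refine` is a hypothetical callback standing in for the maximum a posteriori step, which the abstract does not specify in detail.

```python
def project_labels(labels, h, w):
    """Project a label map onto the next finer level of size (h, w).

    Assumes a factor-2 pyramid: pixel (y, x) inherits the label of its
    parent (y // 2, x // 2).
    """
    return [[labels[y // 2][x // 2] for x in range(w)] for y in range(h)]


def coarse_to_fine(shapes, top_labels, refine):
    """Propagate a segmentation from the top level down to level 0.

    shapes     -- list of (height, width), index 0 = bottom (finest) level
    top_labels -- segmentation already computed at the top level
    refine     -- hypothetical callback (level, initial_labels) -> labels,
                  standing in for the per-level MAP segmentation
    """
    labels = top_labels
    for level in range(len(shapes) - 2, -1, -1):
        h, w = shapes[level]
        labels = refine(level, project_labels(labels, h, w))
    return labels
```

With an identity `refine`, the result is simply the top-level segmentation replicated to full resolution; a real refinement step would adjust labels near region boundaries at each level.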
TL;DR: An algorithm for image partitioning which has performed well on piecewise homogeneous synthetic images is described and it is proved that it has useful asymptotic properties in the limit of infinite statistical scale.
Abstract: We address the problem of scale selection in texture analysis. Two different scale parameters, feature scale and statistical scale, are defined. Statistical scale is the size of the regions used to compute averages. We define the class of homogeneous random functions as a model of texture. A dishomogeneity function is defined and we prove that it has useful asymptotic properties in the limit of infinite statistical scale. We describe an algorithm for image partitioning which has performed well on piecewise homogeneous synthetic images. This algorithm is embedded in a redundant pyramid and does not require any ad-hoc information. It selects the optimal statistical scale at each location in the image.
TL;DR: The Warwick Pyramid Machine as discussed by the authors is an architecture consisting of both SIMD and MIMD parts in a multiple-SIMD (MSIMD) organization which can operate effectively at all levels of the image analysis problem.
Abstract: Real-time image analysis requires the use of massively parallel machines. Conventional parallel machines consist of an array of identical processors organized in either single instruction multiple data (SIMD) or multiple instruction multiple data (MIMD) configurations. Machines of this type generally only operate effectively on parts of the image analysis problem: SIMD on the low level processing and MIMD on the high level processing. In this paper we describe the Warwick Pyramid Machine, an architecture consisting of both SIMD and MIMD parts in a multiple-SIMD (MSIMD) organization which can operate effectively at all levels of the image analysis problem.
TL;DR: Since the consistent edge structures permitted at coarse resolution have a much greater spatial extent than those permitted at fine resolution, they provide more powerful constraints on edge connectivity. This means that noise contamination can be efficiently controlled without the use of excessive filtering and consequent band limitation of genuine high frequency image structure.
Abstract: The authors present a hierarchical extension of the probabilistic relaxation method and demonstrate its application to the multiscale processing of edge information. The basic idea is to utilise interlevel constraints on the evolution of edge structure to locate consistent edge labellings at each descending level of a resolution pyramid. Information concerning the whereabouts of consistent edge structure is passed from one layer of the pyramid to another in the form of a labelled edge map. This information is combined with the raw edge data at the relevant level of the pyramid using a Bayesian extension of the probabilistic relaxation formula of A. Rosenfeld, R.A. Hummel and S.W. Zucker; the result is a fine resolution label interpretation. Since the consistent edge structures permitted at coarse resolution have a much greater spatial extent than those permitted at fine resolution, they provide more powerful constraints on edge connectivity. This means that noise contamination can be efficiently controlled without the use of excessive filtering and consequent band limitation of genuine high frequency image structure.
TL;DR: A database design is presented which integrates multi-scale storage of point, linear and polygonal features, based on the line generalization tree, with a multi-scale surface model based on a Delaunay pyramid.
Abstract: Multiresolution data structures provide a means of retrieving geographical features from a database at levels of detail which are adaptable to different scales of representation. A database design is presented which integrates multi-scale storage of point, linear and polygonal features, based on the line generalization tree, with a multi-scale surface model based on the Delaunay pyramid. The constituent vertices of topologically-structured geographical features are thus distributed between the triangulated levels of a Delaunay pyramid in which triangle edges are constrained to follow those features at differing degrees of generalization. Efficient locational access is achieved by imposing a spatial index on each level of the pyramid.
TL;DR: It is demonstrated that this Wiener-matrix filter can significantly improve the fidelity and visual quality produced by conventional image reconstruction, and the extent of this improvement depends on the design of the image-gathering device.
TL;DR: A multiresolution image representation is proposed as a basis for constructing an approximation to an original image based on adaptive finite elements, a technique used in applied mathematics to solve numerically partial differential equations while preserving important features of the solution at different scales.
Abstract: A multiresolution image representation is proposed as a basis for constructing an approximation to an original image. The method is based on adaptive finite elements, a technique used in applied mathematics to solve numerically partial differential equations while preserving important features of the solution at different scales. Theory and experiments suggest that adaptive finite elements is a natural and computationally-powerful approach to image approximation problems. The particular representation is based on hierarchical finite elements. A multiresolution algorithm computes the solution to the approximation problem in O(N) time on a sequential machine and in O(log N) time on a single-instruction, multiple-data, fine-grain parallel architecture, where N is the number of pixels in the image. Applications to the problems of image compression and restoration are given.
TL;DR: An approach for characterizing the properties of the basis functions of the Gabor representation in the context of oversampling is presented, based on the concept of frames and utilizes the Piecewise Zak Transform (PZT).
Abstract: An approach for characterizing the properties of the basis functions of the Gabor representation in the context of oversampling is presented. The approach is based on the concept of frames and utilizes the Piecewise Zak Transform (PZT). The frame operator associated with the Gabor-type frame, the so-called Weyl-Heisenberg frame, is examined for a rational oversampling rate by representing the frame operator as a matrix-valued function in the PZT domain. Completeness and frame properties of the Gabor representation functions are examined in relation to the properties of the matrix-valued function. The frame bounds are calculated by means of the eigenvalues of the matrix-valued function, and the dual frame, which is used in calculation of the expansion coefficients, is expressed by means of the inverse matrix. 1 INTRODUCTION Wavelets and Gabor-type representation have been found to be useful in effective representation and/or compression of images. The special case of Gaborian pyramid can, as such, be also considered as a wavelet which is most
TL;DR: A multiresolution motion estimation (MRME) algorithm based on the feature representation of image data is developed; it is suitable for real-time VLSI pipeline and parallel processing implementations with large search windows for full search motion compensation.
Abstract: A multiresolution motion estimation (MRME) algorithm based on the feature representation of image data is developed. The sign truncated feature (STF) vector is derived for the feature matching phase of the motion estimation process. MRME can be performed on the STF vectors in full range motion search without the need for a search window control. The MRME algorithm is suitable for real-time VLSI pipeline and parallel processing implementations with large search windows for full search motion compensation. It needs 160% extra frame memory for the feature pyramid buffer. It can be more than ten times faster than conventional pixel-by-pixel intensity matching schemes, if a parallel 20-bit template matching operator is used.
TL;DR: A motion segmentation algorithm is introduced that does coarse-to-fine pyramid-based boundary refinement that attempts to classify the blocks into three classes: inside, border, and outside.
Abstract: A motion segmentation algorithm is introduced. The algorithm is based on the assumption of one coherent moving area (without holes) on a static background. It does coarse-to-fine pyramid-based boundary refinement that attempts to classify the blocks into three classes: inside, border, and outside.
TL;DR: This chapter describes a technique used to solve the problem of accurately locating the positions of eyes within a particular set of sixty images supplied by BT; half of the images could be used as training data, the other half as test data.
Abstract: This chapter describes a technique used to solve the problem of accurately locating the positions of eyes within a particular set of sixty images supplied by BT; half of the images could be used as training data, the other half as test data. The subjects in each image are at the same viewing distance, the faces are roughly in a vertical position, with a face in the centre of the frame looking forward with eyes open and directed at the camera.
TL;DR: This paper shows how to map a two-dimensional linear transformation, which has a straightforward multiresolution realization on a pyramid data-parallel computer, onto a pipeline of simple processors, capable of processing images as large as 1024×1024 pixels.
Abstract: This paper presents an implementation of the Haar transform suitable for VLSI integration. It shows how to map a two-dimensional linear transformation, which has a straightforward multiresolution realization on a pyramid data-parallel computer, onto a pipeline of simple processors. A further simplification of the linear structure leads to an extremely simple implementation based on a two-stage pipeline, capable of processing images as large as 1024×1024 pixels. VLSI simulations with current technologies predict HDTV video rates. Data compression is among the applications that benefit from the new formulation of the transform.
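For readers unfamiliar with the multiresolution structure of the Haar transform, the following is a minimal software sketch of the pyramid form. It is not the paper's pipeline architecture: the averaging normalization (division by 4), the band names, and the function names are assumptions made for illustration.

```python
def haar_level(img):
    """One 2-D Haar step on 2x2 blocks; img needs even height and width.

    Returns the half-resolution average image and three detail images.
    """
    h, w = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * w for _ in range(h)]
    lh = [[0.0] * w for _ in range(h)]
    hl = [[0.0] * w for _ in range(h)]
    hh = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            ll[y][x] = (a + b + c + d) / 4  # block average
            lh[y][x] = (a - b + c - d) / 4  # left minus right
            hl[y][x] = (a + b - c - d) / 4  # top minus bottom
            hh[y][x] = (a - b - c + d) / 4  # diagonal difference
    return ll, (lh, hl, hh)


def inverse_haar_level(ll, details):
    """Exactly undo haar_level."""
    lh, hl, hh = details
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            s, p, q, r = ll[y][x], lh[y][x], hl[y][x], hh[y][x]
            img[2 * y][2 * x] = s + p + q + r
            img[2 * y][2 * x + 1] = s - p + q - r
            img[2 * y + 1][2 * x] = s + p - q - r
            img[2 * y + 1][2 * x + 1] = s - p - q + r
    return img


def haar_pyramid(img, levels):
    """Apply haar_level repeatedly to the average image."""
    bands = []
    for _ in range(levels):
        img, details = haar_level(img)
        bands.append(details)
    return img, bands


def inverse_haar_pyramid(top, bands):
    """Rebuild the full-resolution image from the top level and detail bands."""
    for details in reversed(bands):
        top = inverse_haar_level(top, details)
    return top
```

Each level touches every pixel of the current average image once, and that image shrinks by a factor of four per level, which is what makes the transform attractive for pyramid or pipelined hardware.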
TL;DR: A compact pyramidal representation of the input image is proposed for multiresolution analysis in the feature extraction and classification components of a system for classifying pieces of wood used in the furniture industry.
Abstract: Texture is an important surface characteristic. Many industrial materials such as wood, textile, or paper are best characterized by their texture. Detection of defects occurring on such materials or classification for quality control and matching can be carried out through careful texture analysis. A system for the classification of pieces of wood used in the furniture industry is proposed. This paper is concerned with a neural network implementation of the feature extraction and classification components of the proposed system. Texture appears different depending on the spatial scale at which it is observed. A complete description of a texture thus implies an analysis at several spatial scales. We propose a compact pyramidal representation of the input image for multiresolution analysis. The feature extraction system is implemented on a multilayer artificial neural network. Each level of the pyramid, which is a representation of the input image at a given spatial resolution scale, is mapped into a layer of the neural network. A full resolution texture image is input at the base of the pyramid and a representation of the texture image at multiple resolutions is generated by the feedforward pyramid structure of the neural network. The receptive field of each neuron at a given pyramid level is preprogrammed as a discrete Gaussian low-pass filter. Meaningful characteristics of the textured image must be extracted if a good resolving power of the classifier is to be achieved. Local dominant orientation is the principal feature which is extracted from the textured image. Local edge orientation is computed with a Sobel mask at four orientation angles (multiples of π/4). The resulting intrinsic image, that is, the local dominant orientation image, is fed to the texture classification neural network. The classification network is a three-layer feedforward back-propagation neural network.
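The orientation feature described above (Sobel masks at four angles that are multiples of π/4, with the dominant one kept) can be sketched as follows. The exact masks and the selection rule are not given in the abstract, so the four rotated 3×3 Sobel masks and the "largest absolute response wins" rule below are assumptions.

```python
# Sobel masks rotated in steps of pi/4 (assumed forms; mask k is taken to
# correspond to an intensity change along direction k * pi/4).
MASKS = [
    [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],    # 0
    [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],    # pi/4
    [[1, 2, 1], [0, 0, 0], [-1, -2, -1]],    # pi/2
    [[2, 1, 0], [1, 0, -1], [0, -1, -2]],    # 3*pi/4
]


def dominant_orientation(img, y, x):
    """Return k in {0, 1, 2, 3} for the mask with the largest |response|.

    (y, x) must be an interior pixel so the 3x3 neighbourhood exists.
    """
    responses = []
    for mask in MASKS:
        r = sum(mask[dy][dx] * img[y + dy - 1][x + dx - 1]
                for dy in range(3) for dx in range(3))
        responses.append(abs(r))
    return max(range(4), key=lambda k: responses[k])


def orientation_image(img):
    """Dominant-orientation index for every interior pixel."""
    h, w = len(img), len(img[0])
    return [[dominant_orientation(img, y, x) for x in range(1, w - 1)]
            for y in range(1, h - 1)]
```

In a flat region all four responses are zero and the rule returns index 0 by default, so a practical version would probably also threshold on response magnitude before assigning an orientation.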
TL;DR: A bound on vote scattering has been derived which guides the image subdivision and the adaptive quantization of the parameter space and an accurate Hough transform of low ρ-scattering and high θ-precision has been achieved.
Abstract: This paper presents an accurate line extraction technique — the Hierarchical Peak Compaction Hough Transform (HPCHT). Vote scattering in the parameter space is a problem when the Hough transform is used for line extraction. This paper investigates the effects of image size and edge data errors on the severity of vote scattering. The HPCHT uses the Hough procedure on small subimages initially, and a recursive Hough merging scheme on the extracted line segments afterwards. A bound on vote scattering has been derived which guides the image subdivision and the adaptive quantization of the parameter space. As a result, an accurate Hough transform of low ρ-scattering and high θ-precision has been achieved. The HPCHT is suitable for fast parallel implementation on pyramid computers.
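As background for the vote-scattering problem, here is a minimal sketch of the plain Hough voting step for lines in the (ρ, θ) parameterization, as it might be run on one small subimage. It is not the HPCHT: the recursive merging scheme and the scattering bound are not reproduced, and the function name, bin sizes, and rounding rule are assumptions.

```python
import math
from collections import Counter


def hough_peak(points, n_theta=180, rho_step=1.0):
    """Vote in a quantized (theta, rho) space and return the strongest line.

    points   -- iterable of (x, y) edge coordinates
    n_theta  -- number of theta bins covering [0, pi)
    rho_step -- width of one rho bin
    Returns (theta, rho, votes) for the peak cell.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    (t, r), count = votes.most_common(1)[0]
    return math.pi * t / n_theta, r * rho_step, count
```

With fine θ quantization, points from one line spread their votes over neighbouring cells, and several adjacent θ bins can tie for the peak. That spreading is the vote scattering the abstract refers to, and it motivates adapting the quantization to the subimage size.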
TL;DR: Some important applications and some recent fast-growing fields, such as pyramid processing, image restoration, texture analysis, fractal analysis and image segmentation, are described.
Abstract: Due to advances in semiconductor technology and developments in software, digital image processing is also developing very rapidly. Here an attempt is made to describe some of the important applications and some recent fast-growing fields, such as pyramid processing, image restoration, texture analysis, fractal analysis and image segmentation.
TL;DR: A new transformation is defined that converts circles in an image to a family of straight lines allowing the problem to be converted to line detection which can be solved by Hough transform algorithms.
TL;DR: The authors display the implementation of a component labeling algorithm onto a hyper-pyramid network with a computational complexity of O(log²(n)).
Abstract: The authors describe a novel network topology for image processing, called the hyper-pyramid network topology. This structure is hierarchical and implements local, inside-region communications at each level, and upward/downward communications in the whole structure. Intraregion communications are shown by an image processing algorithm study. The authors display the implementation of a component labeling algorithm onto a hyper-pyramid network with a computational complexity of O(log²(n)). This complexity is the same as that of the hypercube network. It is also demonstrated that the wiring complexity is less than that of the hypercube network.
TL;DR: It is claimed that the operator representation in conjunction with some appropriate feature extraction algorithm is well suited as a general framework for defining multi level feature hierarchies.
Abstract: The topic of this report is signal representation in the context of hierarchical image processing. An overview of hierarchical processing systems is included as well as a presentation of various approaches to signal representation, feature representation and feature extraction. It is claimed that image hierarchies based on feature extraction, so called feature hierarchies, demand a signal representation other than the standard spatial or linear representation used today. A new representation, the operator representation, is developed. It is based on an interpretation of features in terms of signal transformations. This representation has no references to any spatial ordering of the signal elements and also gives an explicit representation of signal features. Using the operator representation, a generalization of the standard phase concept in image processing is introduced. Based on the operator representation, two algorithms for extraction of feature values are presented. Both have the capability of generating phase invariant feature descriptors. It is claimed that the operator representation in conjunction with some appropriate feature extraction algorithm is well suited as a general framework for defining multi level feature hierarchies. The report includes an appendix with the mathematical details necessary to comprehend the presentation.
TL;DR: In this paper mesh, pyramid, and mesh of pyramids implementations of an iterative image restoration algorithm are proposed, based on a single step regularized iterative restoration algorithm.
Abstract: In this paper mesh, pyramid, and mesh of pyramids implementations of an iterative image restoration algorithm are proposed. These implementations are based on a single step regularized iterative restoration algorithm. Area-time bounds on the proposed implementations are established. The efficiency of the proposed VLSI algorithms is evaluated by comparing the established bounds against lower bounds on AT², where A is the area of the VLSI chip and T is its computation time.
TL;DR: A scheme for very low bit rate coding of images is presented, and entropy-constrained vector quantization (ECVQ) is adopted to quantize various subimages lying on the pyramid levels.
Abstract: A scheme for very low bit rate coding of images is presented. A pyramidal structure, constructed from the wavelet decomposition of the original image on successive dyadic resolutions, serves as the basic representation of the image. Entropy-constrained vector quantization (ECVQ) is adopted to quantize various subimages lying on the pyramid levels. Optimal entropy coding of the resulting indices can yield very low bit rates at appreciably high objective quality.
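The encoding rule that distinguishes ECVQ from ordinary vector quantization can be shown in a few lines. In the usual formulation, the encoder picks the codeword that minimizes distortion plus a Lagrange multiplier times the codeword's code length, rather than distortion alone. The sketch below assumes squared-error distortion; the codebook, code lengths, and multiplier value are hypothetical inputs, not values from the paper.

```python
def ecvq_encode(x, codebook, code_lengths, lam):
    """Return the index minimizing squared error + lam * code length.

    x            -- input vector (sequence of numbers)
    codebook     -- list of codewords, same dimension as x
    code_lengths -- bits spent on each index by the entropy coder
    lam          -- Lagrange multiplier trading rate against distortion
    """
    def cost(i):
        distortion = sum((a - b) ** 2 for a, b in zip(x, codebook[i]))
        return distortion + lam * code_lengths[i]

    return min(range(len(codebook)), key=cost)


# Hypothetical two-entry codebook: index 0 is cheap, index 1 is expensive.
codebook = [(0.0, 0.0), (1.0, 1.0)]
code_lengths = [1, 3]
nearest = ecvq_encode((0.9, 0.9), codebook, code_lengths, lam=0.0)
rate_aware = ecvq_encode((0.9, 0.9), codebook, code_lengths, lam=1.0)
```

With `lam=0.0` the rule reduces to nearest-neighbour quantization; raising `lam` pushes the encoder toward indices with shorter codes, which is how bit rate is traded against distortion.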
TL;DR: In this paper, the singular value decomposition transform (SVDT) is used for intraframe, interframe and intersequence analysis for two image sequences resulting from a binocular angiographic system.
Abstract: The authors introduce the singular value decomposition transform and present an application to intraframe, interframe and intersequence analysis for two image sequences resulting from a binocular angiographic system. It is shown that the second singular value follows the heart's cyclic behavior, which allows two angiographic sequences to be postsynchronized without electrocardiogram signal recording. Pyramid wavelet transforms and their application to digital subtraction angiography images are addressed. A novel analyzing wavelet acting like a matched filter is discussed.