TL;DR: Introduces the architecture and initial algorithms for Pixel-Planes 5, a heterogeneous multi-computer designed both for high-speed polygon and sphere rendering and for supporting algorithm and application research in interactive 3D graphics.
Abstract: This paper introduces the architecture and initial algorithms for Pixel-Planes 5, a heterogeneous multi-computer designed both for high-speed polygon and sphere rendering (1M Phong-shaded triangles/second) and for supporting algorithm and application research in interactive 3D graphics. Techniques are described for volume rendering at multiple frames per second, font generation directly from conic spline descriptions, and rapid calculation of radiosity form-factors. The hardware consists of up to 32 math-oriented processors, up to 16 rendering units, and a conventional 1280 × 1024-pixel frame buffer, interconnected by a 5 gigabit ring network. Each rendering unit consists of a 128 × 128-pixel array of processors-with-memory with parallel quadratic expression evaluation for every pixel. Implemented on 1.6 micron CMOS chips designed to run at 40 MHz, this array has 208 bits/pixel on-chip and is connected to a video RAM memory system that provides 4,096 bits of off-chip memory. Rendering units can be independently reassigned to any part of the screen or to non-screen-oriented computation. As of April 1989, both hardware and software are still under construction, with initial system operation scheduled for fall 1989.
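The per-pixel quadratic evaluation mentioned in this abstract can be illustrated in software. The following is a minimal sketch, not the Pixel-Planes 5 implementation: it evaluates Q(x, y) = Ax + By + C + Dx² + Exy + Fy² at every pixel of a 128 × 128 region and uses the sign of Q to find the pixels covered by a circle, as one might for a sphere's outline. The function names and the coefficient ordering are assumptions made for illustration, and the loop is sequential where the hardware evaluates all pixels in parallel.

```python
def evaluate_quadratic(coeffs, size=128):
    """Evaluate Q(x, y) = A*x + B*y + C + D*x*x + E*x*y + F*y*y at every pixel.

    Returns a size-by-size grid indexed as grid[y][x].
    """
    a, b, c, d, e, f = coeffs
    return [
        [a * x + b * y + c + d * x * x + e * x * y + f * y * y for x in range(size)]
        for y in range(size)
    ]


def circle_coeffs(cx, cy, r):
    """Coefficients of (x - cx)^2 + (y - cy)^2 - r^2, which is <= 0 inside the circle."""
    return (-2 * cx, -2 * cy, cx * cx + cy * cy - r * r, 1, 0, 1)


# Hypothetical example: a circle of radius 10 centred in the 128 x 128 array.
q = evaluate_quadratic(circle_coeffs(64, 64, 10))
covered = [(x, y) for y, row in enumerate(q) for x, v in enumerate(row) if v <= 0]
```

A single set of six coefficients describes the whole primitive, so each pixel only needs its own (x, y) to decide coverage.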
TL;DR: A modified, grid-based photon mapping algorithm that runs entirely on the GPU is presented; it supports radiance estimates at any surface location in the scene and simulates global illumination with progressive, interactive feedback to the user.
Abstract: We present a modified photon mapping algorithm capable of running entirely on GPUs. Our implementation uses breadth-first photon tracing to distribute photons using the GPU. The photons are stored in a grid-based photon map that is constructed directly on the graphics hardware using one of two methods: the first method is a multipass technique that uses fragment programs to directly sort the photons into a compact grid. The second method uses a single rendering pass combining a vertex program and the stencil buffer to route photons to their respective grid cells, producing an approximate photon map. We also present an efficient method for locating the nearest photons in the grid, which makes it possible to compute an estimate of the radiance at any surface location in the scene. Finally, we describe a breadth-first stochastic ray tracer that uses the photon map to simulate full global illumination directly on the graphics hardware. Our implementation demonstrates that current graphics hardware is capable of fully simulating global illumination with progressive, interactive feedback to the user.
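To make the grid-based photon map concrete, here is a minimal CPU sketch in Python. It is not the GPU implementation the abstract describes: photons are binned into a uniform grid with a dictionary, and the radiance estimate uses a fixed-radius gather (sum of photon power divided by the disc area πr²) rather than a nearest-k search, with the surface reflectance term omitted. The function names, the photon representation, and the cell size are all assumptions for illustration.

```python
import math
from collections import defaultdict


def cell_of(position, cell_size):
    """Integer grid coordinates of the cell containing a 3D position."""
    return tuple(int(math.floor(c / cell_size)) for c in position)


def build_grid(photons, cell_size):
    """Bin (position, power) photons into a uniform grid keyed by cell coordinates."""
    grid = defaultdict(list)
    for position, power in photons:
        grid[cell_of(position, cell_size)].append((position, power))
    return grid


def radiance_estimate(grid, cell_size, point, radius):
    """Fixed-radius density estimate: total photon power within `radius` over pi*r^2."""
    lo = cell_of([c - radius for c in point], cell_size)
    hi = cell_of([c + radius for c in point], cell_size)
    total = 0.0
    # Only the cells overlapping the query sphere's bounding box are visited.
    for i in range(lo[0], hi[0] + 1):
        for j in range(lo[1], hi[1] + 1):
            for k in range(lo[2], hi[2] + 1):
                for position, power in grid.get((i, j, k), ()):
                    dist2 = sum((p - q) ** 2 for p, q in zip(position, point))
                    if dist2 <= radius * radius:
                        total += power
    return total / (math.pi * radius * radius)


# Hypothetical photons on the z = 0 plane, each carrying unit power.
photons = [((0.1, 0.1, 0.0), 1.0), ((0.2, 0.1, 0.0), 1.0), ((5.0, 5.0, 0.0), 1.0)]
grid = build_grid(photons, cell_size=1.0)
estimate = radiance_estimate(grid, 1.0, (0.0, 0.0, 0.0), 0.5)
```

The grid limits each query to a handful of cells, which is the property that makes the lookup practical regardless of how many photons the scene holds.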
TL;DR: A graphics rendering chip serially renders a stream of geometric primitives to image regions called chunks; a pixel engine performs hidden surface removal and controls storage of pixel and fragment records to the pixel and fragment buffers, respectively.
Abstract: A graphics rendering chip serially renders a stream of geometric primitives to image regions called chunks. A set-up processor in the chip parses rendering commands and the stream of geometric primitives and computes edge equation parameters. A scan-convert processor receives the edge equation parameters from the set-up processor and scan converts the geometric primitives to produce pixel records and fragment records. An internal, double-buffered pixel buffer stores pixel records for fully covered pixel addresses and also stores references to fragment lists stored in a fragment buffer. A pixel engine performs hidden surface removal and controls storage of pixel and fragment records to the pixel and fragment buffers, respectively. An anti-aliasing engine resolves pixel data for one pixel buffer while the pixel engine fills the other pixel buffer with pixel data for the next chunk.
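The set-up and scan-convert stages can be sketched in a few lines of Python. This is an illustration under assumptions, not the chip's logic: it computes one edge equation E(x, y) = Ax + By + C per triangle edge and then tests every pixel centre in a single chunk against all three. The 32 × 32 chunk size, the counter-clockwise vertex order, and the function names are hypothetical, and the split into fully covered pixels versus partially covered fragments is left out.

```python
def setup_edges(v0, v1, v2):
    """Edge equation parameters (A, B, C) for a counter-clockwise triangle.

    For each edge, A*x + B*y + C >= 0 holds on the interior side.
    """
    edges = []
    for (x0, y0), (x1, y1) in ((v0, v1), (v1, v2), (v2, v0)):
        edges.append((y0 - y1, x1 - x0, x0 * y1 - x1 * y0))
    return edges


def scan_convert(edges, chunk_x=0, chunk_y=0, chunk_size=32):
    """Return the pixels of one chunk whose centres lie inside all three edges."""
    covered = []
    for y in range(chunk_y, chunk_y + chunk_size):
        for x in range(chunk_x, chunk_x + chunk_size):
            px, py = x + 0.5, y + 0.5  # sample at the pixel centre
            if all(a * px + b * py + c >= 0 for a, b, c in edges):
                covered.append((x, y))
    return covered


# Hypothetical triangle that falls entirely inside the first chunk.
edges = setup_edges((0, 0), (4, 0), (0, 4))
pixels = scan_convert(edges)
```

Because the edge parameters are computed once per primitive, the per-pixel work reduces to three multiply-add evaluations and a sign test.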
TL;DR: An array controller zeroes free, non-zero sectors of its transfer buffer as a background task, so that it is more likely to find an appropriately sized buffer of free, zeroed sectors when it performs parallel logic operations such as XOR parity generation.
Abstract: An array controller cleans buffer memory as a background task. The controller includes a transfer buffer, a memory that stores an index or table indicating free and non-zero data sectors within the transfer buffer, and processing logic that uses the transfer buffer for data transfer operations and that, when otherwise idle, scans the index table for contiguous sections of free and non-zero data sectors of the transfer buffer and zeroes at least one of the contiguous sections. The controller allocates buffer memory and performs parallel logic operations into the buffer, such as XOR logic operations to generate new parity data. The buffer must first be zeroed or cleaned prior to the parallel operations. With the background task, the controller is more likely to find an appropriately sized buffer of free and zeroed data sectors in the transfer buffer to perform the parallel logic operations. The background task significantly reduces or relieves the controller from having to issue CDB-based memory commands to zero or clean an allocated buffer during disk I/O operations.
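The interaction between the index table, the idle-time cleaner, and XOR parity generation can be sketched as follows. This is a toy model under stated assumptions, not the controller's firmware: the class name, the 8-byte sector size, and the two boolean lists standing in for the index table are all hypothetical.

```python
SECTOR_BYTES = 8  # deliberately tiny sector size, for illustration only


class TransferBuffer:
    def __init__(self, sector_count):
        self.data = [bytearray(SECTOR_BYTES) for _ in range(sector_count)]
        self.free = [True] * sector_count       # index table: sector is unallocated
        self.nonzero = [False] * sector_count   # index table: sector holds stale data

    def release(self, i):
        """Return a sector to the free pool, recording whether it needs cleaning."""
        self.free[i] = True
        self.nonzero[i] = any(self.data[i])

    def clean_one_run(self):
        """Idle task: zero the first contiguous run of free, non-zero sectors."""
        run = []
        for i in range(len(self.data)):
            if self.free[i] and self.nonzero[i]:
                run.append(i)
            elif run:
                break
        for i in run:
            self.data[i][:] = bytes(SECTOR_BYTES)
            self.nonzero[i] = False
        return run

    def allocate_zeroed(self, count):
        """Claim `count` contiguous free, zeroed sectors; return the start or None."""
        length = 0
        for i in range(len(self.data)):
            length = length + 1 if self.free[i] and not self.nonzero[i] else 0
            if length == count:
                start = i - count + 1
                for j in range(start, i + 1):
                    self.free[j] = False
                return start
        return None


def xor_into(target, block):
    """Accumulate `block` into `target` byte by byte, as in parity generation."""
    for k, value in enumerate(block):
        target[k] ^= value
```

XOR accumulation only yields correct parity when the target starts at zero, which is why an allocation that finds pre-zeroed sectors avoids a zeroing step on the I/O path.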
TL;DR: An adaptive pixel multisampler (24) generates pixel data for display using an interlocking sub-pixel sampling pattern P and a frame buffer (26) organized as a per-polygon, per-pixel heap (64).
Abstract: An adaptive pixel multisampler (24) generates pixel data for display using an interlocking sub-pixel sampling pattern P and a frame buffer (26) organized as a per-polygon, per-pixel heap (64). The interlocking sampling pattern P provides the advantages of a multi-pixel shaped filter without pixel-to-pixel cross communication and without additional sub-pixels. The per-polygon, per-pixel heap (64) allocates frame buffer memory so that each pixel will have one set of data stored in the frame buffer (26) for every polygon that influences that pixel. This memory allocation scheme can significantly reduce frame buffer memory requirements. The polygon data is blended to properly handle processing of transparent polygons and polygon edges without the degradation of image quality found in conventional computer graphics systems.
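One way to picture the per-polygon, per-pixel heap is as a shared pool of records, with each pixel holding the head of a linked list that has one record per polygon touching that pixel. The sketch below is an illustration under assumptions rather than the patented design: the class and field names are hypothetical, colours are single grey values, and the blend is a plain coverage-weighted sum that ignores transparency and depth ordering.

```python
class PixelHeap:
    def __init__(self, width, height):
        self.width = width
        self.head = [-1] * (width * height)  # first record per pixel; -1 means empty
        self.records = []  # shared heap of (polygon_id, grey, coverage, next_index)

    def add(self, x, y, polygon_id, grey, coverage):
        """Allocate one record for a polygon that influences pixel (x, y)."""
        pixel = y * self.width + x
        self.records.append((polygon_id, grey, coverage, self.head[pixel]))
        self.head[pixel] = len(self.records) - 1

    def records_at(self, x, y):
        """Walk the pixel's list, most recently added polygon first."""
        index = self.head[y * self.width + x]
        while index != -1:
            polygon_id, grey, coverage, index = self.records[index]
            yield polygon_id, grey, coverage

    def resolve(self, x, y):
        """Assumed blend: sum of grey values weighted by sub-pixel coverage."""
        return sum(grey * coverage for _, grey, coverage in self.records_at(x, y))


# Hypothetical edge pixel shared by two polygons; every other pixel stays empty.
heap = PixelHeap(4, 4)
heap.add(0, 0, polygon_id=1, grey=1.0, coverage=0.25)
heap.add(0, 0, polygon_id=2, grey=0.0, coverage=0.75)
```

Storage grows with the number of polygon-pixel pairs actually present, so pixels untouched by any polygon cost only their head index.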