Abstract: An architecture is presented for fast high-quality rendering of complex images. All objects are reduced to common world-space geometric entities called micropolygons, and all of the shading and visibility calculations operate on these micropolygons. Each type of calculation is performed in a coordinate system that is natural for that type of calculation. Micropolygons are created and textured in the local coordinate system of the object, with the result that texture filtering is simplified and improved. Visibility is calculated in screen space using stochastic point sampling with a z buffer. There are no clipping or inverse perspective calculations. Geometric and texture locality are exploited to minimize paging and to support models that contain arbitrarily many primitives.
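The visibility step described above (stochastic point sampling with a z buffer in screen space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the paper: micropolygons are simplified to flat-shaded, axis-aligned screen-space rectangles `(x0, y0, x1, y1, z, color)`, sample positions are jittered on a `grid` x `grid` subpixel lattice, and the final image uses a plain box filter. The names `make_samples` and `render` are hypothetical.

```python
import math
import random

def make_samples(width, height, grid, rng):
    """Jittered samples: one random point in each grid x grid subcell of every pixel."""
    samples = {}
    for py in range(height):
        for px in range(width):
            pts = []
            for sy in range(grid):
                for sx in range(grid):
                    x = px + (sx + rng.random()) / grid
                    y = py + (sy + rng.random()) / grid
                    pts.append((x, y))
            samples[(px, py)] = pts
    return samples

def render(micropolygons, width, height, grid=2, seed=0):
    """Point-sample rectangles (assumed stand-ins for micropolygons) with a per-sample z buffer."""
    rng = random.Random(seed)
    samples = make_samples(width, height, grid, rng)
    depth = {k: [math.inf] * len(v) for k, v in samples.items()}
    color = {k: [0.0] * len(v) for k, v in samples.items()}
    for (x0, y0, x1, y1, z, c) in micropolygons:
        # Visit only the pixels overlapped by the rectangle's screen-space bounds.
        for py in range(max(0, math.floor(y0)), min(height, math.ceil(y1))):
            for px in range(max(0, math.floor(x0)), min(width, math.ceil(x1))):
                for i, (sx, sy) in enumerate(samples[(px, py)]):
                    inside = x0 <= sx < x1 and y0 <= sy < y1
                    if inside and z < depth[(px, py)][i]:
                        depth[(px, py)][i] = z   # nearest surface wins at this sample
                        color[(px, py)][i] = c
    # Box filter: average the samples of each pixel.
    return [[sum(color[(px, py)]) / len(color[(px, py)]) for px in range(width)]
            for py in range(height)]
```

Because the depth test is applied per sample, the result does not depend on the order in which the rectangles are submitted.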
TL;DR: Decoupled sampling is a generalized approach to decoupling shading from visibility sampling in graphics pipelines. Inspired by the Reyes rendering architecture, it can be thought of as a generalization of multisample antialiasing that supports complex and dynamic mappings from visibility to shading samples, as introduced by motion and defocus blur and adaptive shading.
Abstract: We propose a generalized approach to decoupling shading from visibility sampling in graphics pipelines, which we call decoupled sampling. Decoupled sampling enables stochastic supersampling of motion and defocus blur at reduced shading cost, as well as controllable or adaptive shading rates which trade off shading quality for performance. It can be thought of as a generalization of multisample antialiasing (MSAA) to support complex and dynamic mappings from visibility to shading samples, as introduced by motion and defocus blur and adaptive shading. It works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. Decoupled sampling is inspired by the Reyes rendering architecture, but like traditional graphics pipelines, it shades fragments rather than micropolygon vertices, decoupling shading from the geometry sampling rate. Also unlike Reyes, decoupled sampling only shades fragments after precise computation of visibility, reducing overshading. We present extensions of two modern graphics pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications of decoupled sampling and blur, and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion and defocus blur, as well as variable and adaptive shading rates.
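The many-to-one mapping and memoization buffer described above can be sketched roughly as follows. This is an illustrative sketch under assumptions, not the paper's implementation: each visibility sample is assumed to carry a primitive id and a position `(u, v)` in that primitive's shading domain, the mapping simply quantizes `(u, v)` to a grid whose density is set by `shading_rate`, and the buffer is an unbounded dictionary rather than a fixed-size hardware cache. The class and method names are hypothetical.

```python
import math

class DecoupledShader:
    """Memoizes shading results so that many visibility samples share one shading sample."""

    def __init__(self, shade, shading_rate=1.0):
        self.shade = shade            # shade(prim_id, u, v) -> color (user supplied)
        self.rate = shading_rate      # shading samples per unit of the shading domain
        self.cache = {}               # assumed unbounded; real hardware would bound this
        self.shader_calls = 0

    def shading_key(self, prim_id, u, v):
        # Many-to-one mapping: every (u, v) in the same grid cell yields the same key.
        return (prim_id, math.floor(u * self.rate), math.floor(v * self.rate))

    def lookup(self, prim_id, u, v):
        key = self.shading_key(prim_id, u, v)
        if key not in self.cache:
            _, ix, iy = key
            # Shade once, at the centre of the shading cell, then reuse the result.
            self.cache[key] = self.shade(prim_id, (ix + 0.5) / self.rate, (iy + 0.5) / self.rate)
            self.shader_calls += 1
        return self.cache[key]
```

For example, with `shading_rate=1.0`, four visibility samples that fall in the same unit cell of one primitive trigger a single call to `shade`; lowering `shading_rate` makes the cells larger and trades shading quality for fewer shader invocations.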
TL;DR: This work presents RenderAnts, the first system that enables interactive Reyes rendering on GPUs and proposes a multi-GPU scheduling technique based on work stealing so that the system can support scalable rendering on multiple GPUs.
Abstract: We present RenderAnts, the first system that enables interactive Reyes rendering on GPUs. Taking RenderMan scenes and shaders as input, our system first compiles RenderMan shaders to GPU shaders. Then all stages of the basic Reyes pipeline, including bounding/splitting, dicing, shading, sampling, compositing and filtering, are executed on GPUs using carefully designed data-parallel algorithms. Advanced effects such as shadows, motion blur and depth-of-field can also be rendered. In order to avoid exhausting GPU memory, we introduce a novel dynamic scheduling algorithm to bound the memory consumption during rendering. The algorithm automatically adjusts the amount of data being processed in parallel at each stage so that all data can be maintained in the available GPU memory. This allows our system to maximize the parallelism in all individual stages of the pipeline and achieve superior performance. We also propose a multi-GPU scheduling technique based on work stealing so that the system can support scalable rendering on multiple GPUs. The scheduler is designed to minimize inter-GPU communication and balance workloads among GPUs. We demonstrate the potential of RenderAnts using several complex RenderMan scenes and an open source movie entitled Elephants Dream. Compared to Pixar's PRMan, our system can generate images of comparably high quality, but is over one order of magnitude faster. For moderately complex scenes, the system allows the user to change the viewpoint, lights and materials while producing photorealistic results at interactive speed.
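The general idea of work stealing mentioned above can be illustrated with a small single-threaded simulation. This is a generic textbook-style sketch, not RenderAnts' scheduler: it ignores inter-GPU communication cost, assumes every task takes one round, and lets an idle worker steal from the head of whichever queue is currently longest. The function name `run_work_stealing` and the round-based model are assumptions made for illustration.

```python
from collections import deque

def run_work_stealing(initial_queues):
    """Simulate workers in rounds; each worker handles at most one task per round."""
    queues = [deque(q) for q in initial_queues]
    processed = [[] for _ in queues]   # tasks completed by each worker
    steals = 0
    while any(queues):                 # an empty deque is falsy
        for worker, own in enumerate(queues):
            if own:
                task = own.pop()       # take from the tail of the local queue
            else:
                victim = max(queues, key=len)
                if not victim:
                    continue           # nothing left to steal this round
                task = victim.popleft()  # steal from the head of the victim's queue
                steals += 1
            processed[worker].append(task)
    return processed, steals
```

Starting with all work on one worker, for example `run_work_stealing([list(range(8)), [], []])`, the two idle workers pull tasks from the loaded queue until it is drained, and every task is processed exactly once.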
TL;DR: This work describes a full architecture built around spectral light transport and a flexible implementation of multiple importance sampling, resulting in a system whose extensibility is comparable to that which made the Reyes rendering architecture successful over many decades.
Abstract: The Manuka rendering architecture has been designed in the spirit of the classic Reyes rendering architecture: to enable the creation of visually rich computer generated imagery for visual effects in movie production. Following in the footsteps of Reyes over the past 30 years, this means supporting extremely complex geometry, texturing, and shading. In the current generation of renderers, it is essential to support very accurate global illumination as a means to naturally tie together different assets in a picture. This is commonly achieved with Monte Carlo path tracing, using a paradigm often called shade on hit, in which the renderer alternates tracing rays with running shaders on the various ray hits. The shaders take the role of generating the inputs of the local material structure, which is then used by path-sampling logic to evaluate contributions and to inform what further rays to cast through the scene. We propose a shade before hit paradigm instead and minimise I/O strain on the system, leveraging locality of reference by running pattern generation shaders before we execute light transport simulation by path sampling. We describe a full architecture built around this approach, featuring spectral light transport and a flexible implementation of multiple importance sampling (MIS), resulting in a system able to support a comparable amount of extensibility to what made the Reyes rendering architecture successful over many decades.
TL;DR: This paper compares implementations of the Reyes and OpenGL pipelines on the Imagine stream processor, a high-performance programmable processor for media applications. The comparison demonstrates the applicability of Reyes for hardware implementation and exposes many issues that architects will face in implementing Reyes in hardware.
Abstract: The OpenGL and Reyes rendering pipelines each render complex scenes from similar scene descriptions but differ in their internal pipeline organizations. While the OpenGL organization has dominated hardware architectures over the past twenty years, a Reyes organization differs in several important ways from OpenGL, including a shader coordinate system that supports coherent texture accesses, a single shader in the vertex stage, and tessellation and sampling instead of triangle rasterization. Hardware for the OpenGL pipeline has been well-studied, but the lack of a hardware Reyes implementation has prevented a comparison between the two pipelines. We analyze and compare implementations of an OpenGL and a Reyes pipeline on the Imagine stream processor, a high-performance programmable processor for media applications. This comparison both demonstrates the applicability of Reyes for hardware implementation and exposes many issues that architects will face in implementing Reyes in hardware, in particular the need for efficient subdivision algorithms and implementations.