TL;DR: In this article, a scene graph is constructed from a number of objects stored in memory, whose variables can be changed by subroutine calls, and a class hierarchy is defined to implement retained-mode graphics.
Abstract: A computer-readable medium having stored thereon an applications programming interface for causing a computer system to render a three-dimensional scene according to a downloaded file. A scene graph is constructed from a number of objects stored in memory. These objects have variables which can be changed by subroutine calls. Furthermore, one or more objects can contain one or more fields. A field is comprised of a data type which represents the state of an object. Engines are used to perform defined functions on the fields. One or more routes can be used to change one field in response to changes made to another field. A class hierarchy is defined to implement retained-mode graphics.
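The object model described above (objects with typed fields, engines that operate on field values, and routes that forward a change from one field to another) can be sketched roughly as follows. This is a minimal illustration and not the API from the abstract: the class names `Field`, `Route`, and `Node`, and the choice to model an engine as a plain function, are assumptions made for the example.

```python
class Field:
    """A typed value holding part of an object's state (hypothetical name)."""

    def __init__(self, data_type, value):
        self.data_type = data_type
        self._value = data_type(value)
        self._routes = []  # routes that fire when this field changes

    def get(self):
        return self._value

    def set(self, value):
        # Stands in for the "subroutine call" that changes an object's variable.
        self._value = self.data_type(value)
        for route in self._routes:
            route.propagate(self._value)


class Route:
    """Changes a target field in response to changes in a source field."""

    def __init__(self, source, target, engine=None):
        self.target = target
        # Assumption: an engine is modelled as a function applied to the value.
        self.engine = engine if engine is not None else (lambda v: v)
        source._routes.append(self)

    def propagate(self, value):
        self.target.set(self.engine(value))


class Node:
    """A scene-graph object: named fields plus child nodes."""

    def __init__(self, name, **fields):
        self.name = name
        self.fields = fields
        self.children = []

    def add_child(self, node):
        self.children.append(node)
        return node


# Build a small graph and route one field into another.
root = Node("root")
timer = root.add_child(Node("timer", fraction=Field(float, 0.0)))
cube = root.add_child(Node("cube", angle=Field(float, 0.0)))

# Engine: map a 0..1 fraction to degrees of rotation.
Route(timer.fields["fraction"], cube.fields["angle"], engine=lambda f: f * 360.0)

timer.fields["fraction"].set(0.25)
print(cube.fields["angle"].get())  # 90.0
```

Because the graph is retained in memory, the application only changes field values; a renderer (omitted here) would walk `root.children` on each frame.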
TL;DR: This paper describes parallel implementations of the Aura API which provides a multi-platform graphics interface using a retained mode paradigm and shows that a retained approach performs much better than existing implementations.
Abstract: Most research in parallel rendering has focused on widely used immediate mode interfaces (mainly OpenGL) or on retained mode interfaces designed for parallel rendering only. In this paper, we describe parallel implementations of the Aura API which provides a multi-platform graphics interface using a retained mode paradigm. Using several test applications in the context of a tiled display infrastructure, we show that a retained approach performs much better than existing implementations.
TL;DR: This paper discusses several potential areas of change in 3D graphics on the PC and argues that, as pixel shading, texturing, and fill rates rise, the most constrained bottleneck in the system will increasingly become the creation and transfer of geometry information.
Abstract: In the late 1990s, graphics hardware is experiencing a dramatic board-to-chip integration reminiscent of the minicomputer-to-microprocessor revolution of the 1980s. Today, mass-market PCs are beginning to match the 3D polygon and pixel rendering of a 1992 Silicon Graphics Reality Engine™ system. The extreme pace of technology evolution in the PC market is such that within 1 or 2 years the performance of a mainstream PC will be very close to that of the highest-performance 3D workstations. At that time, the quality and performance demands will dictate serious changes in PC architecture as well as changes in rendering pipeline and algorithms. This paper will discuss several potential areas of change.
A GENERAL PROBLEM STATEMENT
The biggest focus of 3D graphics applications on the PC is interactive entertainment, or games. This workload is extremely dynamic, with continuous updating of geometry, textures, animation, lighting, and shading. Although in other applications such as Computer-Aided Design (CAD), models may be static and retained mode or display list APIs may be used, it is common in games that geometry and textures change regularly. A good operating assumption is that everything changes every frame. The assumption of pervasive change puts a large burden on both the bandwidth and calculation capabilities of the graphics pipeline.
GEOMETRY AND PIXEL THROUGHPUT
As a baseline, we’ll start with some data and cycle counting of a reasonable workload for an interactive application. PC graphics hardware is capable of this throughput. As an example, this is a bandwidth analysis of a 400 MHz Intel Pentium II™ PC with an Nvidia RIVA TNT™ graphics processor. This analysis does not derive from a specific application, but is simply a counting exercise. Many applications push one or more of these limits, but few programs stress all axes.
For a typical application, achieving 1M triangles/second, 100M 32-bit pixels/second, and 2 textures/pixel requires:
- 1M triangles * 3 vertices/triangle * 32 bytes/vertex = 100 MB; triangle data crosses the bus 3-5 times (read, transformed, and written by the CPU, and read by the graphics processor), so simply copying triangle data requires 300-500 MB/second on the PC buses.
- 100M pixels * 8 bytes/pixel (32-bit RGBA, 32-bit Z/stencil) = 800 MB; with 50% overhead for RMW, this requires 1.2 GB/second.
- 2 textures/pixel * 4 texels/texture * 2 bytes/texel; a texture cache can create up to 4X reuse efficiency, so this requires 400 MB/second.
Assumptions here include:
- 32-byte vertices are Direct3D™ TLVertices (X,Y,Z,R,G,B,A,F,SR,SG,SB,W)
- triangle setup is done on the graphics processor
- bilinear texture filtering
- 16-bit texels are R5G6B5
- 50% of pixels written after Z-buffer read/compare
Transferring triangle vertex data to the graphics processor from the CPU is commonly the bottleneck. This is different from typical workstations or the PCs of just 1 year ago, when transform and lighting calculation, fill rate, or texture rate were limiting factors.
GEOMETRY REPRESENTATION
As pixel shading, texturing, and fill rates rise, the most constrained bottleneck in the system will increasingly become creation and transfer of geometry information. The data required to represent a triangle comprises the bulk of system bus traffic in an aggressive 3D application. As
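The counting exercise in this abstract can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: the function names are invented for the example, decimal megabytes are assumed, and the raw texel rate of 1.6 GB/second is simply what the listed factors multiply out to for 100M pixels/second (the abstract states only the 400 MB/second result after 4X cache reuse).

```python
MB = 1_000_000  # decimal megabytes, matching the round figures in the text


def geometry_bandwidth(triangles_per_s, bytes_per_vertex=32, bus_crossings=(3, 5)):
    """Bytes/second of vertex data, and the range once it crosses the bus 3-5 times."""
    base = triangles_per_s * 3 * bytes_per_vertex
    return base, tuple(base * n for n in bus_crossings)


def pixel_bandwidth(pixels_per_s, bytes_per_pixel=8, rmw_overhead=0.5):
    """Bytes/second of frame-buffer traffic, before and after read-modify-write overhead."""
    base = pixels_per_s * bytes_per_pixel
    return base, base * (1 + rmw_overhead)


def texture_bandwidth(pixels_per_s, textures_per_pixel=2, texels_per_texture=4,
                      bytes_per_texel=2, cache_reuse=4):
    """Bytes/second of texel fetches, before and after texture-cache reuse."""
    raw = pixels_per_s * textures_per_pixel * texels_per_texture * bytes_per_texel
    return raw, raw / cache_reuse


geo_base, (geo_low, geo_high) = geometry_bandwidth(1_000_000)
pix_base, pix_total = pixel_bandwidth(100_000_000)
tex_raw, tex_cached = texture_bandwidth(100_000_000)

print(geo_base / MB, geo_low / MB, geo_high / MB)  # 96.0 288.0 480.0
print(pix_base / MB, pix_total / MB)               # 800.0 1200.0
print(tex_raw / MB, tex_cached / MB)               # 1600.0 400.0
```

The exact geometry figure is 96 MB, which the text rounds to 100 MB, giving the quoted 300-500 MB/second range for bus traffic.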
TL;DR: This article describes an engine and API that couple an application program to an effects program such as a magnification program: source content may be magnified for viewing in an output region, and when graphics commands corresponding to that region are received, they are processed to show a transformed representation of the region.
Abstract: Described is an engine and API that couple an application program to an effects program such as a magnification program. For example, source content may be magnified for viewing in an output region. Magnification may be accomplished by identifying a magnification window to the magnification engine, a source region to magnify, a magnification transform, and possibly filtering criteria, such as any windows to include or exclude from magnification. A request to display a region of displayed graphics as modified by a transform may be received, and when graphics commands corresponding to that region are received, the graphics commands are processed to show a transformed representation of the region. The engine and API may work with immediate mode graphics primitives (e.g., GDI commands) and retained mode graphics primitives (e.g., primitives corresponding to a rendering tree); a composition engine composes the output, including any magnified output.
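The four inputs named above (a magnification window, a source region, a transform, and filtering criteria) can be pictured with a small sketch. Everything here is hypothetical: `Rect`, `MagnificationRequest`, and `transform_command` are names invented for illustration, the transform is assumed to be a uniform scale, and no real magnification API is being reproduced.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class MagnificationRequest:
    output_window: Rect   # where the magnified content is shown
    source: Rect          # region of the screen to magnify
    scale: float          # assumed uniform scale standing in for the transform
    excluded_windows: set = field(default_factory=set)  # filtering criteria


def transform_command(request, window_id, rect):
    """Map a draw command's rectangle into the output window.

    Returns None when the command is filtered out or starts outside the
    source region. Simplification: only the top-left corner is tested and
    no clipping is performed.
    """
    if window_id in request.excluded_windows:
        return None
    if not request.source.contains(rect.x, rect.y):
        return None
    return Rect(
        request.output_window.x + (rect.x - request.source.x) * request.scale,
        request.output_window.y + (rect.y - request.source.y) * request.scale,
        rect.width * request.scale,
        rect.height * request.scale,
    )


request = MagnificationRequest(
    output_window=Rect(0, 0, 400, 400),
    source=Rect(100, 100, 200, 200),
    scale=2.0,
    excluded_windows={"tooltip"},
)
print(transform_command(request, "editor", Rect(110, 120, 10, 5)))
print(transform_command(request, "tooltip", Rect(110, 120, 10, 5)))  # None
```

In this picture, the same mapping would be applied whether the incoming command is an immediate mode primitive or a node of a retained rendering tree; composition of the final output is left out.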
TL;DR: Proper application of the CMR technique is one of the most important factors in analysis quality in the advanced product development process across many industries, and the mode cut-off number should be considered a measure of analysis quality.
Abstract: Dynamic analysis of very large and complicated FE structures, such as FE full-vehicle structures, is mainly performed with synthesized, component mode-reduced sub-models with common interfaces. The theory of component mode reduction (CMR) is well known, but it is not simple to apply in real-life analyses. One of its problems, discussed in this paper, has not yet appeared in the literature or in commercial software releases and their reference guides, although, in advanced computer-aided engineering, most dynamic analysts are confronted with it. The problem is related to the mode cut-off number in CMR and its enormous influence on the component's reduced representation and the response solution. The mode cut-off number is the number of retained mode shapes from the components in CMR, and the frequency corresponding to the highest mode is called the cut-off frequency. Ultimately, the response quantities in excited vibrations, predicted from the reduced order model, are only as good as the component modes and the system modes. A proper application of the CMR technique is one of the most important success factors in today’s analysis quality in the advanced product development process across many industries. Therefore, the mode cut-off number should be considered to be a measure of analysis quality. The paper illustrates the effect of the mode cut-off number in an example from the automotive industry: CMR of a body-in-white component and system mode computation of the reduced system. A list of guidelines concludes the discussion, and some proposals for a stable analysis process and optimal performance are also given.
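The two definitions above, mode cut-off number and cut-off frequency, can be made concrete with a tiny sketch. The function name `truncate_modes` and the frequency values are made up for illustration; they do not come from the paper, and the sketch covers only the bookkeeping of which modes are retained, not the reduction itself.

```python
def truncate_modes(natural_frequencies_hz, cutoff_number):
    """Keep the lowest `cutoff_number` component modes.

    Returns the retained frequencies and the cut-off frequency, i.e. the
    frequency of the highest retained mode (None if nothing is retained).
    """
    ordered = sorted(natural_frequencies_hz)
    retained = ordered[:cutoff_number]
    cutoff_frequency = retained[-1] if retained else None
    return retained, cutoff_frequency


# Hypothetical component natural frequencies in Hz (illustrative only).
frequencies = [12.4, 48.1, 31.7, 95.0, 140.2, 77.3]

retained, cutoff_hz = truncate_modes(frequencies, 4)
print(retained)   # [12.4, 31.7, 48.1, 77.3]
print(cutoff_hz)  # 77.3
```

Raising the cut-off number keeps more component modes in the reduced model at higher computational cost, which is the trade-off the abstract ties to analysis quality.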