Journal Article • DOI: 10.1109/2.299410
Parallel visualization algorithms: performance and architectural implications
180
TL;DR: This article demonstrates that, on the best known visualization algorithms, simple and natural parallelizations work very well, the sequential implementations do not have to be fundamentally restructured, and the high degree of temporal locality obviates the need for explicit data distribution and communication management.
Abstract: Recently, a new class of scalable, shared-address-space multiprocessors has emerged. Like message-passing machines, these multiprocessors have a distributed interconnection network and physically distributed main memory. However, they provide hardware support for efficient implicit communication through a shared address space, and they automatically exploit temporal locality by caching both local and remote data in a processor's hardware cache. In this article, we show that these architectural characteristics make it much easier to obtain very good speedups on the best known visualization algorithms. Simple and natural parallelizations work very well, the sequential implementations do not have to be fundamentally restructured, and the high degree of temporal locality obviates the need for explicit data distribution and communication management. We demonstrate our claims through parallel versions of three state-of-the-art algorithms: a recent hierarchical radiosity algorithm by Hanrahan et al. (1991), a parallelized ray-casting volume renderer by Levoy (1992), and an optimized ray-tracer by Spach and Pulleyblank (1992). We also discuss a new shear-warp volume rendering algorithm that provides the first demonstration of interactive frame rates for a 256×256×256 voxel data set on a general-purpose multiprocessor.
Citations
The SPLASH-2 programs: characterization and methodological considerations
Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, Anoop Gupta
- 01 May 1995
TL;DR: This paper quantitatively characterizes the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important for understanding them well, including the computational load balance, communication-to-computation ratio and traffic needs, important working set sizes, and issues related to spatial locality.
Fast volume rendering using a shear-warp factorization of the viewing transformation
Philippe Lacroute, Marc Levoy
- 24 Jul 1994
TL;DR: A new object-order rendering algorithm based on a shear-warp factorization of the viewing transformation is described that is significantly faster than published algorithms with minimal loss of image quality.
Thread Scheduling for Multiprogrammed Multiprocessors
TL;DR: This work presents a user-level thread scheduler for shared-memory multiprocessors, and it achieves linear speedup whenever P is small relative to the parallelism T1/T∞.
513
The SPLASH-2 programs
TL;DR: The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors.
489
Kendo: efficient deterministic multithreading in software
Marek Olszewski, Jason Ansel, Saman Amarasinghe
- 07 Mar 2009
TL;DR: Kendo is a new software-only system that provides deterministic multithreading of parallel applications, making them easier to develop, debug, and test; it runs on today's commodity hardware while incurring only a modest performance cost.
References
Fast volume rendering using a shear-warp factorization of the viewing transformation
Philippe Lacroute, Marc Levoy
- 24 Jul 1994
TL;DR: A new object-order rendering algorithm based on a shear-warp factorization of the viewing transformation is described that is significantly faster than published algorithms with minimal loss of image quality.
A progressive refinement approach to fast radiosity image generation
Michael F. Cohen, Shenchang Eric Chen, John R. Wallace, Donald P. Greenberg
- 01 Jun 1988
TL;DR: A reformulated radiosity algorithm is presented that produces initial images in time linear in the number of patches, which brings the use of radiosity for interactive rendering within reach and has implications for the use and development of current and future graphics workstations.
689
A rapid hierarchical radiosity algorithm
Pat Hanrahan, David Salzman, Larry Aupperle
- 01 Jul 1991
TL;DR: Standard techniques for shooting and gathering can be used with the hierarchical representation to solve for equilibrium radiosities, but the paper also discusses using a brightness-weighted error criterion, in conjunction with multigridding, to refine the image progressively and even more rapidly.
642
•Book
Multiprocessor simulation and tracing using Tango
Helen Davis, Stephen R. Goldschmidt, John L. Hennessy
- 01 Jan 1995
173
Volume rendering on scalable shared-memory MIMD architectures
Jason Nieh, Marc Levoy
- 01 Dec 1992
TL;DR: A parallel volume rendering algorithm for MIMD architectures, based on ray tracing and a novel task-queue image partitioning technique, that achieves nearly linear speedups and near real-time frame update rates on a 48-processor machine.
151