Arbitrary precision arithmetic - SIMD style
S. Balakrishnan,S. K. Nandy +1 more
- 04 Jan 1998
- pp 128-132
TL;DR: This paper motivates the need for arbitrary precision packed arithmetic, wherein the width of the sub-datatypes is programmable by the user, and proposes an implementation for arithmetic on such packed datatypes.
Abstract: Current general-purpose processors have been enhanced with a "media instruction set" to achieve performance gains in media-processing-intensive applications. The added instructions exploit two properties of media applications: their native datatypes are small, with widths much less than those supported by commercial processors, and they exhibit abundant data parallelism. Processors enhanced with the "media instruction set" support arithmetic on sub-datatypes of only 8-bit, 16-bit, 32-bit and 64-bit precision. In this paper we motivate the need for arbitrary precision packed arithmetic, wherein the width of the sub-datatypes is programmable by the user, and propose an implementation for arithmetic on such packed datatypes. The proposed scheme has marginal hardware overhead over conventional implementations of arithmetic on processors incorporating a multimedia extended instruction set.
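To make the idea of user-programmable sub-datatype widths concrete, the following is a minimal software sketch of a packed add over fields of arbitrary width. It is not the hardware scheme proposed in the paper (the abstract does not describe it); it uses the well-known mask-and-fix-up trick for blocking carries at field boundaries. The function names `msb_mask`, `pack`, `unpack` and `packed_add`, and the example widths, are illustrative assumptions.

```python
def msb_mask(widths):
    """Mask with the most significant bit of every field set.

    Fields are packed from bit 0 upward in the order given by `widths`.
    """
    mask, pos = 0, 0
    for w in widths:
        pos += w
        mask |= 1 << (pos - 1)
    return mask


def pack(values, widths):
    """Pack one value per field into a single integer word."""
    word, pos = 0, 0
    for v, w in zip(values, widths):
        word |= (v & ((1 << w) - 1)) << pos
        pos += w
    return word


def unpack(word, widths):
    """Split a packed word back into its per-field values."""
    values, pos = [], 0
    for w in widths:
        values.append((word >> pos) & ((1 << w) - 1))
        pos += w
    return values


def packed_add(a, b, widths):
    """Add two packed words field by field, wrapping within each field.

    Assumption: this is an illustrative software emulation, not the
    paper's implementation.
    """
    h = msb_mask(widths)
    word = (1 << sum(widths)) - 1
    # With every field's top bit cleared, the per-field sums cannot
    # overflow, so no carry crosses a field boundary.
    low = (a & ~h & word) + (b & ~h & word)
    # Restore each top bit: a_top XOR b_top XOR carry-into-top.
    return (low ^ ((a ^ b) & h)) & word


widths = [5, 11, 3, 13]               # arbitrary field widths, 32 bits total
a = pack([30, 2047, 2, 5], widths)
b = pack([3, 1, 1, 7], widths)
print(unpack(packed_add(a, b, widths), widths))   # prints [1, 0, 3, 12]
```

The first two fields overflow and wrap modulo 2^5 and 2^11 respectively without disturbing their neighbours, which is the behaviour a packed add with programmable boundaries needs to provide.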
Citations
Overview of research efforts on media ISA extensions and their usage in video coding
TL;DR: This paper summarizes the results of over 25 research groups or individual researchers that have presented video coding implementations on general-purpose processors with the new single instruction multiple data media instruction set architecture extensions and offers an overview of future trends for new instructions and architectural speed-up techniques.
Bringing High-Performance Reconfigurable Computing to Exact Computations
E. El-Araby, Ivan Gonzalez, Tarek El-Ghazawi +2 more
- 12 Nov 2007
TL;DR: This paper presents a first effort, to the best of the authors' knowledge, at reconfigurable hardware support for arbitrary-precision arithmetic, based on virtual convolution scheduling derived from a formal representation of the problem.
Reconfigurable Filter Coprocessor Architecture for DSP Applications
S. Ramanathan,S. K. Nandy,V. Visvanathan +2 more
- 01 Nov 2000
TL;DR: This paper presents a reconfigurable high-performance low-power filter coprocessor architecture for DSP applications that can be reconfigured to support a wide variety of filtering computations.
More on arbitrary boundary packed arithmetic
P. Karthikeyan,P. Ranganathan +1 more
- 17 Dec 1998
TL;DR: This paper presents a scheme based on Wallace tree multiplication for arbitrary boundary packed multiplication, aimed at media algorithms built on multiply-accumulate operations, and provides the intermediate carries of sub-datatypes that were lost in the previous work.
Performance evaluation of multithreaded architectures for media processing applications
S. Balakrishnan,S. K. Nandy +1 more
- 28 May 2000
TL;DR: It is shown that multithreaded architectures coupled with SIMD parallelism provide a performance improvement in excess of 2× over conventional superscalar architectures.
References
•Book
An introduction to parallel algorithms
Joseph JaJa
- 01 Oct 1992
TL;DR: This book provides an introduction to the design and analysis of parallel algorithms, with the emphasis on the application of the PRAM model of parallel computation, with all its variants, to algorithm analysis.
Parallel Prefix Computation
TL;DR: A recursive construction is used to obtain a product circuit for solving the prefix problem, and a Boolean circuit which has depth 2⌈log₂ n⌉ + 2 and size bounded by 14n is obtained for n-bit binary addition.
A Regular Layout for Parallel Adders
TL;DR: It is shown that addition of n-bit binary numbers can be performed on a chip with a regular layout in time proportional to log n and with area proportional to n.
•Book
Introduction to Parallel Algorithms
C. Xavier,S. S. Iyengar +1 more
- 05 Aug 1998
TL;DR: Algorithms for Parallel Computing: Algebraic Equations and Matrices, Differentiation and Integration, and Tree Algorithms.
VIS speeds new media processing
TL;DR: UltraSparc's Visual Instruction Set, described here in detail, accelerates some widely used media-processing algorithms by as much as seven times.