Top 198 papers published in the topic of Multiplication in 2001

Showing papers on "Multiplication" published in 2001

Stochastic neural computation. I. Computational elements

B.D. Brown¹, Howard C. Card²•Institutions (2)

University of Winnipeg¹, University of Manitoba²

01 Sep 2001-IEEE Transactions on Computers

TL;DR: The primary contribution of this paper is in introducing several state machine-based computational elements for performing sigmoid nonlinearity mappings, linear gain, and exponentiation functions, and describing an efficient method for the generation of, and conversion between, stochastic and deterministic binary signals.

Abstract: This paper examines a number of stochastic computational elements employed in artificial neural networks, several of which are introduced for the first time, together with an analysis of their operation. We briefly include multiplication, squaring, addition, subtraction, and division circuits in both unipolar and bipolar formats, the principles of which are well-known, at least for unipolar signals. We have introduced several modifications to improve the speed of the division operation. The primary contribution of this paper, however, is in introducing several state machine-based computational elements for performing sigmoid nonlinearity mappings, linear gain, and exponentiation functions. We also describe an efficient method for the generation of, and conversion between, stochastic and deterministic binary signals. The validity of the present approach is demonstrated in a companion paper through a sample application, the recognition of noisy optical characters using soft competitive learning. Network generalization capabilities of the stochastic network maintain a squared error within 10 percent of that of a floating-point implementation for a wide range of noise levels. While the accuracy of stochastic computation may not compare favorably with more conventional binary radix-based computation, the low circuit area, power, and speed characteristics may, in certain situations, make them attractive for VLSI implementation of artificial neural networks.
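
The well-known multiplication principle the abstract alludes to can be sketched as follows. This is a minimal software simulation, not the paper's circuits: the helper names (`to_stream`, `unipolar_multiply`, `bipolar_multiply`) and the stream length are illustrative assumptions. A unipolar value p in [0, 1] is encoded as the probability of a 1 bit, and a bipolar value x in [-1, 1] as P(bit = 1) = (x + 1) / 2.

```python
import random

def to_stream(p, n, rng):
    """Encode probability p in [0, 1] as n independent random bits."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def bipolar_stream(x, n, rng):
    """Encode x in [-1, 1] using P(bit = 1) = (x + 1) / 2."""
    return to_stream((x + 1.0) / 2.0, n, rng)

def unipolar_value(bits):
    """Decode a unipolar stream as the fraction of ones."""
    return sum(bits) / len(bits)

def bipolar_value(bits):
    """Decode a bipolar stream: x = 2 * P(bit = 1) - 1."""
    return 2.0 * unipolar_value(bits) - 1.0

def unipolar_multiply(a_bits, b_bits):
    """Bitwise AND of independent unipolar streams estimates the product."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def bipolar_multiply(a_bits, b_bits):
    """Bitwise XNOR of independent bipolar streams estimates the product."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000             # longer streams give lower variance
u = unipolar_value(unipolar_multiply(to_stream(0.5, n, rng), to_stream(0.4, n, rng)))
b = bipolar_value(bipolar_multiply(bipolar_stream(-0.5, n, rng), bipolar_stream(0.6, n, rng)))
print(u, b)  # noisy estimates of 0.5 * 0.4 and -0.5 * 0.6
```

The estimates are only as good as the stream length allows, which is the accuracy-for-area trade-off the abstract describes.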

580 citations

Journal Article•10.1145/502102.502106•

Interval arithmetic: From principles to implementation

Timothy J. Hickey¹, Q. Ju¹, M. H. van Emden²•Institutions (2)

Brandeis University¹, University of Victoria²

01 Sep 2001-Journal of the ACM

TL;DR: It is shown that the IEEE standard's specification of operations involving the signed infinities, signed zeros, and the exact/inexact flag are such as to make a correct and optimal implementation more efficient.

Abstract: We start with a mathematical definition of a real interval as a closed, connected set of reals. Interval arithmetic operations (addition, subtraction, multiplication, and division) are likewise defined mathematically and we provide algorithms for computing these operations assuming exact real arithmetic. Next, we define interval arithmetic operations on intervals with IEEE 754 floating point endpoints to be sound and optimal approximations of the real interval operations and we show that the IEEE standard's specification of operations involving the signed infinities, signed zeros, and the exact/inexact flag are such as to make a correct and optimal implementation more efficient. From the resulting theorems, we derive data that are sufficiently detailed to convert directly to a program for efficiently implementing the interval operations. Finally, we extend these results to the case of general intervals, which are defined as connected sets of reals that are not necessarily closed.
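
As a rough illustration of sound interval multiplication on floating-point endpoints, the sketch below takes the minimum and maximum of the four endpoint products and widens the result outward by one unit in the last place. This is an assumption-laden simplification, not the paper's algorithm: Python does not expose the IEEE 754 rounding modes or the inexact flag, so the result is sound but not optimal, and the hypothetical `interval_mul` handles only closed intervals with finite endpoints.

```python
import math

def interval_mul(x, y):
    """Multiply closed intervals x = (xl, xu) and y = (yl, yu) with finite endpoints.

    The exact product interval is [min, max] of the four endpoint products.
    Each float product may be rounded, so both bounds are pushed outward by
    one ulp to keep the enclosure sound (at the cost of optimality).
    """
    xl, xu = x
    yl, yu = y
    products = [xl * yl, xl * yu, xu * yl, xu * yu]
    lo = math.nextafter(min(products), -math.inf)  # requires Python 3.9+
    hi = math.nextafter(max(products), math.inf)
    return lo, hi

lo, hi = interval_mul((-1.0, 2.0), (3.0, 4.0))
print(lo, hi)  # an enclosure slightly wider than [-4, 8]
```

An implementation along the lines of the abstract would instead use directed rounding and skip the widening whenever a product is exact.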

399 citations

Journal Article•10.1007/S001450010012•

Efficient Arithmetic in Finite Field Extensions with Application in Elliptic Curve Cryptography

Daniel V. Bailey¹, Christof Paar¹•Institutions (1)

Worcester Polytechnic Institute¹

01 Jan 2001-Journal of Cryptology

TL;DR: Results show that OEFs when used with the new inversion and multiplication algorithms provide a substantial performance increase over other reported methods.

Abstract: This contribution focuses on a class of Galois field used to achieve fast finite field arithmetic which we call an Optimal Extension Field (OEF), first introduced in [3]. We extend this work by presenting an adaptation of Itoh and Tsujii's algorithm for finite field inversion applied to OEFs. In particular, we use the facts that the action of the Frobenius map in GF(p^m) can be computed with only m-1 subfield multiplications and that inverses in GF(p) may be computed cheaply using known techniques. As a result, we show that one extension field inversion can be computed with a logarithmic number of extension field multiplications. In addition, we provide new extension field multiplication formulas which give a performance increase. Further, we provide an OEF construction algorithm together with tables of Type I and Type II OEFs along with statistics on the number of pseudo-Mersenne primes and OEFs. We apply this new work to provide implementation results using these methods to construct elliptic curve cryptosystems on both DEC Alpha workstations and Pentium-class PCs. These results show that OEFs when used with our new inversion and multiplication algorithms provide a substantial performance increase over other reported methods.

153 citations

Book•

Young Mathematicians at Work: Constructing Multiplication and Division

Catherine Twomey Fosnot, Maarten Dolk

1 Sep 2001

117 citations

Posted Content•

An Integer Commitment Scheme based on Groups with Hidden Order.

Ivan Damgård, Eiichiro Fujisaki

01 Jan 2001-IACR Cryptology ePrint Archive

TL;DR: This work presents a commitment scheme allowing commitment to arbitrary size integers, based on any Abelian group with certain properties, most importantly that it is hard for the committer to compute its order.

Abstract: We present a commitment scheme allowing commitment to arbitrary size integers, based on any Abelian group with certain properties, most importantly that it is hard for the committer to compute its order. Potential examples include RSA and class groups. We also give efficient zero-knowledge protocols for proving knowledge of the contents of a commitment and for verifying multiplicative relations over the integers on committed values. This means that our scheme can support, for instance, the efficient interval proofs of Boudot [1]. The scheme can be seen as a modification and a generalization of an earlier scheme of Fujisaki and Okamoto [5], and in particular our results show that we can use a much larger class of RSA moduli than the safe prime products proposed in [5]. Also, we correct some mistakes in the proofs of [5] and give what appears to be the first multiplication protocol for a Fujisaki/Okamoto-like scheme with a complete proof of soundness.

115 citations

Journal Article•10.1109/78.960425•

Integer DCTs and fast algorithms

Yonghong Zeng¹, Lizhi Cheng², Guoan Bi¹, Alex C. Kot¹•Institutions (2)

Nanyang Technological University¹, National University of Defense Technology²

01 Nov 2001-IEEE Transactions on Signal Processing

TL;DR: A two-dimensional (2-D) integer discrete cosine transform is proposed that needs only integer operations and shifts, is nonseparable, and requires far fewer operations than the corresponding row-column 2-D integer discrete cosine transform.

Abstract: A method is proposed to factor the type-II discrete cosine transform (DCT-II) into lifting steps and additions. After approximating the lifting matrices, we get a new type-II integer discrete cosine transform (IntDCT-II) that is free of floating-point multiplications. Based on the relationships among the various types of DCTs, we can generally factor any DCT into lifting steps and additions and then get four types of integer DCTs, which need no floating-point multiplications. By combining the polynomial transform and the one-dimensional (1-D) integer cosine transform, a two-dimensional (2-D) integer discrete cosine transform is proposed. The proposed transform needs only integer operations and shifts. Furthermore, it is nonseparable and requires far fewer operations than the corresponding row-column 2-D integer discrete cosine transform.
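
The idea of lifting steps can be illustrated on a single plane rotation, the basic building block of fast DCT factorizations. The sketch below is not the paper's IntDCT-II: it only shows, under stated assumptions, how a rotation by theta factors into three lifting steps whose coefficients are approximated by dyadic rationals, so that the integer map stays exactly invertible. The names `rotate_int`, `rotate_int_inverse`, and the precision `K` are hypothetical, and theta must not be a multiple of pi because the factorization divides by sin(theta).

```python
import math

K = 8  # fractional bits kept for each lifting coefficient (illustrative choice)

def quantize(c):
    """Approximate a real coefficient c by the dyadic rational quantize(c) / 2**K."""
    return round(c * (1 << K))

def lift(coef_q, v):
    """Integer approximation of (coef_q / 2**K) * v: one integer multiply and a shift."""
    return (coef_q * v + (1 << (K - 1))) >> K

def lifting_coefficients(theta):
    """Rotation = [[1, p], [0, 1]] @ [[1, 0], [u, 1]] @ [[1, p], [0, 1]]."""
    p = quantize((math.cos(theta) - 1.0) / math.sin(theta))
    u = quantize(math.sin(theta))
    return p, u

def rotate_int(x, y, theta):
    """Integer-to-integer approximation of rotating (x, y) by theta."""
    p, u = lifting_coefficients(theta)
    x += lift(p, y)
    y += lift(u, x)
    x += lift(p, y)
    return x, y

def rotate_int_inverse(x, y, theta):
    """Undo rotate_int exactly by reversing the lifting steps."""
    p, u = lifting_coefficients(theta)
    x -= lift(p, y)
    y -= lift(u, x)
    x -= lift(p, y)
    return x, y

fx, fy = rotate_int(100, 50, math.pi / 4)
print(fx, fy, rotate_int_inverse(fx, fy, math.pi / 4))
```

Each step adds a function of the untouched coordinate, so rounding never destroys invertibility. Replacing the integer multiply in `lift` with shifts and adds is a further step this sketch leaves out.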

92 citations

Algorithmic Manipulations and Transformations of Univariate Holonomic Functions and Sequences

Christian Mallinger

1 Jan 2001

TL;DR: A package that contains procedures for automatic manipulations and transformations of univariate holonomic functions and sequences within the computer algebra system Mathematica is implemented and some different techniques for proving holonomic identities are described.

Abstract: Holonomic functions and sequences have the property that they can be represented by a finite amount of information. Moreover, these holonomic objects are closed under elementary operations like, for instance, addition or (termwise and Cauchy) multiplication. These (and other) operations can also be performed “algorithmically”. As a consequence, we can prove any identity of holonomic functions or sequences automatically. Based on this theory, the author implemented a package that contains procedures for automatic manipulations and transformations of univariate holonomic functions and sequences within the computer algebra system Mathematica. This package is introduced in detail. In addition, we describe some different techniques for proving holonomic identities.

90 citations

Patent•

Parallel counter and a multiplication logic circuit

Dmitriy Rumynin, Sunil Talwar, Peter Meulemans

25 Jan 2001

TL;DR: In this paper, a parallel counter comprises logic for generating output bits as symmetrical functions of the input bits and a multiplication circuit is also provided in which an array of combinations of each bit of a binary number with each other bit of another binary number is generated having a reduced form.

Abstract: A parallel counter comprises logic for generating output bits as symmetrical functions of the input bits. The parallel counter can be used in a multiplication circuit. A multiplication circuit is also provided in which an array of combinations of each bit of a binary number with each other bit of another binary number is generated having a reduced form in order to reduce the steps required in array reduction.

70 citations

Journal Article•

Secure distributed linear algebra in a constant number of rounds

Ronald Cramer¹, Ivan Damgård¹•Institutions (1)

National Research Foundation of South Africa¹

01 Jan 2001-Lecture Notes in Computer Science

TL;DR: In this article, the authors consider a network of processors among which elements in a finite field K can be verifiably shared in a constant number of rounds, and show how the network can securely, efficiently and in constant-round compute determinant, characteristic polynomial, rank, and the solution space of linear systems of equations.

Abstract: Consider a network of processors among which elements in a finite field K can be verifiably shared in a constant number of rounds. Assume furthermore constant-round protocols are available for generating random shared values, for secure multiplication and for addition of shared values. These requirements can be met by known techniques in all standard models of communication. In this model we construct protocols allowing the network to securely solve standard computational problems in linear algebra. In particular, we show how the network can securely, efficiently and in constant-round compute determinant, characteristic polynomial, rank, and the solution space of linear systems of equations. Constant round solutions follow for all problems which can be solved by direct application of such linear algebraic methods, such as deciding whether a graph contains a perfect match. If the basic protocols (for shared random values, addition and multiplication) we start from are unconditionally secure, then so are our protocols. Our results offer solutions that are significantly more efficient than previous techniques for secure linear algebra, they work for arbitrary fields and therefore extend the class of functions previously known to be computable in constant round and with unconditional security. In particular, we obtain an unconditionally secure protocol for computing a function f in constant round, where the protocol has complexity polynomial in the span program size of f over an arbitrary finite field.

70 citations

Patent•

Multiplier unit in reconfigurable chip

Gary Lai, Joshua Lindner

18 Sep 2001

TL;DR: In this paper, a multiplication block for a reconfigurable chip includes multiple multiplication units and a group of the selectable adder units operably interconnectable with the multiplication units.

Abstract: A multiplication block for a reconfigurable chip includes multiple multiplication units and a group of the selectable adder units operably interconnectable with the multiplication units. The adder units can be selectively connected for different configurations. The multiplication block is preferably controlled by an instruction which can put the multiplication block into different configurations.

68 citations

Journal Article•10.3758/BF03196397•

The representations of the arithmetic operations include functional relationships

James A. Dixon¹, Julie K. Deets², Ashley S. Bangert¹•Institutions (2)

College of William & Mary¹, Trinity University²

01 Apr 2001-Memory & Cognition

TL;DR: Testing whether people represent analogous principles for each arithmetic operation showed that operations with longer developmental histories had strong principle representations, and the implications for a structure-mapping approach to mathematical problem solving are discussed.

Abstract: Current theories of mathematical problem solving propose that people select a mathematical operation as the solution to a problem on the basis of a structure mapping between their problem representation and the representation of the mathematical operations. The structure-mapping hypothesis requires that the problem and the mathematical representations contain analogous relations. Past research has demonstrated that the problem representation consists of functional relationships, or principles. The present study tested whether people represent analogous principles for each arithmetic operation (i.e., addition, subtraction, multiplication, and division). For each operation, college (Experiments 1 and 2) and 8th grade (Experiment 2) participants were asked to rate the degree to which a set of completed problems was a good attempt at the operation. The pattern of presented answers either violated one of four principles or did not violate any principles. The distance of the presented answers from the correct answers was independently manipulated. Consistent with the hypothesis that people represent the principles, (1) violations of the principles were rated as poorer attempts at the operation, (2) operations that are learned first (e.g., addition) had more extensive principle representations than did operations learned later (multiplication), and (3) principles that are more frequently in evidence developed more quickly. In Experiment 3, college participants rated the degree to which statements were indicative of each operation. The statements were either consistent or inconsistent with one of two principles. The participants’ ratings showed that operations with longer developmental histories had strong principle representations. The implications for a structure-mapping approach to mathematical problem solving are discussed.

Journal Article•10.1016/S0167-8191(01)00073-4•

Towards a fast parallel sparse symmetric matrix-vector multiplication

Roman Geus, Stefan Röllin

31 May 2001

TL;DR: This paper analyzes the performance of the sparse matrix–vector product with symmetric matrices originating from the FEM and describes techniques that lead to a fast implementation.

Abstract: The sparse matrix–vector product is an important computational kernel that runs ineffectively on many computers with super-scalar RISC processors. In this paper we analyse the performance of the sparse matrix–vector product with symmetric matrices originating from the FEM and describe techniques that lead to a fast implementation. It is shown how these optimisations can be incorporated into an efficient parallel implementation using message-passing. We conduct numerical experiments on many different machines and show that our optimisations speed up the sparse matrix–vector multiplication substantially.
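
A minimal serial sketch of the symmetric kernel, assuming a compressed sparse row (CSR) layout that stores only the lower triangle including the diagonal. The storage format, the function name `sym_csr_matvec`, and the small test matrix are illustrative assumptions; the paper's optimisations and its message-passing parallelisation are not reproduced here.

```python
def sym_csr_matvec(n, row_ptr, col_idx, values, x):
    """Compute y = A @ x for a symmetric n x n matrix A.

    Only entries a_ij with j <= i are stored. Each off-diagonal entry is
    used twice, once for row i and once for the mirrored row j, so the
    matrix data is read from memory only once.
    """
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            a = values[k]
            y[i] += a * x[j]
            if j != i:
                y[j] += a * x[i]  # contribution of the mirrored entry a_ji
    return y

# A = [[4, 1, 0],
#      [1, 3, 2],
#      [0, 2, 5]]  stored as its lower triangle
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
y = sym_csr_matvec(3, row_ptr, col_idx, values, [1.0, 2.0, 3.0])
print(y)  # [6.0, 13.0, 19.0]
```

The scattered update `y[j] += ...` is what makes a parallel version harder than the nonsymmetric case, since rows owned by different processes can write to the same output entry.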

Multiplication by an Integer Constant

Vincent Lefèvre

1 Jan 2001

TL;DR: This work presents and compares various algorithms, including a new one, for performing multiplications by integer constants using elementary operations. Such algorithms are useful in several problems, such as the Toom-Cook-like algorithms to multiply large multiple-precision integers.

Abstract: We present and compare various algorithms, including a new one, for performing multiplications by integer constants using elementary operations. Such algorithms are useful, as they occur in several problems, such as the Toom-Cook-like algorithms to multiply large multiple-precision integers, the approximate computation of consecutive values of a polynomial, and the generation of integer multiplications by compilers.
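
For orientation, the simplest baseline for this problem is the binary method: one shift and one addition per set bit of the constant. The sketch below shows only that baseline, with hypothetical helper names; it is not the new algorithm the abstract refers to, which aims to use fewer operations.

```python
def shift_add_plan(c):
    """Return the shift amounts s with c == sum(1 << s) for a constant c > 0."""
    if c <= 0:
        raise ValueError("this sketch handles positive constants only")
    return [s for s in range(c.bit_length()) if (c >> s) & 1]

def multiply_by_constant(x, c):
    """Compute x * c using only shifts and additions (binary method)."""
    return sum(x << s for s in shift_add_plan(c))

print(shift_add_plan(10))            # [1, 3], since 10 = 2 + 8
print(multiply_by_constant(7, 113))  # 791
```

Allowing subtractions as well (signed digits) or reusing intermediate results can shorten the operation sequence, which is where the more elaborate algorithms differ.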

Book Chapter•10.1007/3-540-44586-2_27•

Efficient Implementation of Elliptic Curve Cryptosystems on the TI MSP 430x33x Family of Microcontrollers

Jorge Guajardo¹, Rainer Blümel, Uwe Krieger, Christof Paar¹•Institutions (1)

Worcester Polytechnic Institute¹

13 Feb 2001

TL;DR: This contribution describes a methodology used to efficiently implement elliptic curves (EC) over GF(p) on the 16-bit TI MSP430x33x family of low-cost microcontrollers.

Abstract: This contribution describes a methodology used to efficiently implement elliptic curves (EC) over GF(p) on the 16-bit TI MSP430x33x family of low-cost microcontrollers. We show that it is possible to implement EC cryptosystems in highly constrained embedded systems and still obtain acceptable performance at low cost. We modified the EC point addition and doubling formulae to reduce the number of intermediate variables while at the same time allowing for flexibility. We used a Generalized-Mersenne prime to implement the arithmetic in the underlying field. We take advantage of the special form of the moduli to minimize the number of precomputations needed to implement inversion via Fermat's Little theorem and the k-ary method of exponentiation. We apply these ideas to an implementation of an elliptic curve system over GF(p), where p = 2^128 - 2^97 - 1. We show that a scalar point multiplication can be achieved in 3.4 seconds without any stored/precomputed values and the processor clocked at 1 MHz.
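
Inversion via Fermat's Little theorem rests on the identity a^(p-2) = a^(-1) (mod p) for prime p and a not divisible by p. The sketch below checks that identity for the prime named in the abstract, using Python's built-in three-argument `pow` in place of the k-ary exponentiation and the special-form reduction described there; the function name `inv_mod_p` is an assumption.

```python
P = 2**128 - 2**97 - 1  # the generalized Mersenne prime from the abstract

def inv_mod_p(a):
    """Return the multiplicative inverse of a modulo P via a^(P-2) mod P."""
    a %= P
    if a == 0:
        raise ZeroDivisionError("0 has no inverse modulo P")
    return pow(a, P - 2, P)

a = 123456789
a_inv = inv_mod_p(a)
print((a * a_inv) % P)  # 1
```

On a constrained microcontroller the cost is dominated by the modular multiplications inside the exponentiation, which is why the choice of exponentiation method and modulus form matters.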

Journal Article•10.1109/4.953482•

A carry-free 54b×54b multiplier using equivalent bit conversion algorithm

Yun Kim¹, Bang-Sup Song, J. Grosspietsch, S.F. Gillig•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Oct 2001-IEEE Journal of Solid-state Circuits

TL;DR: An equivalent bit conversion algorithm (EBCA) is proposed to eliminate the need for final carry propagation in the redundant binary to normal binary (NB) conversion step for RB multiplication, and the entire multiplication process can be made free of carry propagation from input to output.

Abstract: An equivalent bit conversion algorithm (EBCA) is proposed to eliminate the need for final carry propagation in the redundant binary (RB) to normal binary (NB) conversion step for RB multiplication. The multiplication process helps with the carry-free conversion step by eliminating certain combinations of RB product. When the EBCA is applied, conventional power-consuming carry-propagating adders are replaced by simple, minimum-sized carry-free converters, and the entire multiplication process can be made free of carry propagation from input to output. The method employed in this work reduces 40% of the total power and 30% of the total multiplication time in the final adder stage of traditional multipliers. The prototype fabricated in 0.35-μm CMOS demonstrates that the 54 b × 54 b multiplier consumes only 53.4 mW at 3.3 V for 74-MHz operation.

Journal Article•10.1023/A:1011470131086•

Multiplication Distributivity of Proper and Improper Intervals

Evgenija D. Popova¹•Institutions (1)

Bulgarian Academy of Sciences¹

01 Apr 2001-Reliable Computing

TL;DR: This paper summarizes and presents all distributive relations, known by now, on multiplication and addition of generalized (proper and improper) intervals.

Abstract: The arithmetic on an extended set of proper and improper intervals presents algebraic completion of the conventional interval arithmetic allowing thus efficient solution of some interval algebraic problems. In this paper we summarize and present all distributive relations, known by now, on multiplication and addition of generalized (proper and improper) intervals.

Book Chapter•10.1007/978-1-4613-0257-5_6•

Experiments and Discoveries in q-Trigonometry

R. Wm. Gosper

1 Jan 2001

TL;DR: The authors introduced a q-generalization of sine and cosine functions, related to the ϑ functions, but possessing addition and multiplication formulas more analogous to those of ordinary sin and cos.

Abstract: We introduce a q-generalization of the sine and cosine functions, related to the ϑ functions, but (as revealed by computer experiments) possessing addition and multiplication formulas more analogous to those of ordinary sin and cos. These formulas then contribute identities to ϑ theory, and hint of a more natural formulation of ϑ functions as outgrowths of elementary functions. Nevertheless, this paper can be read without knowledge of ϑ functions—it was certainly written that way.

Journal Article•

Accelerating matrix product on reconfigurable hardware for signal processing

Abbes Amira, Ahmed Bouridane, Peter Milligan

01 Jan 2001-Lecture Notes in Computer Science

TL;DR: In this article, the authors investigated how some of the new features of the Xilinx Virtex FPGA may be used to support efficient and optimised implementation of matrix product based on Multiply and Accumulate (MAC) operations, which are frequently used in signal applications.

Abstract: This paper investigates how some of the new features of the Xilinx Virtex FPGA may be used to support efficient and optimised implementation of matrix product based on Multiply and Accumulate (MAC) operations, which are frequently used in signal applications. The principal new features that have been investigated are the Block RAM and the fully digital Delay-Locked Loop (DLL). The approach used for the matrix multiplication algorithm employs the idea used in the modified Booth encoder multiplication using Wallace Trees addition. Preliminary performance results and comparisons with similar algorithms implemented on multi-FPGA platforms have shown better performance for the proposed architecture.

Patent•

Method and apparatus for rounding in a multiplier arithmetic

Stuart F. Oberman¹, Norbert Juffa¹, Ming Siu¹, Frederick D. Weber¹, Ravikrishna Cherukuri¹•Institutions (1)

Advanced Micro Devices¹

12 Feb 2001

TL;DR: In this article, a multiplier capable of performing signed and unsigned scalar and vector multiplication is disclosed, where the multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form.

Abstract: A multiplier capable of performing signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. An effective sign for the multiplier and multiplicand operands may be calculated and used to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, they may be summed and the results may be output. The results may be signed or unsigned, and may represent vector or scalar quantities. When a vector multiplication is performed, the multiplier may be configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components. The multiplier may also be configured to sum the products of the vector components to form the vector dot product. The final product may be output in segments so as to require fewer bus lines. The segments may be rounded by adding a rounding constant. Rounding and normalization may be performed in two paths, one assuming an overflow will occur, the other assuming no overflow will occur. The multiplier may also be configured to perform iterative calculations to evaluate constant powers of an operand. Intermediate products that are formed may be rounded and normalized in two paths and then compressed and stored for use in the next iteration. An adjustment constant may also be added to increase the frequency of exactly rounded results.

Book Chapter•10.1007/3-540-45600-7_21•

Efficient Software Implementation for Finite Field Multiplication in Normal Basis

Peng Ning¹, Yiqun Lisa Yin•Institutions (1)

North Carolina State University¹

13 Nov 2001

TL;DR: New techniques for efficient software implementation of binary field multiplication in normal basis are presented, which are more efficient in terms of both speed and memory compared with alternative approaches.

Abstract: Finite field arithmetic is becoming increasingly important in today's computer systems, particularly for implementing cryptographic operations. Among various arithmetic operations, finite field multiplication is of particular interest since it is a major building block for elliptic curve cryptosystems. In this paper, we present new techniques for efficient software implementation of binary field multiplication in normal basis. Our techniques are more efficient in terms of both speed and memory compared with alternative approaches.

Book Chapter•10.1007/3-540-44586-2_26•

Compact Encoding of Non-adjacent Forms with Applications to Elliptic Curve Cryptography

Marc Joye, Christophe Tymen

13 Feb 2001

TL;DR: A compact encoding of non-adjacent representations that allows skipping the exponent recoding step is introduced, and a straightforward technique for picking random numbers that already satisfy the non-adjacence property is proposed.

Abstract: Techniques for fast exponentiation (multiplication) in various groups have been extensively studied for use in cryptographic primitives. Specifically, the coding of the exponent (multiplier) plays an important role in the performance of the algorithms used. The crucial optimization relies in general on minimizing the Hamming weight of the exponent (multiplier). This can be performed optimally with non-adjacent representations. This paper introduces a compact encoding of non-adjacent representations that allows skipping the exponent recoding step. Furthermore, a straightforward technique for picking random numbers that already satisfy the non-adjacence property is proposed. Several examples of application are given, in particular in the context of scalar multiplication on elliptic curves.
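
The recoding step that the paper seeks to avoid can be sketched as the standard conversion of a non-negative integer to its non-adjacent form (NAF), a signed-digit expansion with digits in {-1, 0, 1} and no two adjacent nonzero digits. The function below is the textbook recoding under a hypothetical name `naf`; it is not the compact encoding the paper introduces.

```python
def naf(k):
    """Return the non-adjacent form of k >= 0, least significant digit first."""
    digits = []
    while k > 0:
        if k & 1:
            d = 2 - (k & 3)  # +1 if k = 1 (mod 4), -1 if k = 3 (mod 4)
            k -= d           # makes k divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits

print(naf(7))  # [-1, 0, 0, 1], i.e. 7 = 8 - 1
```

Here 7 has binary Hamming weight 3 but NAF weight 2, so a double-and-add scalar multiplication that also allows point subtraction needs one fewer group operation.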

Book Chapter•10.1007/3-540-44687-7_11•

Accelerating Matrix Product on Reconfigurable Hardware for Signal Processing

Abbes Amira¹, Ahmed Bouridane¹, Peter Milligan¹•Institutions (1)

Queen's University Belfast¹

27 Aug 2001

TL;DR: This paper investigates how some of the new features of the Xilinx Virtex FPGA may be used to support efficient and optimised implementation of matrix product based on Multiply and Accumulate operations, which are frequently used in signal applications.

Abstract: This paper investigates how some of the new features of the Xilinx Virtex FPGA may be used to support efficient and optimised implementation of matrix product based on Multiply and Accumulate (MAC) operations, which are frequently used in signal applications. The principal new features that have been investigated are the Block RAM and the fully digital Delay-Locked Loop (DLL). The approach used for the matrix multiplication algorithm employs the idea used in the modified Booth encoder multiplication using Wallace Trees addition. Preliminary performance results and comparisons with similar algorithms implemented on multi-FPGA platforms have shown better performance for the proposed architecture.

Patent•

Methods and apparatus for efficient complex long multiplication and covariance matrix implementation

Gerald George Pechanek¹, Ricardo Rodriguez¹, Matthew K Plonski¹, David Carl Strube¹, Kevin Coopman¹•Institutions (1)

Altera¹

1 Nov 2001

TL;DR: In this paper, a parallel array VLIW digital signal processor is employed along with specialized complex long multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation.

Abstract: Efficient computation of complex long multiplication results and an efficient calculation of a covariance matrix are described. A parallel array VLIW digital signal processor is employed along with specialized complex long multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs may be used allowing the complex multiplication pipeline hardware to be efficiently used.

The Multiplication of Modernity

Johann P. Arnason

1 Jan 2001

Fast Normal Basis Multiplication Using General Purpose Processors (Extended Abstract)

Arash Reyhani-Masoleh, M. Anwar Hasan

1 Jan 2001

TL;DR: In this paper, a vector-level algorithm for normal basis multiplication over the extended binary field GF(2^m) is presented. But the vector-level algorithm does not address the problem of normal basis multiplication in hardware implementation.

Abstract: For cryptographic applications, normal bases have received considerable attention, especially for hardware implementation. In this article, we consider fast software algorithms for normal basis multiplication over the extended binary field GF(2^m). We present a vector-level algorithm which essentially eliminates the bit-wise inner products needed in the conventional approach to the normal basis multiplication. We then present another algorithm which significantly reduces the dynamic instruction counts. Both algorithms utilize the full width of the data-path of the general purpose processor on which the software is to be executed. We also consider composite fields and present an algorithm which can provide further speed-up and an added flexibility toward hardware-software co-design of processors for very large finite fields.

Proceedings Article•10.1109/ARITH.2001.930100•

Binary multiplication radix-32 and radix-256

P.-M. Seidel¹, Lee D. McFearin¹, David W. Matula¹•Institutions (1)

Southern Methodist University¹

11 Jun 2001

TL;DR: Novel encoding schemes for the implementation of higher radix multiplication are proposed, which provide more flexible multiplier designs that can be implemented in shorter pipeline stages; the proposed designs are compared with multipliers that use traditional Booth recoding.

Abstract: Multipliers are used at many different places in microprocessor design. As the non-memory sub-blocks of the microprocessor with the largest size and delay, multipliers have a big impact on the cycle time of the microprocessor. Targeting deeper pipelines and higher clock frequencies, there is a growing demand for multiplier designs that can be split into shorter stages. For this purpose, the use of Booth recoding has been a popular method to cut down the number of partial products in a multiplier to reduce the delay of the partial product accumulation and to simplify the partition of the multiplier into several shorter stages. The complexity to pre-compute an increasing number of digit multiples of the multiplicand within the multiplier unit limits the use of Booth recoding mainly to radices 4 and 8. We propose novel encoding schemes for the implementation of higher radix multiplication. In particular we consider multiplication radix-32 and radix-256. The features provide more flexible multiplier designs that can be implemented in shorter pipeline stages. We compare the proposed designs with multipliers that use traditional Booth recoding.
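
For readers unfamiliar with the baseline mentioned above, the following sketch shows traditional radix-4 Booth recoding, which halves the number of partial products by recoding the multiplier into digits from {-2, -1, 0, 1, 2}. It does not reproduce the radix-32 or radix-256 encodings proposed in the paper; the function names are hypothetical, and the multiplier `y` is assumed to fit in `n_bits` as a two's-complement value.

```python
# Minimal sketch: classic radix-4 Booth recoding (the traditional baseline).

def booth_radix4_digits(y, n_bits):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Digit i has weight 4**i and lies in {-2, -1, 0, 1, 2}.
    """
    if n_bits % 2:
        n_bits += 1  # sign-extend by one bit so the bits pair up evenly
    u = y & ((1 << n_bits) - 1)  # two's-complement bit pattern of y
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, n_bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_multiply(x, y, n_bits):
    """Sum one partial product per Booth digit (about n_bits / 2 of them)."""
    digits = booth_radix4_digits(y, n_bits)
    return sum((d * x) << (2 * i) for i, d in enumerate(digits))
```

Every digit multiple here is a shift or a negation of the multiplicand. At higher radices, odd multiples such as 3x must be pre-computed, which is the cost the abstract cites as limiting Booth recoding mainly to radices 4 and 8.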

Proceedings Article•10.1109/ISCAS.2001.922295•

A flexible multiplication unit for an FPGA logic block

K. Rajagopalan¹, Peter Sutton•Institutions (1)

University of Queensland¹

6 May 2001

TL;DR: A flexible multiplication unit and configurable carry logic circuitry suitable for incorporation into an FPGA logic block are proposed, based on a modified carry-save adder that efficiently supports multiplication, addition and multiply-accumulate operations in serial or parallel form.

Abstract: FPGAs are increasingly being applied to DSP applications but are often inefficient in space and time compared with dedicated DSP chips, particularly for multiplication-based operations. To improve FPGA arithmetic performance, a flexible multiplication unit and configurable carry logic circuitry suitable for incorporation into an FPGA logic block are proposed. The multiplier unit is based on a modified carry-save adder and, along with the carry logic circuitry, efficiently supports multiplication, addition and multiply-accumulate operations in serial or parallel form. Preliminary results indicate logic utilization for a multiplier implementation in such an FPGA is approximately a third that of the XC4000 architecture and half that of the Virtex architecture. Propagation delays are also reduced due to the use of dedicated inter-block interconnect for all sum and carry signals and flexible routing multiplexers.
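
The building block named above can be illustrated with a plain (unmodified) carry-save adder. The sketch below is a bit-parallel software model under the assumption of unsigned operands, not the modified circuit proposed in the paper; `carry_save_add` and `csa_multiply` are hypothetical names.

```python
# Minimal sketch: carry-save addition and a multiplier built on it.
# Assumption: unsigned (non-negative) integer operands.

def carry_save_add(a, b, c):
    """Reduce three operands to a (sum, carry) pair without carry propagation.

    Invariant: sum + carry == a + b + c.
    """
    s = a ^ b ^ c                                # per-bit sum
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, shifted up
    return s, carry

def csa_multiply(x, y):
    """Accumulate shifted partial products in carry-save form."""
    s, c = 0, 0
    i = 0
    while y >> i:
        if (y >> i) & 1:
            s, c = carry_save_add(s, c, x << i)
        i += 1
    return s + c  # one final carry-propagate addition
```

Because each accumulation step avoids carry propagation, only the final addition has a long carry chain. This is why the same structure can serve multiplication, addition and multiply-accumulate.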

Book Chapter•10.1007/3-540-44693-1_49•

New Bounds on the OBDD-Size of Integer Multiplication via Universal Hashing

Philipp Woelfel

15 Feb 2001

TL;DR: In this paper, a stronger bound of 2^{n/2}/61 was proven by a new technique, using a recently found universal family of hash functions, and a first non-trivial upper bound of (7/3) · 2^{4n/3} for the OBDD size of MUL_{n-1,n} was provided.

Abstract: Ordered binary decision diagrams (OBDDs) nowadays belong to the most common representation types for Boolean functions. Although they allow important operations such as satisfiability test and equality test to be performed efficiently, their limitation lies in the fact that they may require exponential size for important functions. Bryant [8] has shown that any OBDD-representation of the function MUL_{n-1,n}, which computes the middle bit of the product of two n-bit numbers, requires at least 2^{n/8} nodes. In this paper a stronger bound of 2^{n/2}/61 is proven by a new technique, using a recently found universal family of hash functions [23]. As a result, one cannot hope anymore to find reasonable small OBDDs even for the multiplication of relatively short integers, since for only a 64-bit multiplication millions of nodes are required. Further, a first non-trivial upper bound of (7/3) · 2^{4n/3} for the OBDD size of MUL_{n-1,n} is provided.
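
A quick numeric check makes the "millions of nodes" remark tangible. The sketch below reads the bounds quoted in the abstract as 2^{n/8} (Bryant), 2^{n/2}/61 (the new lower bound) and (7/3) · 2^{4n/3} (the upper bound), and evaluates them only as stated there; any lower-order terms in the full theorems are not modeled, and the function names are hypothetical.

```python
# Minimal sketch: evaluate the OBDD size bounds for the middle bit of
# n-bit multiplication, using the expressions as quoted in the abstract.

def bryant_lower_bound(n):
    """Earlier lower bound: 2^(n/8) nodes."""
    return 2.0 ** (n / 8)

def woelfel_lower_bound(n):
    """Improved lower bound: 2^(n/2) / 61 nodes."""
    return 2.0 ** (n / 2) / 61

def woelfel_upper_bound(n):
    """Upper bound: (7/3) * 2^(4n/3) nodes."""
    return (7 / 3) * 2.0 ** (4 * n / 3)

if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, bryant_lower_bound(n), woelfel_lower_bound(n),
              woelfel_upper_bound(n))
```

For n = 64 the earlier bound gives only 256 nodes, while the improved bound gives roughly 7 × 10^7, which is consistent with the abstract's claim.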

Book Chapter•10.1007/3-540-35767-X_17•

Strength reduction of integer division and modulo operations

Jeffrey Sheldon¹, Walter Lee¹, Ben Greenwald¹, Saman Amarasinghe¹•Institutions (1)

Massachusetts Institute of Technology¹

1 Aug 2001

TL;DR: A suite of optimizations for eliminating division, modulo, and remainder operations from programs is described, analogous to strength reduction techniques used for multiplications.

Abstract: Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express complex data accesses such as the wrap-around behavior in queues using ring buffers. In addition, they appear frequently in address computations as a result of compiler optimizations that improve data locality, perform data distribution, or enable parallelization. Experienced application programmers, however, avoid them because they are slow. Furthermore, while advances in both hardware and software have improved the performance of many parts of a program, few are applicable to division and modulo operations. This trend makes these operations increasingly detrimental to program performance. This paper describes a suite of optimizations for eliminating division, modulo, and remainder operations from programs. These techniques are analogous to strength reduction techniques used for multiplications. In addition to some algebraic simplifications, we present a set of optimization techniques that eliminates division and modulo operations that are functions of loop induction variables and loop constants. The optimizations rely on algebra, integer programming, and loop transformations.
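
The idea for loop induction variables can be sketched by hand. The example below is an illustration of the general principle, not one of the paper's actual compiler transformations: it replaces a per-iteration division and modulo by a loop constant with two running counters. The function names are hypothetical, and the divisor `d` is assumed to be a positive integer.

```python
# Minimal sketch: strength-reducing i // d and i % d inside a counted loop.
# Assumption: d is a positive loop constant and i runs 0, 1, ..., n - 1.

def divmod_naive(n, d):
    """Baseline: one division and one modulo per iteration."""
    return [(i // d, i % d) for i in range(n)]

def divmod_reduced(n, d):
    """Same results using only increments and a comparison."""
    q = 0  # running value of i // d
    r = 0  # running value of i % d
    out = []
    for _ in range(n):
        out.append((q, r))
        r += 1
        if r == d:  # wrap-around, as in a ring buffer index
            r = 0
            q += 1
    return out
```

The rewritten loop trades each division for an increment and a compare, which mirrors the way classic strength reduction trades a multiplication by the induction variable for a repeated addition.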

Book Chapter•10.1007/978-94-010-0834-1_10•

Frobenius manifolds and variance of the spectral numbers

Claus Hertling¹•Institutions (1)

University of Bonn¹

1 Jan 2001

TL;DR: A Frobenius manifold is a complex manifold with a multiplication and a metric on the holomorphic tangent bundle and two distinguished vector fields satisfying a series of natural conditions.

Abstract: A Frobenius manifold is a complex manifold with a multiplication and a metric on the holomorphic tangent bundle and two distinguished vector fields, satisfying a series of natural conditions.
