TL;DR: This work presents division-free formulae that multiply two 5-term polynomials with 13 scalar multiplications, two 6-term polynomials with 17 scalar multiplications, and two 7-term polynomials with 22 scalar multiplications, and describes their application to elliptic curve arithmetic over binary fields.
Abstract: The Karatsuba-Ofman algorithm starts with a way to multiply two 2-term (i.e., linear) polynomials using three scalar multiplications. There is also a way to multiply two 3-term (i.e., quadratic) polynomials using six scalar multiplications. These are used within recursive constructions to multiply two higher-degree polynomials in subquadratic time. We present division-free formulae, which multiply two 5-term polynomials with 13 scalar multiplications, two 6-term polynomials with 17 scalar multiplications, and two 7-term polynomials with 22 scalar multiplications. These formulae may be mixed with the 2-term and 3-term formulae within recursive constructions, leading to improved bounds for many other degrees. Using only the 6-term formula leads to better asymptotic performance than standard Karatsuba. The new formulae work in any characteristic, but simplify in characteristic 2. We describe their application to elliptic curve arithmetic over binary fields. We include some timing data.
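The 2-term and 3-term base cases mentioned in the abstract can be sketched as follows. This is a minimal illustration of the well-known three-multiplication and six-multiplication formulae, not the paper's new 5-, 6-, or 7-term formulae; the function names `mul2`, `mul3`, and `schoolbook` are hypothetical, and coefficients are listed lowest degree first.

```python
def schoolbook(a, b):
    """Reference product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def mul2(a, b):
    """Multiply two 2-term polynomials with three scalar multiplications."""
    a0, a1 = a
    b0, b1 = b
    low = a0 * b0
    high = a1 * b1
    # (a0 + a1)(b0 + b1) - a0*b0 - a1*b1 = a0*b1 + a1*b0
    mid = (a0 + a1) * (b0 + b1) - low - high
    return [low, mid, high]


def mul3(a, b):
    """Multiply two 3-term polynomials with six scalar multiplications."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    d0, d1, d2 = a0 * b0, a1 * b1, a2 * b2
    d01 = (a0 + a1) * (b0 + b1)
    d02 = (a0 + a2) * (b0 + b2)
    d12 = (a1 + a2) * (b1 + b2)
    return [
        d0,
        d01 - d0 - d1,        # a0*b1 + a1*b0
        d02 - d0 - d2 + d1,   # a0*b2 + a1*b1 + a2*b0
        d12 - d1 - d2,        # a1*b2 + a2*b1
        d2,
    ]
```

Both formulae use only additions, subtractions, and multiplications, so they are division-free; in characteristic 2 the subtractions become additions.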
TL;DR: A bit-parallel structure for a low-complexity multiplier in Galois fields is introduced, and a complete set of primitive field polynomials for composite fields is provided that perform modulo reduction with low complexity.
Abstract: A bit-parallel structure for a multiplier with low complexity in Galois fields is introduced. The multiplier operates over composite fields GF((2^n)^m), with k = nm. The Karatsuba-Ofman algorithm (A. Karatsuba and Y. Ofman, 1963) is investigated and applied to the multiplication of polynomials over GF(2^n). It is shown that this operation has a complexity of order O(k^(log_2 3)) under certain constraints regarding k. A complete set of primitive field polynomials for composite fields is provided which perform modulo reduction with low complexity. As a result, multipliers for fields GF(2^k) up to k = 32 with low gate counts and low delays are listed. The architectures are highly modular and thus well suited for VLSI implementation.
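As a rough software analogue of Karatsuba multiplication followed by modular reduction in a binary field, the sketch below stores a polynomial over GF(2) as the bits of a Python integer. It is not the paper's bit-parallel hardware architecture and does not use composite fields; the function names are hypothetical, and the modulus x^8 + x^4 + x^3 + x + 1 (0x11B, the AES field polynomial) is an assumption chosen only for illustration.

```python
def clmul_schoolbook(a, b):
    """Carry-less (GF(2)[x]) product by shift-and-XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a, b, bits):
    """Carry-less product of two polynomials of at most `bits` coefficients."""
    if bits <= 4:
        return clmul_schoolbook(a, b)
    half = (bits + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = clmul_karatsuba(a_lo, b_lo, half)
    hi = clmul_karatsuba(a_hi, b_hi, half)
    # In characteristic 2, addition and subtraction are both XOR.
    mid = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, half) ^ lo ^ hi
    return lo ^ (mid << half) ^ (hi << (2 * half))


def reduce_mod(p, modulus):
    """Reduce polynomial p modulo the field polynomial `modulus`."""
    deg = modulus.bit_length() - 1
    while p.bit_length() > deg:
        p ^= modulus << (p.bit_length() - 1 - deg)
    return p


def gf_mul(a, b, modulus=0x11B, bits=8):
    """Multiply two elements of GF(2^bits) defined by `modulus`."""
    return reduce_mod(clmul_karatsuba(a, b, bits), modulus)
```

For example, `gf_mul(0x57, 0x83)` evaluates to `0xC1`, which matches the worked multiplication example in the AES specification.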
TL;DR: In this paper, the authors generalize the classical Karatsuba Algorithm (KA) for polynomial multiplication to polynomials of arbitrary degree and recursive use, and provide detailed information on how to use the KA with least cost.
Abstract: In this work we generalize the classical Karatsuba Algorithm (KA) for polynomial multiplication to (i) polynomials of arbitrary degree and (ii) recursive use. We determine exact complexity expressions for the KA and focus on how to use it with the least number of operations. We develop a rule for the optimum order of steps if the KA is used recursively. We show how the usage of dummy coefficients may improve performance. Finally we provide detailed information on how to use the KA with least cost, and also provide tables that describe the best possible usage of the KA for polynomials up to a degree of 127. Our results are especially useful for efficient implementations of cryptographic and coding schemes over fixed-size fields like GF(p).
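The recursive use of the KA on polynomials with an arbitrary number of coefficients can be illustrated with a simple split-in-half recursion that also counts scalar multiplications. This is only a sketch under stated assumptions: it uses a fixed ceiling split and is not claimed to reproduce the paper's optimum order of steps or its tables; the names `poly_add` and `karatsuba` and the one-element `counter` list are hypothetical.

```python
def poly_add(a, b):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(a), len(b))
    return [
        (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        for i in range(n)
    ]


def karatsuba(a, b, counter):
    """Multiply two coefficient lists of equal length n >= 1.

    Coefficients are lowest degree first. counter[0] is incremented
    once per scalar multiplication.
    """
    n = len(a)
    if n == 1:
        counter[0] += 1
        return [a[0] * b[0]]
    k = (n + 1) // 2                      # size of the low half
    a_lo, a_hi = a[:k], a[k:]
    b_lo, b_hi = b[:k], b[k:]
    lo = karatsuba(a_lo, b_lo, counter)
    hi = karatsuba(a_hi, b_hi, counter)
    mid = karatsuba(poly_add(a_lo, a_hi), poly_add(b_lo, b_hi), counter)
    result = [0] * (2 * n - 1)
    for i, c in enumerate(lo):
        result[i] += c
        result[i + k] -= c                # subtract lo from the middle term
    for i, c in enumerate(hi):
        result[i + 2 * k] += c
        result[i + k] -= c                # subtract hi from the middle term
    for i, c in enumerate(mid):
        result[i + k] += c
    return result
```

With this particular split, two 4-term polynomials cost 9 scalar multiplications instead of the 16 used by the schoolbook method, and two 3-term polynomials cost 7.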
TL;DR: The work has proved the efficiency of the Urdhva Triyagbhyam Vedic method for multiplication, which changes the actual process of multiplication itself: it enables parallel generation of intermediate products, eliminates unwanted multiplication steps with zeros, and scales to higher bit levels using the Karatsuba algorithm.
Abstract: The ever-increasing demand for processors that can handle complex and challenging processes has resulted in the integration of a number of processor cores into one chip. Still, the load on the processor in a generic system remains high. This load is reduced by supplementing the main processor with coprocessors, which are designed to work on specific types of functions such as numeric computation, signal processing, and graphics. The speed of the ALU depends greatly on the multiplier. At the algorithmic and structural levels, numerous multiplication techniques have been developed to enhance the efficiency of the multiplier; they concentrate on reducing the partial products and on the methods of adding them, but the principle behind multiplication remains the same in all cases. Vedic Mathematics is the ancient system of mathematics which has a unique technique of calculations based on 16 Sutras. Employing these techniques in the computation algorithms of the coprocessor will reduce complexity, execution time, area, and power. Though many sutras are employed to handle different sets of numbers, exploring each one gives new results. Our work has proved the efficiency of the Urdhva Triyagbhyam Vedic method for multiplication, which changes the actual process of multiplication itself. It enables parallel generation of intermediate products, eliminates unwanted multiplication steps with zeros, and scales to higher bit levels using the Karatsuba algorithm, with compatibility with different data types. This sutra is to be used to build a high-speed, power-efficient multiplier in the coprocessor.
TL;DR: The work has proved the efficiency of the Urdhva Triyagbhyam Vedic method for multiplication, which changes the actual process of multiplication itself and gives minimum delay for multiplication of all types of numbers, either small or large.
Abstract: This paper proposes the design of a high-speed Vedic multiplier using techniques of ancient Indian Vedic Mathematics that have been modified to improve performance. Vedic Mathematics is the ancient system of mathematics which has a unique technique of calculations based on 16 Sutras. The work has proved the efficiency of the Urdhva Triyagbhyam Vedic method for multiplication, which changes the actual process of multiplication itself. It enables parallel generation of intermediate products, eliminates unwanted multiplication steps with zeros, and scales to higher bit levels using the Karatsuba algorithm, with compatibility with different data types. The Urdhva tiryakbhyam Sutra is the most efficient Sutra (algorithm), giving minimum delay for multiplication of all types of numbers, either small or large. Further, the Verilog HDL coding of the Urdhva tiryakbhyam Sutra for 32x32-bit multiplication and its FPGA implementation by the Xilinx Synthesis Tool on a Spartan 3E kit have been done, and the output has been displayed on the LCD of the Spartan 3E kit. The synthesis results show that the computation time for calculating the product of 32x32 bits is 31.526 ns.
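The "vertically and crosswise" idea behind the Urdhva tiryakbhyam Sutra can be sketched in software as a column-wise product: every output column is the sum of the cross products whose bit positions add up to that column index, and these sums are independent of one another apart from the carry. The sketch below is a sequential Python model under that reading, not the paper's Verilog design or its Karatsuba-based scaling; the function name `urdhva_multiply` and its `width` parameter are hypothetical.

```python
def urdhva_multiply(x, y, width):
    """Multiply two unsigned `width`-bit integers column by column."""
    a = [(x >> i) & 1 for i in range(width)]
    b = [(y >> i) & 1 for i in range(width)]
    result = 0
    carry = 0
    for k in range(2 * width - 1):
        # Crosswise step: all partial products a[i] * b[k - i] in column k.
        column = carry
        for i in range(max(0, k - width + 1), min(k, width - 1) + 1):
            column += a[i] & b[k - i]
        result |= (column & 1) << k       # bit k of the product
        carry = column >> 1               # carried into the next column
    result |= carry << (2 * width - 1)    # final carry is the top bit
    return result
```

For instance, `urdhva_multiply(13, 11, 4)` returns `143`. In hardware, the column sums can be formed in parallel, which is the property the abstract refers to as parallel generation of intermediate products.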