TL;DR: It is shown that the parallel Massey-Omura multiplier contains redundancy not only in fields defined by irreducible all-one polynomials but also in fields GF(2^m) defined by any irreducible polynomial, and a new architecture for the normal basis parallel multiplier is proposed, which is applicable to any finite field and has significantly lower circuit complexity than the original Massey-Omura normal basis parallel multiplier.
Abstract: The Massey-Omura multiplier of GF(2^m) uses a normal basis and its bit parallel version is usually implemented using m identical combinational logic blocks whose inputs are cyclically shifted from one another. In the past, it was shown that, for a class of finite fields defined by irreducible all-one polynomials, the parallel Massey-Omura multiplier had redundancy and a modified architecture of lower circuit complexity was proposed. In this article, it is shown that, not only does this type of multiplier contain redundancy in that special class of finite fields, but it also has redundancy in fields GF(2^m) defined by any irreducible polynomial. By removing the redundancy, we propose a new architecture for the normal basis parallel multiplier, which is applicable to any finite field and has significantly lower circuit complexity compared to the original Massey-Omura normal basis parallel multiplier. The proposed multiplier structure is also modular and, hence, suitable for VLSI realization. When applied to fields defined by the irreducible all-one polynomials, the multiplier's circuit complexity matches the best result available in the open literature.
TL;DR: A hardware solution for finite field arithmetic with application in asymmetric cryptography is presented; it is ready for future cryptographic bit lengths and allows operation at high clock frequency on moderate hardware resources.
Abstract: In this article we present a hardware solution for finite field arithmetic with application in asymmetric cryptography. It supports calculation in GF(p) as well as in GF(2^m). Addition and multiplication with interleaved modular reduction are the main functionality of the unit. Additional functions, such as shift operations and integer incrementation, allow the calculation of the multiplicative inverse and cover all operations required to implement Elliptic Curve Cryptography. Redundant number representation and efficient modular reduction make it ready for future cryptographic bit lengths and allow operation at high clock frequency on moderate hardware resources.
TL;DR: A novel scalable and unified architecture for a Montgomery inverse hardware that operates in both GF(p) and GF(2^n) fields is proposed, which allows the hardware to compute the inverse of long precision numbers in a repetitive way.
Abstract: Computing the inverse of a number in finite fields GF(p) or GF(2^n) is equally important for cryptographic applications. This paper proposes a novel scalable and unified architecture for a Montgomery inverse hardware that operates in both GF(p) and GF(2^n) fields. We adjust and modify a GF(2^n) Montgomery inverse algorithm to accommodate multi-bit shifting hardware, making it very similar to a previously proposed GF(p) algorithm. The architecture is intended to be scalable, which allows the hardware to compute the inverse of long precision numbers in a repetitive way. After implementing this unified design it was compared with other designs. The unified hardware was found to be eight times smaller than another reconfigurable design, with comparable performance. Even though the unified design consumes slightly more area and is slightly slower than the scalable inverter implementations for GF(p) only, it is a practical solution whenever arithmetic in the two finite fields is needed.
TL;DR: Two efficient architectures for digit-serial normal basis (DSNB) multipliers over GF(2^m) are presented and are compared with the existing ones in terms of gate and time complexities.
Abstract: In this article, two efficient architectures for digit-serial normal basis (DSNB) multipliers over GF(2^m) are presented. These two structures have the same gate count and time complexity. A straightforward implementation leaves gate redundancy in both of them. An algorithm which can considerably reduce the redundancy is also developed. Moreover, the proposed architectures are compared with the existing ones in terms of gate and time complexities.
TL;DR: An Elliptic Curve Point Multiplication processor over base fields GF(2^m), suitable for use in a wide range of commercial cryptography applications, is presented, and its results are compared with recent implementations in terms of speed and security.
Abstract: In this paper we present an Elliptic Curve Point Multiplication processor over base fields GF(2^m), suitable for use in a wide range of commercial cryptography applications. Our design operates in a polynomial basis and is fully parameterizable in the irreducible polynomial and the chosen Elliptic Curve over any base Galois Field up to a given size. High performance is achieved by use of a dedicated Galois Field arithmetic coprocessor implemented on FPGA. The underlying FPGA architecture is used to increase calculation performance, taking advantage of the properties of this kind of programmable logic device to perform the large number of logical operations required. We discuss the performance of our processor for different Elliptic Curves and compare the results with recent implementations in terms of speed and security.
TL;DR: The proposed processor uses the extended Euclidean algorithm for field division and the LSB-first procedure for field multiplication and accepts an external irreducible polynomial and allows several field sizes with small area overhead.
Abstract: This paper proposes a compact finite field processor over GF(2^m) using polynomial basis. The proposed processor uses the extended Euclidean algorithm for field division and the LSB-first procedure for field multiplication. Addition, multiplication, and division are implemented directly, sharing a common datapath hardware. The presented processor accepts an external irreducible polynomial and allows several field sizes with small area overhead. The proposed processor requires (6m^2 + 16m + 11m⌈m/8⌉ - 16⌈m/8⌉ - 17) cycles for elliptic curve scalar multiplication over GF(2^m) using the double-addition method. We were able to implement a finite field processor over GF(2^192) with a gate count of 16,847.
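As a rough illustration of the two quantitative statements in this abstract, the following Python sketch shows a software analogue of LSB-first polynomial-basis multiplication and evaluates the quoted cycle-count expression. The function names, the integer bit-vector encoding of field elements, and the GF(2^8) example polynomial are assumptions made for this example. They are not taken from the paper, and the sketch does not model the processor's shared datapath.

```python
def gf2m_mul_lsb_first(a, b, poly, m):
    """Multiply a and b in GF(2^m), scanning b from its least significant bit.

    Elements are ints whose bit i is the coefficient of x^i (an assumed
    encoding). `poly` is the irreducible polynomial including its x^m term.
    """
    result = 0
    for i in range(m):
        if (b >> i) & 1:
            result ^= a          # accumulate a * x^i (already reduced)
        a <<= 1                  # a <- a * x
        if (a >> m) & 1:
            a ^= poly            # reduce modulo the field polynomial
    return result


def scalar_mult_cycles(m):
    """Evaluate 6m^2 + 16m + 11m*ceil(m/8) - 16*ceil(m/8) - 17."""
    c = (m + 7) // 8             # integer ceil(m / 8)
    return 6 * m * m + 16 * m + 11 * m * c - 16 * c - 17


# GF(2^8) with x^8 + x^4 + x^3 + x + 1, chosen here only as a familiar test field.
print(hex(gf2m_mul_lsb_first(0x57, 0x83, 0x11B, 8)))
print(scalar_mult_cycles(192))
```

For m = 192 the expression evaluates to 274,543 cycles.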
TL;DR: This work describes a reconfigurable finite field multiplier, which is implemented within the latest family of Field Programmable System Level Integrated Circuits FPSLIC from Atmel, Inc.
Abstract: The performance of elliptic curve based public key cryptosystems is mainly determined by the efficiency of the underlying finite field arithmetic. This work describes a reconfigurable finite field multiplier, which is implemented within the latest family of Field Programmable System Level Integrated Circuits FPSLIC from Atmel, Inc. The architecture of the coprocessor is adapted from Karatsuba’s divide and conquer algorithm and allows for a reasonable speedup of the top-level public key algorithms. The VHDL hardware models are automatically generated based on an eligible operand size, which permits the optimal utilization of a particular FPSLIC device.
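The divide-and-conquer idea mentioned in this abstract can be sketched in software as Karatsuba multiplication of polynomials over GF(2), where addition and subtraction are both XOR. This is a minimal sketch under assumptions: the function names, the integer bit-vector encoding, and the recursion threshold are hypothetical, and the code says nothing about how the FPSLIC coprocessor is actually organized. Reduction modulo the field polynomial is omitted.

```python
def clmul_schoolbook(a, b):
    """Carry-less (GF(2)[x]) product of two polynomials encoded as ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a, b, n, threshold=8):
    """Karatsuba product of two polynomials of fewer than n coefficients.

    Three half-size products replace the four of the schoolbook method.
    """
    if n <= threshold:
        return clmul_schoolbook(a, b)
    h = n // 2                       # width of the low halves
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h
    b0, b1 = b & mask, b >> h
    lo = clmul_karatsuba(a0, b0, h, threshold)
    hi = clmul_karatsuba(a1, b1, n - h, threshold)
    mid = clmul_karatsuba(a0 ^ a1, b0 ^ b1, n - h, threshold)
    mid ^= lo ^ hi                   # (a0+a1)(b0+b1) - a0*b0 - a1*b1 over GF(2)
    return lo ^ (mid << h) ^ (hi << (2 * h))


x, y = 0xDEADBEEF, 0xCAFEBABE
print(clmul_karatsuba(x, y, 32) == clmul_schoolbook(x, y))
```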
TL;DR: A lower bound to the number of AND gates used in parallel multipliers for GF(2^m), under the condition that time complexity be minimum, indirectly suggests that space complexity is essentially a quadratic function of m when time complexity is kept minimum.
Abstract: A lower bound to the number of AND gates used in parallel multipliers for GF(2^m), under the condition that time complexity be minimum, is determined. In particular, the exact minimum number of AND gates for primitive normal bases and optimal normal bases of Type II multipliers is evaluated. This result indirectly suggests that space complexity is essentially a quadratic function of m when time complexity is kept minimum.
TL;DR: This work presents a novel area-efficient parallel-in parallel-out systolic division circuit (v = a/b) over GF(2^m) based on the extended Stein's algorithm that exhibits significant advantages in both area and time.
Abstract: We present a novel area-efficient parallel-in parallel-out systolic division circuit (v = a/b) over GF(2^m) based on the extended Stein's algorithm. By keeping the combined area-time (AT) complexity at the lowest level of O(m^2), we evenly distribute the complexity of O(m) in area and time, and design a well-balanced division circuit capable of operating at high speed with high area efficiency. Compared to the other systolic architectures, our design exhibits significant advantages in both area and time.
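To make the operation v = a/b concrete, here is a hedged software sketch of division in GF(2^m) using a binary (Stein-style) GCD iteration that relies only on shifts and XORs. It follows the widely taught binary inversion scheme with the numerator substituted for the initial value 1. It is not the paper's systolic circuit, and the function name and integer encoding are assumptions made for this example.

```python
def gf2m_divide(a, b, poly):
    """Return a / b in GF(2^m) defined by the irreducible polynomial `poly`.

    Elements are ints whose bit i is the coefficient of x^i (assumed
    encoding). Invariants: b * g1 = a * u and b * g2 = a * v (mod poly).
    """
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^m)")
    u, v = b, poly
    g1, g2 = a, 0
    while u != 1 and v != 1:
        while (u & 1) == 0:          # x divides u: halve u and g1
            u >>= 1
            if g1 & 1:
                g1 ^= poly           # make g1 divisible by x first
            g1 >>= 1
        while (v & 1) == 0:          # x divides v: halve v and g2
            v >>= 1
            if g2 & 1:
                g2 ^= poly
            g2 >>= 1
        if u.bit_length() > v.bit_length():
            u ^= v
            g1 ^= g2
        else:
            v ^= u
            g2 ^= g1
    return g1 if u == 1 else g2


# GF(2^8) with x^8 + x^4 + x^3 + x + 1, used here only as a familiar test field.
print(hex(gf2m_divide(0x01, 0x53, 0x11B)))
```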
TL;DR: This characterization shows that the representation of finite fields described in a previous issue of the IEEE Transactions on Computers is not "optimal" as claimed and the representation considered there can often be improved significantly.
Abstract: For original article see G. Drolet, ibid., vol. 47, no. 9, p. 938-946, (Sept 1998). We characterize the smallest n with GF(2)[X]/(X^n + 1) containing an isomorphic copy of GF(2^m). This characterization shows that the representation of finite fields described in a previous issue of the IEEE Transactions on Computers is not "optimal" as claimed. The representation considered there can often be improved significantly.
TL;DR: Motivated by the standard nonlinear combiner generator model for stream cipher systems, this work makes a detailed analysis of t-nomial multiples of products of primitive polynomials, presents new enumeration results for these multiples, and provides some estimation of their degree distribution.
Abstract: A standard model of nonlinear combiner generator for stream cipher system combines the outputs of several independent Linear Feedback Shift Register (LFSR) sequences using a nonlinear Boolean function to produce the key stream. Given such a model, cryptanalytic attacks have been proposed by finding the sparse multiples of the connection polynomials corresponding to the LFSRs. In this direction, a few works have recently been published on t-nomial multiples of primitive polynomials. We here provide further results on degree distribution of the t-nomial multiples. However, getting the sparse multiples of just a single primitive polynomial does not suffice. The exact cryptanalysis of the nonlinear combiner model depends on finding sparse multiples of the products of primitive polynomials. We here make a detailed analysis on t-nomial multiples of products of primitive polynomials. We present new enumeration results for these multiples and provide some estimation on their degree distribution.
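For the simplest case, t = 3 and a single primitive polynomial, the notion of a t-nomial multiple can be illustrated by exhaustive search: x^i + x^j + 1 is a multiple of f exactly when x^i + x^j is congruent to 1 modulo f. The sketch below is a brute-force illustration only. The function name and the integer encoding of polynomials are assumptions, and it does not reproduce the paper's enumeration results or handle products of polynomials.

```python
def trinomial_multiples(poly):
    """List (i, j) with 1 <= j < i <= 2^d - 2 such that x^i + x^j + 1
    is a multiple of `poly`, a degree-d primitive polynomial over GF(2).

    Polynomials are ints whose bit k is the coefficient of x^k.
    """
    d = poly.bit_length() - 1
    period = (1 << d) - 1            # order of x when poly is primitive
    powers = [1]                     # powers[k] = x^k mod poly
    for _ in range(1, period):
        nxt = powers[-1] << 1
        if nxt >> d:
            nxt ^= poly
        powers.append(nxt)
    found = []
    for i in range(2, period):
        for j in range(1, i):
            if (powers[i] ^ powers[j]) == 1:
                found.append((i, j))
    return found


# f = x^3 + x + 1: expect f itself, f * (x^2 + x + 1), and f^2.
print(trinomial_multiples(0b1011))
```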
TL;DR: This paper proves some important results related to the degree of the multiples and discusses a randomized algorithm for finding sparse multiples of primitive polynomials and their products.
Abstract: Recently the problem of analysing the multiples of primitive polynomials and their products has received a lot of attention. These primitive polynomials are basically the connection polynomials of the LFSRs (Linear Feedback Shift Registers) used in the stream cipher system. Analysis of sparse multiples of a primitive polynomial or product of primitive polynomials helps in identifying the robustness of the stream ciphers based on nonlinear combiner model. In this paper we first prove some important results related to the degree of the multiples. Earlier these results were only observed for small examples. Proving these results clearly identifies the statistical behavior related to the degree of multiples of primitive polynomials or their products. Further we discuss a randomized algorithm for finding sparse multiples of primitive polynomials and their products. Our results clearly identify the time-memory trade-off for finding such multiples.
TL;DR: A new digit serial GF(2^m) multiplier based on the dual basis representation is presented for the first time in this paper; it has low latency, and its digit size is not restricted by the type of primitive polynomial being used.
Abstract: A new digit serial GF(2^m) multiplier based on the dual basis representation is presented for the first time in this paper. The multiplier is suitable for large word lengths such as those found in cryptosystems. Digit serial computations give a much better trade-off between area and speed in comparison with bit-parallel realization, which is too costly, and bit-serial realization, which is too slow. The new multiplier is based on a look-ahead technique which serves to overcome the recursive algorithm used to calculate the extra elements of the operand represented in the dual basis prior to the multiplication process. This recursive algorithm is the main bottleneck for digit-serial multiplication. Unlike existing designs, the new multiplier has low latency, and its digit size is not restricted by the type of primitive polynomial being used. A systolic version of the new multiplier, suitable for VLSI implementation, is also presented.
TL;DR: New classes of LI logic transformations and their corresponding polynomial expansions over GF(2) are identified and introduced; the transforms are the fastest and most efficient LI transformations in terms of GF(2) computational complexity.
Abstract: Recent papers show that the existence of numerous linearly independent (LI) transformations in GF(2) algebra creates circuits that are superior in the design of XOR based polynomial expansions and corresponding digital circuits. In this paper, new classes of LI logic transformations and their corresponding polynomial expansions over GF(2) are identified and introduced. The transforms are the fastest and most efficient LI transformations in terms of their GF(2) computational complexity.
TL;DR: The ECC processor provides the elliptic curve operations for Diffie-Hellman, EC Elgamal and ECDSA protocols and is defined over the field GF(2^163), which is a SEC-2 recommendation.
Abstract: crypto (ECC) coprocessor over binary fields for ECC protocols. Our ECC processor provides the elliptic curve operations for Diffie-Hellman, EC Elgamal and ECDSA protocols. The ECC we have implemented is defined over the field GF(2^163), which is a SEC-2 recommendation [6].
TL;DR: A new redundant binary adder that supports carry-save additions under either of the Galois fields, GF(p) or GF(2^n), without the need for an external control signal to specify which field is to be used.
Abstract: This paper describes a new redundant binary adder that supports carry-save additions under either of the Galois fields, GF(p) or GF(2^n), without the need for an external control signal to specify which field is to be used. The proposed adder will find use in unified Galois field multipliers for cryptographic applications. Its main advantage over previously reported adders is that a control signal which is broadcast to all cells to suppress carries under GF(2^n) is not needed, leading to a substantial gain in implementation efficiency.
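For context, the conventional scheme that this abstract contrasts against can be sketched as a carry-save step with an explicit field-select flag: in GF(p) mode the carry word is kept, while in GF(2^n) mode it is suppressed so that addition degenerates to XOR. The sketch below illustrates that baseline only, with a hypothetical function name and flag. It does not show the proposed adder, whose point is precisely to avoid such a control signal.

```python
def carry_save_add(x, y, z, binary_field):
    """One carry-save step on three operands encoded as non-negative ints.

    binary_field=False: integer mode, sum_word + carry_word == x + y + z.
    binary_field=True:  GF(2^n) mode, carries are suppressed (pure XOR).
    The flag plays the role of the broadcast control signal.
    """
    sum_word = x ^ y ^ z
    if binary_field:
        carry_word = 0
    else:
        # Bitwise majority of the three inputs, moved to the next weight.
        carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word


s, c = carry_save_add(5, 3, 6, binary_field=False)
print(s + c)                                       # integer sum of the operands
print(carry_save_add(5, 3, 6, binary_field=True))  # XOR of the operands, no carry
```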
TL;DR: In this article, the authors propose optimal extension fields for XTR among Galois fields GF(p^(6m)) and introduce a new notion of generalized optimal extension fields (GOEFs) to select such fields.
Abstract: Application of XTR in cryptographic protocols leads to substantial savings both in communication and computational overhead without compromising security [6]. XTR is a new method to represent elements of a subgroup of a multiplicative group of a finite field GF(p^6) and it can be generalized to the field GF(p^(6m)) [6,9]. This paper proposes optimal extension fields for XTR among Galois fields GF(p^(6m)) which can be applied to XTR. In order to select such fields, we introduce a new notion of Generalized Optimal Extension Fields (GOEFs) and suggest a condition of prime p, a defining polynomial of GF(p^(2m)) and a fast method of multiplication in GF(p^(2m)) to achieve fast finite field arithmetic in GF(p^(2m)). From our implementation results, GF(p^36) ? GF(p^12) is the most efficient extension field for XTR, and computing Tr(g^n) given Tr(g) in GF(p^12) is on average more than twice as fast as in the XTR system [6,10] on a Pentium III/700MHz, which has a 32-bit architecture.
TL;DR: The fundamentals of Galois fields are reviewed, multiplication of finite-field elements using three different representation bases is considered, and experimental results are presented to compare the performance of these multipliers.
Abstract: Galois (or finite) fields are used in a wide range of technical applications, playing an important role in several areas such as cryptographic schemes and algebraic codes, used in modern digital communication systems. Finite field arithmetic must be fast, due to the increasing performance needed by communication systems, so it may be necessary to implement the modules performing arithmetic over Galois fields on semiconductor integrated circuits. Galois field multiplication is the most costly arithmetic operation and different approaches can be used. In this paper, the fundamentals of Galois fields are reviewed and multiplication of finite-field elements using three different representation bases is considered. These three multipliers have been implemented using a bit-parallel architecture over reconfigurable hardware and experimental results are presented to compare the performance of these multipliers.
TL;DR: A new bit-parallel systolic multiplier for GF(2^m) using the weakly dual basis is presented; its latency is only m + [log_2 m] clock cycles.
Abstract: This paper offers a new bit-parallel systolic multiplier for GF(2^m) using the weakly dual basis. The multiplier is composed of two units: multiplication and transformation. The structure of the multiplication unit includes m^2 cells; each cell is composed of one 2-input AND gate, one 2-input XOR gate and three/four 1-bit latches. The structure of the transformation unit is established by the 2-input XOR-tree. The latency of the multiplier is only m + [log_2 m] clock cycles.
TL;DR: A flexible scalar (or point) multiplier for elliptic curve cryptosystems using this polynomial basis multiplier is implemented, and it is found that the flexible system performs almost twice as fast as with the classical multiplier.
Abstract: In this paper we present a hardware implementation of a GF(2^n) polynomial basis multiplier that is twice as fast as the classical multiplier while requiring about 50% more chip area. We implement a flexible scalar (or point) multiplier for elliptic curve cryptosystems using this multiplier and find that the flexible system performs almost twice as fast as with the classical multiplier.
TL;DR: In this article, a method of using a normal basis to design a universal serial multiplier over a Galois field, for example GF(2^8), is introduced.
Abstract: A method of using a normal basis to design a universal serial multiplier over a Galois field, for example GF(2^8), is introduced.
TL;DR: In this paper, a method for multiplication of two factors from the Galois field GF(2^(m*p)) is presented, whereby each factor can be presented as a vector of p partial blocks with a width of m bits.
Abstract: Method for multiplication of two factors from the Galois field GF(2^(m*p)), whereby each factor can be presented as a vector of p partial blocks with a width of m bits. The method involves selection of a reduction polynomial, multiplicative linking of the partial blocks, and accumulation of an intermediate result with a reduction of the accumulated intermediate result after each multiplicative linking. An independent claim is made for a second method for multiplication of two factors from the Galois field.
TL;DR: A generalization of the Pless symmetry codes to different fields is presented and it is proven that the automorphism group of some of these codes contains the group PSL2(q).
TL;DR: In this article, a nonconventional basis was introduced and a new bit-parallel multiplier was presented, which is as efficient as the modified Massey-Omura multiplier using the type I optimal normal basis.
Abstract: The efficient computation of the arithmetic operations in finite fields is closely related to the particular ways in which the field elements are presented. The common field representations are a polynomial basis representation and a normal basis representation. In this paper, we introduce a nonconventional basis and present a new bit-parallel multiplier which is as efficient as the modified Massey-Omura multiplier using the type I optimal normal basis.