TL;DR: This work presents several novel exponentiation algorithms, namely, a protected square-and-multiply algorithm, its right-to-left counterpart, and several protected sliding-window algorithms, which share the common feature that the complexity is globally unchanged compared to the corresponding unprotected implementations.
Abstract: We introduce simple methods to convert a cryptographic algorithm into an algorithm protected against simple side-channel attacks. Contrary to previously known solutions, the proposed techniques are not at the expense of the execution time. Moreover, they are generic and apply to virtually any algorithm. In particular, we present several novel exponentiation algorithms, namely, a protected square-and-multiply algorithm, its right-to-left counterpart, and several protected sliding-window algorithms. We also illustrate our methodology applied to point multiplication on elliptic curves. All these algorithms share the common feature that the complexity is globally unchanged compared to the corresponding unprotected implementations.
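For orientation, the unprotected baselines that the abstract refers to can be sketched as follows. This is a minimal illustration of textbook left-to-right and right-to-left square-and-multiply, not the protected algorithms of the paper; the function names are hypothetical. The data-dependent branch on each exponent bit is the kind of behavior that simple side-channel attacks exploit.

```python
def pow_left_to_right(base: int, exponent: int, modulus: int) -> int:
    """Unprotected left-to-right square-and-multiply (illustrative baseline)."""
    result = 1 % modulus
    for bit in bin(exponent)[2:]:               # most significant bit first
        result = (result * result) % modulus    # square on every bit
        if bit == "1":                          # data-dependent branch
            result = (result * base) % modulus  # multiply only on 1-bits
    return result


def pow_right_to_left(base: int, exponent: int, modulus: int) -> int:
    """Unprotected right-to-left counterpart (illustrative baseline)."""
    result = 1 % modulus
    power = base % modulus                      # holds base^(2^i) at step i
    while exponent > 0:
        if exponent & 1:                        # data-dependent branch
            result = (result * power) % modulus
        power = (power * power) % modulus
        exponent >>= 1
    return result
```

Both functions should agree with Python's built-in three-argument `pow` for non-negative exponents and positive moduli.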
TL;DR: This paper proposes a new arithmetical principle that makes it easy to interpret the multiplication of fuzzy numbers through their membership functions, and a new arithmetical method that makes it easy to compute the canonical representation of that multiplication.
Abstract: The representation of the multiplication operation on fuzzy numbers is useful and important in fuzzy systems such as fuzzy decision making. In this paper, we propose a new arithmetical principle and a new arithmetical method for the arithmetical operations on fuzzy numbers. The new arithmetical principle is the L^{-1}-R^{-1} inverse function arithmetic principle. Based on the L^{-1}-R^{-1} inverse function arithmetic principle, it is easy to interpret the multiplication operation with the membership functions of fuzzy numbers. The new arithmetical method is the graded multiple integrals representation method. Based on the graded multiple integrals representation method, it is easy to compute the canonical representation of the multiplication operation on fuzzy numbers. Finally, the canonical representation is applied to a numerical example of fuzzy decision.
TL;DR: In this paper, non-binary computing methods utilize a digital multistage phase change material to perform addition, subtraction, multiplication, and division with the controlled application of energy.
Abstract: Non-binary computing methods utilize a digital multistage phase change material. Addition, subtraction, multiplication, and division are accomplished with the controlled application of energy to a phase change material. Energy in an amount insufficient to set the reset state of a phase change material is provided to store one or more numbers, and further energy characteristic of the performance of the mathematical operation is provided to effect a computation.
TL;DR: In this article, a CMD(Charge Multiplying Detector)-CCD image pickup element 106 mounted on the tip of an endoscope picks up a living body observing section.
Abstract: PROBLEM TO BE SOLVED: To provide an endoscope provided with a solid-state image pickup means having an electric charge multiplier section for multiplying electric charges with a multiplication factor on the basis of a multiplication factor control signal that can measure the multiplication factor of electric charges. SOLUTION: A CMD (Charge Multiplying Detector)-CCD image pickup element 106 mounted on a tip of an endoscope picks up a living body observing section. In this case, a reference output signal value is stored on the basis of signal charges of an image picked up at a shade area 32 and multiplied by an electric charge multiplication section 24 to which a multiplication factor control signal denoting about 1 as the multiplication factor is applied. Further, a reference output signal value is stored on the basis of signal charges of an image picked up at the shade area 32 and multiplied by the electric charge multiplication section 24 to which a multiplication factor control signal denoting N as the multiplication factor is applied under a reference temperature environment. By dividing the output signal value by the reference output signal value, an actual multiplication factor N' of the electric charge multiplication section 24 is calculated. The multiplication factor can be measured again by controlling the voltage of the multiplication factor control signal so that the multiplication factor N' approaches the multiplication factor N. The measurement of the multiplication factor and the voltage adjustment are repeated to realize the multiplication factor N.
TL;DR: The present findings suggest that multiplication facts are stored in a highly related network in which activation spreads from the product node to adjacent nodes.
Abstract: Adult observers are widely assumed to be equipped with a specific memory store containing arithmetic facts. The present study was aimed at exploring the possibility of obtaining an automatic activation of multiplication facts by using the number-matching paradigm (LeFevre, Bisanz, & Mrkonjic, 1988), in which mental arithmetic is task irrelevant. In particular, we were interested in exploring whether the nodes that precede or follow the product node in the multiplication table can also be automatically activated as a consequence of the mere presentation of two numbers. In Experiments 1 and 2, we showed that participants were slower in responding “no” to probes that were numbers adjacent to the product in the table related to the first operand of the initial pair than to probes that were unrelated to the initial pair. In Experiments 3 and 4, we showed a similar pattern for probes that were numbers adjacent to the product in the table related to the second operand of the initial pair. Experiments 5 and 6 rul...
TL;DR: It is suggested that the arithmetic and the semantic incongruency effects are both functionally related to a context-dependent spread of activation in specialized associative networks, whereas the arithmetic problem-size effect is due to rechecking routines that go beyond basic fact retrieval.
Abstract: Event-related potentials were recorded with 61 electrodes from 16 students who verified either the correctness of single-digit multiplication problems or the semantic congruency of sentences. Multiplication problems varied in size and sentence fragments in constraint. Both semantic and arithmetic incongruencies evoked a typical N400 with a clear parieto-central maximum. In addition, numerically larger problems (8 × 7), in comparison to smaller problems (3 × 2), evoked a negativity starting at about 360 ms whose maximum was located over the right temporal-parietal scalp. These results indicate that the arithmetic incongruency and the problem-size effect are functionally distinct. It is suggested that the arithmetic and the semantic incongruency effects are both functionally related to a context-dependent spread of activation in specialized associative networks, whereas the arithmetic problem-size effect is due to rechecking routines that go beyond basic fact retrieval. Descriptors: Mental calculation, Problem-size effect, Arithmetic N400 effect, Semantic N400 effect, Memory access, Event-related potentials
TL;DR: A hardware architecture for modular multiplication is described that is efficient for bit-lengths suitable for both commonly used types of public key cryptography (PKC), i.e., ECC and RSA cryptosystems, and modular exponentiation based on Montgomery's Multiplication Method (MMM) is presented.
Abstract: This paper describes a hardware architecture for modular multiplication operation which is efficient for bit-lengths suitable for both commonly used types of public key cryptography (PKC) i.e. ECC and RSA cryptosystems. The challenge of current PKC implementations is to deal with long numbers (160-2048 bits) in order to achieve system's efficiency, as well as security. RSA, still the most popular PKC, has at its root the modular exponentiation operation. Modular exponentiation consists of repeated modular multiplications, which is also the basic operation for ECC protocols. The solution proposed in this work uses a systolic array implementation and can be used for arbitrary precisions. We also present modular exponentiation based on Montgomery's Multiplication Method (MMM).
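As a software-level illustration of the Montgomery multiplication that the abstract mentions, the following is a minimal sketch of the textbook word-free formulation. It does not model the systolic array of the paper; the function names and the choice of R = 2^bits are assumptions made for the example, and `pow(modulus, -1, r)` requires Python 3.8 or later.

```python
def montgomery_params(modulus: int, bits: int) -> int:
    """Return n_prime = -modulus^{-1} mod 2^bits (modulus odd and < 2^bits)."""
    if modulus % 2 == 0 or modulus >= (1 << bits):
        raise ValueError("modulus must be odd and smaller than 2**bits")
    r = 1 << bits
    return (-pow(modulus, -1, r)) % r


def mont_mul(a_bar: int, b_bar: int, modulus: int, bits: int, n_prime: int) -> int:
    """Montgomery product a_bar * b_bar * R^{-1} mod modulus, with R = 2^bits."""
    mask = (1 << bits) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask   # m = t * n_prime mod R
    u = (t + m * modulus) >> bits       # t + m*modulus is divisible by R
    return u - modulus if u >= modulus else u


def mod_mul(a: int, b: int, modulus: int, bits: int) -> int:
    """Compute a*b mod modulus by round-tripping through Montgomery form."""
    n_prime = montgomery_params(modulus, bits)
    r = 1 << bits
    a_bar = (a * r) % modulus           # convert operands to Montgomery form
    b_bar = (b * r) % modulus
    c_bar = mont_mul(a_bar, b_bar, modulus, bits, n_prime)
    return mont_mul(c_bar, 1, modulus, bits, n_prime)  # convert back
```

In an exponentiation, the conversions into and out of Montgomery form would be done once, with every intermediate square and multiply replaced by `mont_mul`.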
TL;DR: A method for performing computations in a mathematical system that exhibits a positive Lyapunov exponent, or chaotic behavior, comprises varying a parameter of the system; when employed in cryptography (e.g., in a pseudo-random number generator of a stream-cipher algorithm, in a block-cipher system, or in a HASH/MAC system), unpredictability may be improved.
Abstract: A method for performing computations in a mathematical system which exhibits a positive Lyapunov exponent, or exhibits chaotic behavior, comprises varying a parameter of the system. When employed in cryptography, such as, e.g., in a pseudo-random number generator of a stream-cipher algorithm, in a block-cipher system or a HASH/MAC system, unpredictability may be improved. In a similar system, a computational method comprises multiplying two numbers and manipulating at least one of the most significant bits of the number resulting from the multiplication to produce an output. A number derived from a division of two numbers may be used for deriving an output. In a system for generating a sequence of numbers, an array of counters is updated at each computational step, whereby a carry value is added to each counter. Fixed-point arithmetic may be employed. A method of determining an identification value and for concurrently encrypting and/or decrypting a set of data is disclosed.
TL;DR: This result enables formal analysis of protocols that employ primitives such as Diffie-Hellman exponentiation, products, and xor, with a bounded number of role instances, but without imposing any bounds on the size of terms created by the attacker.
Abstract: We demonstrate that for any well-defined cryptographic protocol, the symbolic trace reachability problem in the presence of an Abelian operator (e.g., multiplication) can be reduced to solvability of a particular system of quadratic Diophantine equations. This result enables formal analysis of protocols that employ primitives such as Diffie-Hellman exponentiation, products, and xor, with a bounded number of role instances, but without imposing any bounds on the size of terms created by the attacker. In the case of xor, the resulting system of Diophantine equations is decidable. In the case of a general Abelian group, decidability remains an open question, but our reduction demonstrates that standard mathematical techniques for solving systems of Diophantine equations are sufficient for the discovery of protocol insecurities.
TL;DR: An efficient method for truncated multiplication called hybrid-correction truncation is presented that utilizes the advantages of two previous methods to obtain lower average and maximum absolute error.
Abstract: Truncated multiplication can be used to significantly reduce the power dissipation for applications that do not require correctly-rounded results. This paper presents an efficient method for truncated multiplication called hybrid-correction truncation that utilizes the advantages of two previous methods to obtain lower average and maximum absolute error. Comparisons are presented contrasting power, area, and delay for all three methods compared to standard parallel multipliers. Estimates indicate that hybrid truncated multipliers dissipate slightly less power and consume slightly less area than previous methods for truncated multiplication. In addition, utilization of the hybrid truncation method can provide a method for altering the implementation within certain limits to meet a given precision.
TL;DR: This work outlines that multiplication of binary polynomials can be easily integrated into a multiplier datapath for integers without significant additional hardware, and presents new algorithms for multiple-precision arithmetic in GF(2^m) based on the availability of an instruction for single-precision multiplication of binary polynomials.
Abstract: The performance of elliptic curve (EC) cryptosystems depends essentially on efficient arithmetic in the underlying finite field. Binary finite fields GF(2^m) have the advantage of "carry-free" addition. Multiplication, on the other hand, is rather costly since polynomial arithmetic is not supported by general-purpose processors. We propose a combined hardware/software approach to overcome this problem. First, we outline that multiplication of binary polynomials can be easily integrated into a multiplier datapath for integers without significant additional hardware. Then, we present new algorithms for multiple-precision arithmetic in GF(2^m) based on the availability of an instruction for single-precision multiplication of binary polynomials. The proposed hardware/software approach is considerably faster than a "conventional" software implementation and well suited for constrained devices like smart cards. Our experimental results show that an enhanced 16 bit RISC processor is able to generate a 191 bit ECDSA signature in less than 650 msec when the core is clocked at 5 MHz.
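To make "multiplication of binary polynomials" concrete, here is a minimal software sketch of carry-less (XOR-based) multiplication followed by reduction modulo an irreducible polynomial. It is a bit-serial illustration only, not the multiple-precision algorithms or the datapath described above; polynomials are packed into Python integers (bit i is the coefficient of x^i), and all function names are hypothetical.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two binary polynomials packed into integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2)[x] is XOR, so no carries
        a <<= 1              # multiply a by x
        b >>= 1
    return result


def gf2m_reduce(value: int, modulus_poly: int) -> int:
    """Reduce a binary polynomial modulo modulus_poly by repeated XOR."""
    degree = modulus_poly.bit_length() - 1
    while value.bit_length() - 1 >= degree:
        shift = value.bit_length() - 1 - degree
        value ^= modulus_poly << shift   # cancel the current leading term
    return value


def gf2m_mul(a: int, b: int, modulus_poly: int) -> int:
    """Multiply two field elements of GF(2^m) defined by modulus_poly."""
    return gf2m_reduce(clmul(a, b), modulus_poly)
```

For example, with the AES field polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), `gf2m_mul(0x57, 0x83, 0x11B)` gives 0xC1, matching the worked example in FIPS 197.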
TL;DR: This paper uses periodic matrix multiplication to improve the time complexities for a number of graph problems by reducing the time for finding a clique cutset in a graph.
Abstract: This paper uses periodic matrix multiplication to improve the time complexities for a number of graph problems. The time for finding a clique cutset in a graph is reduced from O(nm) to O(n^2.69), the time for finding an asteroidal triple is reduced to O(n^2.82), and the time for finding a star cutset, a two-pair, and a dominating pair is reduced from O(nm) to O(n^2.79). It is also shown that each of these problems is at least as hard as one of three basic graph problems for which the best known algorithms run in time O(nm) and O(n^α).
TL;DR: A vector-level algorithm is presented, which essentially eliminates the bit-wise inner products needed in the conventional approach to the normal basis multiplication of the extended binary field GF(2^m).
Abstract: For cryptographic applications, normal bases have received considerable attention, especially for hardware implementation. We consider fast software algorithms for normal basis multiplication over the extended binary field GF(2^m). We present a vector-level algorithm, which essentially eliminates the bit-wise inner products needed in the conventional approach to the normal basis multiplication. We then present another algorithm, which significantly reduces the dynamic instruction counts. Both algorithms utilize the full width of the data-path of the general purpose processor on which the software is to be executed. We also consider composite fields and present an algorithm, which can provide further speed-ups and an added flexibility toward hardware-software codesign of processors for very large finite fields.
TL;DR: In this paper, the authors used the dead-space multiplication recurrence theory to show that the low noise characteristics associated with the initial energy effect can be achieved by utilizing a two-layer multiplication region.
Abstract: It has been recently found that the initial-energy effect, which is associated with the finite initial energy of carriers entering the multiplication region of an avalanche photodiode (APD), can be tailored to reduce the excess noise well beyond the previously known limits for thin APDs. However, the control of the initial energy of injected carriers can be difficult in practice for an APD with a single multiplication layer. In this paper, the dead-space multiplication recurrence theory is used to show that the low noise characteristics associated with the initial-energy effect can be achieved by utilizing a two-layer multiplication region. As an example, a high bandgap Al_{0.6}Ga_{0.4}As material, termed the energy-buildup layer, is used to elevate the energy of injected carriers without incurring significant multiplication events, while a second GaAs layer with a lower bandgap energy is used as the primary carrier multiplication layer. Computations show that devices can be optimally designed through judicious choice of the charge-layer width to produce excess noise factor levels that are comparable to those corresponding to homojunction APDs benefiting from a maximal initial-energy effect. A structure is presented to achieve precisely that.
TL;DR: Overall, conceptual understanding of the relationship between multiplication and division was not as strong as that between addition and subtraction and it took participants longer to solve both types of multiplication/division problems.
Abstract: Problems of the form a + b - b have been used to assess conceptual understanding of the relationship between addition and subtraction. No study has investigated the same relationship between multiplication and division on problems of the form d x e ÷ e. In both types of inversion problems, no calculation is required if the inverse relationship between the operations is understood. Adult participants solved addition/subtraction and multiplication/division inversion (e.g., 9 x 22 ÷ 22) and standard (e.g., 2 + 27 - 28) problems. Participants started to use the inversion strategy earlier and more frequently on addition/subtraction problems. Participants took longer to solve both types of multiplication/division problems. Overall, conceptual understanding of the relationship between multiplication and division was not as strong as that between addition and subtraction. One explanation for this difference in performance is that the operation of division is more weakly represented and understood than the other operations and that this weakness affects performance on problems of the form d x e ÷ e. A primary area of interest in the field of mathematical cognition has focussed on participants' performance on simple arithmetic problems. In both the adult and child literature, participants' reaction times, accuracy, and strategy reports on these simple arithmetic problems have been assessed (e.g., Campbell, 1999; LeFevre, Sadesky, & Bisanz, 1996; Siegler, 1987, 1989). Based on this task performance, insights into many different aspects of mathematical cognition have been gained regarding the strategies, and the variability of those strategies, that children and adults use (e.g., Geary, Frensch, & Wiley, 1993; Robinson, 2001) and how mathematical facts are stored in and retrieved from memory (e.g., Campbell, 1997; Mauro, LeFevre, & Morris, 2003).
However, in the child mathematical cognition literature, a recent trend has been to assess not only performance on arithmetic problems but to also assess children's conceptual understanding of arithmetic (e.g., Rittle-Johnson, Siegler, & Alibali, 2001). Bisanz and LeFevre (1990) have proposed that there are three important types of arithmetical understanding: factual, procedural (or strategic), and conceptual. Number facts stored in memory, the availability of a diversity of solution procedures or strategies, and an understanding of mathematical concepts will all facilitate performance. Although much research has focussed on children's and adults' factual and procedural knowledge, historically, conceptual knowledge has been difficult to measure (Bisanz & LeFevre, 1990). However, one task that lends itself well to the assessment of conceptual understanding is the "inversion problem" (Bisanz & LeFevre, 1990; Starkey & Gelman, 1982). In this type of problem, of the form a + b - b, participants who understand the inverse relationship between addition and subtraction can simply state the first number "a" rather than performing any calculations. Three measures are often used when investigating inversion problems. On these problems, accuracy should be high and reaction times fast, regardless of the magnitude of the numbers, if participants are simply stating the first number. Conversely, if participants are calculating, then there will be a problem-size effect. Verbal reports are collected to determine solution strategy directly. Research using inversion problems has found that even preschoolers are able to use an "inversion shortcut" to solve this type of problem (Klein & Bisanz, 2000). As children develop, they use the shortcut more frequently and more easily and by adulthood, the inversion shortcut is almost exclusively used to solve these inversion problems (Bisanz & LeFevre, 1990).
Thus, according to this particular task, by adulthood, the inverse relationship between addition and subtraction is well understood. However, a second possible form of inversion problems can also assess whether participants understand the same inverse relationship between multiplication and division. …
TL;DR: This article presents a class of algorithms for normal basis multiplication in GF(2^m), which requires a significantly lower number of bit level operations and can reduce the space complexity of cryptographic systems.
Abstract: In cryptographic applications, the use of normal bases to represent elements of the finite field GF(2^m) is quite advantageous, especially for hardware implementation. In this article, we consider an important field operation, namely, multiplication which is used in many cryptographic functions. We present a class of algorithms for normal basis multiplication in GF(2^m). Our proposed multiplication algorithm for composite finite fields requires a significantly lower number of bit level operations and, hence, can reduce the space complexity of cryptographic systems.
TL;DR: A novel digit-serial modular multiplier that uses a hybrid architecture to perform the reduction operation needed to reduce the multiplication result: hardwired logic is used for fast reduction of named curves and the multiplier circuit is reused for reduction of generic curves.
Abstract: We describe a cryptographic processor for elliptic curve cryptography (ECC). ECC is evolving as an attractive alternative to other public-key schemes such as RSA by offering the smallest key size and the highest strength per bit. The processor performs point multiplication for elliptic curves over binary polynomial fields GF(2^m). In contrast to other designs that only support one curve at a time, our processor is capable of handling arbitrary curves without requiring reconfiguration. More specifically, it can handle both named curves as standardized by NIST as well as any other generic curves up to a field degree of 255. Efficient support for arbitrary curves is particularly important for the targeted server applications that need to handle requests for secure connections generated by a multitude of heterogeneous client devices. Such requests may specify curves which are infrequently used or not even known at implementation time. Our processor implements 256 bit modular multiplication, division, addition and squaring. The multiplier constitutes the core function as it executes the bulk of the point multiplication algorithm. We present a novel digit-serial modular multiplier that uses a hybrid architecture to perform the reduction operation needed to reduce the multiplication result: hardwired logic is used for fast reduction of named curves and the multiplier circuit is reused for reduction of generic curves. The performance of our FPGA-based prototype, running at a clock frequency of 66.4 MHz, is 6955 point multiplications per second for named curves over GF(2^163) and 3308 point multiplications per second for generic curves over GF(2^163).
TL;DR: In this paper, a recursive algorithm is used to iteratively decompose the multiplication into a weighted sum of smaller subproducts, and when the size of the smaller subproduct is less than or equal to a predetermined size, a nonrecursive algorithm may be used to complete the multiplication.
Abstract: Multi-precision multiplication methods over GF(2^m) include representing a first polynomial and a second polynomial as an array of n words. A recursive algorithm may be used to iteratively decompose the multiplication into a weighted sum of smaller subproducts. When the size of the smaller subproducts is less than or equal to a predetermined size, a nonrecursive algorithm may be used to complete the multiplication. The nonrecursive algorithm may be optimized to efficiently perform the bottom-end multiplication. For example, pairs of redundant subproducts can be identified and excluded from the nonrecursive algorithm. Moreover, subproducts having weights in a special form may be efficiently calculated by a process that involves storing and reusing intermediate calculations.
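The recursive-then-nonrecursive structure described above can be sketched with a Karatsuba-style split over binary polynomials. This is an illustrative assumption, not the patented method: the abstract does not name Karatsuba, operands are packed into Python integers rather than arrays of n words, and `THRESHOLD_BITS` and the function names are hypothetical.

```python
THRESHOLD_BITS = 64  # hypothetical cut-over size for the nonrecursive routine


def clmul_schoolbook(a: int, b: int) -> int:
    """Nonrecursive shift-and-XOR product of two binary polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul_recursive(a: int, b: int) -> int:
    """Split operands in half and combine three smaller subproducts."""
    n = max(a.bit_length(), b.bit_length())
    if n <= THRESHOLD_BITS:
        return clmul_schoolbook(a, b)    # small enough: finish directly
    half = n // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    low = clmul_recursive(a_lo, b_lo)
    high = clmul_recursive(a_hi, b_hi)
    # In characteristic 2, subtraction is XOR, so the middle term is
    # (a_lo + a_hi)(b_lo + b_hi) + low + high.
    mid = clmul_recursive(a_lo ^ a_hi, b_lo ^ b_hi) ^ low ^ high
    # Weighted sum: low + mid * x^half + high * x^(2*half).
    return low ^ (mid << half) ^ (high << (2 * half))
```

The threshold trades recursion overhead against the quadratic cost of the bottom-end routine; a suitable value would depend on the word size and target platform.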
TL;DR: In this article, unification problems for two equational theories satisfied by modular multiplication and exponentiation are studied, and a new algorithm for computing strong Grobner bases of right ideals over the polynomial ring Z is proposed.
Abstract: Modular multiplication and exponentiation are common operations in modern cryptography. Unification problems with respect to some equational theories that these operations satisfy are investigated. Two different but related equational theories are analyzed. A unification algorithm is given for one of the theories which relies on solving syzygies over multivariate integral polynomials with noncommuting indeterminates. For the other theory, in which the distributivity property of exponentiation over multiplication is assumed, the unifiability problem is shown to be undecidable by adapting a construction developed by one of the authors to reduce Hilbert's 10th problem to the solvability problem for linear equations over semi-rings. A new algorithm for computing strong Grobner bases of right ideals over the polynomial ring Z is proposed; unlike earlier algorithms proposed by Baader as well as by Madlener and Reinert which work only for right admissible term orderings with the boundedness property, this algorithm works for any right admissible term ordering. The algorithms for some of these unification problems are expected to be integrated into Naval Research Lab.'s Protocol Analyzer (NPA), a tool developed by Catherine Meadows, which has been successfully used to analyze cryptographic protocols, particularly emerging standards such as the Internet Engineering Task Force's (IETF) Internet Key Exchange [11] and Group Domain of Interpretation [12] protocols. Techniques from several different fields - particularly symbolic computation (ideal theory and Groebner basis algorithms) and unification theory - are thus used to address problems arising in state-based cryptographic protocol analysis.
TL;DR: This work focuses on the automatic generation of circuits that involve constant matrix multiplication (CMM), i.e. multiplication of a vector by a constant matrix, and proposes a method based on number recoding and dedicated common sub-expression factorization algorithms for this purpose.
Abstract: We present some improvements on the optimization of hardware multiplication by constant matrices. We focus on the automatic generation of circuits that involve constant matrix multiplication (CMM), i.e. multiplication of a vector by a constant matrix. The proposed method, based on number recoding and dedicated common sub-expression factorization algorithms was implemented in a VHDL generator. The obtained results on several applications have been implemented on FPGAs and compared to previous solutions. Up to 40% area and speed savings are achieved.
TL;DR: A novel modulo (2^n + 1) addition algorithm leading to an area-time efficient implementation of this arithmetic operation on FPGAs is presented, and some improvements of the modulo (2^n + 1) multiplier are suggested in order to perform a multiplication in the group (Z*_{2^n+1}, ·).
Abstract: This paper is devoted to the study of number representations and algorithms leading to efficient implementations of modular adders and multipliers on recent field programmable arrays. Our hardware operators take advantage of the building blocks available in such devices: carry-propagate adders, memory blocks, and sometimes embedded multipliers. The first part of the paper describes three basic methodologies to carry out a modulo m addition and presents in more detail the design of modulo (2^n ± 1) adders. The major result is a novel modulo (2^n + 1) addition algorithm leading to an area-time efficient implementation of this arithmetic operation on FPGAs. The second part describes a modulo m multiplication algorithm involving small multipliers and memory blocks, and modulo (2^n + 1) multipliers based on Ma's algorithm. We also suggest some improvements of this operator in order to perform a multiplication in the group (Z*_{2^n+1}, ·).
TL;DR: In this paper, a solid state imager consists of an image area, an output register which receives signal charge from the image area and a separate multiplication register into which signal charge is transferred from the output register.
Abstract: A solid state imager arrangement includes an image area, an output register which receives signal charge from the image area, a separate multiplication register into which signal charge from the output register is transferred, means for obtaining signal charge multiplication by transferring the charge through a sufficiently high field in elements of the multiplication register, and an additional register into which excess signal charge is transferred.
TL;DR: This paper characterizes weak multiplication modules.
Abstract: In this paper we characterize weak multiplication modules.
TL;DR: This paper presents the design of parameterized fixed-point integer multiplication, squaring and fractional division units that are targeted at the Virtex-II family of FPGAs from Xilinx and are based on the small 18×18-bit multiplier blocks.
Abstract: This paper presents the design of parameterized fixed-point integer multiplication, squaring and fractional division units. The units are targeted at the Virtex-II family of FPGAs (field programmable gate arrays) from Xilinx and are based on the small 18×18-bit multiplier blocks. New partial product creation and summation techniques that exploit the low level primitives are used that achieve a 20% area and a 30% delay reduction for multiplication. A dedicated squaring component is presented that offers substantial area savings of up to 50%. The division component uses the multipliers for pre-scaling to reduce the delay and complexity of each minimally redundant radix-8 stage.
TL;DR: Multi-precision multiplication methods can be used in a variety of different applications (e.g., cryptography) and can be implemented in a number of software or hardware environments.
Abstract: Multi-precision multiplication methods include storing a first operand and a second operand as a first array and a second array of n words. A first weighted sum is determined from multiple subproducts of corresponding words of the first operand and the second operand. The methods may further include iteratively determining a next weighted sum from a previous weighted sum and a recursively calculated intermediate product. The disclosed methods can be used in a variety of different applications (e.g., cryptography) and can be implemented in a number of software or hardware environments.
TL;DR: This work considers rings modulo trinomials and 4-term polynomials, and shows that the multiplier is faster than multipliers over elements in a finite field defined by irreducible pentanomials.
Abstract: Elements of a finite field, GF(2^m), are represented as elements in a ring in which multiplication is more time efficient. This leads to faster multipliers with a modest increase in the number of XOR and AND gates needed to construct the multiplier. Such multipliers are used in error control coding and cryptography. We consider rings modulo trinomials and 4-term polynomials. In each case, we show that our multiplier is faster than multipliers over elements in a finite field defined by irreducible pentanomials. These results are especially significant in the field of elliptic curve cryptography, where pentanomials are used to define finite fields. Finally, an efficient systolic implementation of a multiplier for elements in a ring defined by x^n + x + 1 is presented.
TL;DR: In this article, the generalized Mersenne numbers are expressed in a polynomial form, p = f(t), where t is a power of 2, and it is shown that such p's lead to fast modular reduction methods which use only a few integer additions and subtractions.
Abstract: In 1999, Jerome Solinas introduced families of moduli called the generalized Mersenne numbers. The generalized Mersenne numbers are expressed in a polynomial form, p = f(t), where t is a power of 2. It is shown that such p’s lead to fast modular reduction methods which use only a few integer additions and subtractions. We further generalize this idea by allowing any integer for t. We show that more generalized Mersenne numbers still lead to a significant improvement over well-known modular multiplication techniques. While each generalized Mersenne number requires a dedicated implementation, more generalized Mersenne numbers allow flexible implementations that work for more than one modulus. We also show that it is possible to perform long integer modular arithmetic without using multiple precision operations when t is chosen properly. Moreover, based on our results, we propose efficient arithmetic methods for the XTR cryptosystem.
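The addition-only reduction can be illustrated with the classic power-of-2 case, the NIST P-192 prime p = 2^192 − 2^64 − 1, which has the form p = f(t) with t = 2^64 and f(t) = t^3 − t − 1. The sketch below covers only this original Solinas setting, not the paper's extension to arbitrary integer t. The name `reduce_p192` is an assumption for illustration.

```python
T_BITS = 64
T = 1 << T_BITS                     # t = 2^64
P192 = T**3 - T - 1                 # 2^192 - 2^64 - 1


def reduce_p192(c):
    """Reduce 0 <= c < p^2 modulo p using only additions and subtractions."""
    assert 0 <= c < P192 * P192
    # Split c into six 64-bit digits: c = sum(c_i * t^i).
    c0, c1, c2, c3, c4, c5 = [(c >> (T_BITS * i)) & (T - 1) for i in range(6)]
    # Since t^3 = t + 1 (mod p):  t^4 = t^2 + t  and  t^5 = t^2 + t + 1.
    s = (c0 + c3 + c5) + (c1 + c3 + c4 + c5) * T + (c2 + c4 + c5) * T**2
    # s is only a few multiples of p too large, so subtract until in range.
    while s >= P192:
        s -= P192
    return s
```

The multiplications by `T` here are just digit placement (shifts); no general multiplication or division by p is needed, which is the property the abstract attributes to these moduli.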
TL;DR: In this paper, an example of a matrix multiplication method that reduces calculation times on SIMD processors is described, which requires loading each diagonal of the multiplicand matrix c into a different register of a processor, and loading a multiplier matrix a into at least one register in column order.
Abstract: An example of a matrix multiplication method that reduces calculation times on SIMD processors is described. The matrix multiplication requires loading each diagonal of the multiplicand matrix c into a different register of a processor, and loading a multiplier matrix a into at least one register in column order. For the multiplication and addition steps, the elements in each column of multiplier matrix a in the register are selectively shifted by one element, with the last element of a column shifted to the front of the column. Diagonals of the multiplicand c matrix are multiplied by columns of the multiplier a matrix, with their product being added to the sum of products for columns of a result matrix.
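One way to read the diagonal scheme is sketched below in plain Python, with lists standing in for SIMD registers. It assumes square matrices, wrapped (cyclic) diagonals, and a result equal to c · a; these details and the names `wrapped_diagonals`, `rotate_down`, and `matmul_by_diagonals` are assumptions, since the abstract does not fix them.

```python
def wrapped_diagonals(c):
    """Diagonal k holds c[i][(i - k) % n] for each row i (one 'register' each)."""
    n = len(c)
    return [[c[i][(i - k) % n] for i in range(n)] for k in range(n)]


def rotate_down(col):
    """Shift a column by one element; the last element moves to the front."""
    return col[-1:] + col[:-1]


def matmul_by_diagonals(c, a):
    """Compute c @ a for square matrices using diagonal-times-column products."""
    n = len(c)
    diags = wrapped_diagonals(c)
    result_cols = []
    for j in range(n):
        col = [a[i][j] for i in range(n)]       # column j of a, in column order
        acc = [0] * n                           # sum of products for result column j
        for k in range(n):
            # Lane-wise multiply-add: one SIMD instruction on real hardware.
            acc = [s + d * x for s, d, x in zip(acc, diags[k], col)]
            col = rotate_down(col)
        result_cols.append(acc)
    # Transpose the list of columns back into row-major form.
    return [[result_cols[j][i] for j in range(n)] for i in range(n)]
```

After k rotations, lane i of the column holds the element that pairs with c[i][(i − k) mod n], so every lane does useful work in every step and no horizontal (across-lane) sum is needed.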
TL;DR: This work proposes efficient algorithms for common-multiplicand multiplications (CMM) and exponentiations of large integers with large modulus and shows how the Hamming weight of an integer can be reduced by performing complements.
Abstract: The multiplications of common multiplicands and exponentiations of large integers with large modulus are the primary computation operations in several well-known public key cryptosystems. The Hamming weight of the multiplier or the exponent plays an important role for computation efficiency. By performing complements, the Hamming weight of an integer can be reduced. Based on this concept, we propose efficient algorithms for common-multiplicand multiplications (CMM) and exponentiations. In the average case, it takes k/2 + 2×log(k) + 5 k-bit additions to compute the CMM. For exponentiation, the proposed method takes 5k/4+2 multiplications on average, but the pre-computation for a modular multiplicative inverse is required. When combined with the original CMM method, the number of multiplications can be further reduced to 9k/8+2.
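The complement idea can be illustrated with the identity e = 2^k − 1 − ē, where ē is the bitwise complement of a k-bit exponent e, so that x^e = x^(2^k) · x^(−1) · (x^(−1))^ē (mod n). The sketch below is not the paper's algorithm and does not reproduce its operation counts; it only shows how a high-weight exponent is traded for a low-weight one at the price of one modular inverse. The names `binary_pow` and `pow_via_complement` are hypothetical, and `pow(x, -1, n)` requires Python 3.8 or later and gcd(x, n) = 1.

```python
def binary_pow(base, exponent, modulus):
    """Left-to-right square-and-multiply; returns (value, multiply count)."""
    result = 1 % modulus
    mults = 0
    for bit in bin(exponent)[2:]:
        result = result * result % modulus
        if bit == "1":              # one extra multiplication per set bit
            result = result * base % modulus
            mults += 1
    return result, mults


def pow_via_complement(x, e, k, n):
    """Compute x^e mod n, using the complement of e when e has many 1 bits."""
    assert 0 <= e < (1 << k)
    if bin(e).count("1") <= k // 2:
        return binary_pow(x, e, n)[0]           # already low weight
    e_bar = e ^ ((1 << k) - 1)                  # k-bit complement of e
    x_inv = pow(x, -1, n)                       # pre-computed modular inverse
    x_2k = x % n
    for _ in range(k):                          # x^(2^k) by k squarings
        x_2k = x_2k * x_2k % n
    # e = 2^k - 1 - e_bar  =>  x^e = x^(2^k) * x^-1 * (x^-1)^e_bar
    return x_2k * x_inv * binary_pow(x_inv, e_bar, n)[0] % n
```

For the 8-bit exponent `0b11110111` the direct method performs seven data-dependent multiplications, while its complement `0b00001000` needs one.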
TL;DR: An SCA-resistant scalar multiplication method that can take any number of pre-computed points; it targets SPA and therefore differs from schemes designed to resist DPA.
Abstract: The elliptic curve cryptosystem (ECC) is well suited to implementation in memory-constrained environments due to its small key size. However, side channel attacks (SCA) can break the secret key of ECC on such devices if the implementation method is not carefully considered. The scalar multiplication of ECC is particularly vulnerable to SCA. In this paper we propose an SCA-resistant scalar multiplication method that is allowed to take any number of pre-computed points. The proposed scheme essentially intends to resist simple power analysis (SPA), not differential power analysis (DPA). It therefore differs from the other schemes designed for resisting DPA. The previous SPA countermeasures based on window methods use fixed-pattern windows, so they admit only discrete table sizes. The optimal size is 2^(w−1) for w = 2, 3, ..., which was proposed by Okeya and Takagi. We take a different approach. The key idea is to randomly (but with fixed probability) generate two different patterns based on pre-computed points. The two distributions are indistinguishable from the viewpoint of SPA. The proposed probabilistic scheme gives more flexibility in generating the pre-computed points: the designer of smart cards can freely choose the table size without restraint.