Top 327 papers published on the topic of Multiplication in 2010

Showing papers on "Multiplication" published in 2010

Book Chapter•10.1007/978-3-642-13190-5_2•

Fully homomorphic encryption over the integers


Marten van Dijk¹, Craig Gentry², Shai Halevi², Vinod Vaikuntanathan²•Institutions (2)

Massachusetts Institute of Technology¹, IBM²

30 May 2010

TL;DR: A simple fully homomorphic encryption scheme is constructed using only elementary modular arithmetic. Its security is reduced to the problem of finding an approximate integer gcd, and the hardness of that problem is investigated, building on earlier work of Howgrave-Graham.


Abstract: We construct a simple fully homomorphic encryption scheme, using only elementary modular arithmetic. We use Gentry’s technique to construct a fully homomorphic scheme from a “bootstrappable” somewhat homomorphic scheme. However, instead of using ideal lattices over a polynomial ring, our bootstrappable encryption scheme merely uses addition and multiplication over the integers. The main appeal of our scheme is the conceptual simplicity. We reduce the security of our scheme to finding an approximate integer gcd – i.e., given a list of integers that are near-multiples of a hidden integer, output that hidden integer. We investigate the hardness of this task, building on earlier work of Howgrave-Graham.
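
The idea of hiding a bit in a near-multiple of a secret integer can be illustrated with a toy symmetric variant. The sketch below is only an illustration under stated assumptions: the function names, the parameter sizes (64-bit key, 8-bit noise) and the use of Python's non-cryptographic `random` module are all hypothetical choices, it is not secure, and it omits bootstrapping entirely.

```python
import random

def keygen(rng, bits=64):
    # Secret key: a random odd integer p with its top bit set (toy size).
    return rng.getrandbits(bits) | (1 << (bits - 1)) | 1

def encrypt(rng, p, bit, noise_bits=8, q_bits=128):
    # The ciphertext is a near-multiple of p; the small offset 2r + bit
    # carries the plaintext bit in its parity.
    q = rng.getrandbits(q_bits)
    r = rng.randrange(-(1 << noise_bits), 1 << noise_bits)
    return p * q + 2 * r + bit

def decrypt(p, c):
    # Reduce c into (-p/2, p/2], then read the parity of the noise term.
    noise = c % p
    if noise > p // 2:
        noise -= p
    return noise % 2

rng = random.Random(2010)  # fixed seed so the sketch is reproducible
p = keygen(rng)
c0, c1 = encrypt(rng, p, 0), encrypt(rng, p, 1)
print(decrypt(p, c0 + c1))  # integer addition acts as XOR on the bits
print(decrypt(p, c0 * c1))  # integer multiplication acts as AND on the bits
```

Decryption stays correct only while the accumulated noise remains much smaller than p, which is why the scheme is "somewhat" homomorphic before bootstrapping is applied.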


2,001 citations

Dissertation•

On the complexity of matrix multiplication


Andrew James Stothers

1 Jan 2010

TL;DR: Starting from Strassen's bound ω ≤ log₂ 7 < 2.8074 on the exponent of matrix multiplication, this thesis surveys techniques for reducing ω and examines how cubing and raising to the fourth power Coppersmith and Winograd's "complicated" algorithm affect the value of ω.


Abstract: The evaluation of the product of two matrices can be very computationally expensive. The multiplication of two n×n matrices, using the “default” algorithm can take O(n³) field operations in the underlying field k. It is therefore desirable to find algorithms to reduce the “cost” of multiplying two matrices together. If multiplication of two n×n matrices can be obtained in O(n^α) operations, the least upper bound for α is called the exponent of matrix multiplication and is denoted by ω. A bound for ω < 3 was found in 1968 by Strassen in his algorithm. He found that multiplication of two 2×2 matrices could be obtained in 7 multiplications in the underlying field k, as opposed to the 8 required to do the same multiplication previously. Using recursion, we are able to show that ω ≤ log₂ 7 < 2.8074, which is better than the value of 3 we had previously. In chapter 1, we look at various techniques that have been found for reducing ω. These include Pan’s Trilinear Aggregation, Bini’s Border Rank and Schönhage’s Asymptotic Sum inequality. In chapter 2, we look in detail at the current best estimate of ω found by Coppersmith and Winograd. We also propose a different method of evaluating the “value” of trilinear forms. Chapters 3 and 4 build on the work of Coppersmith and Winograd and examine how cubing and raising to the fourth power of Coppersmith and Winograd’s “complicated” algorithm affect the value of ω, if at all. Finally, in chapter 5, we look at the Group-Theoretic context proposed by Cohn and Umans, and see how we can derive some of Coppersmith and Winograd’s values using this method, as well as showing how working in this context can perhaps be more conducive to showing ω = 2.
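
The 7-multiplication step attributed to Strassen above can be written out directly. The following is a minimal sketch of the standard textbook formulas for the 2×2 case (the function name is illustrative); applying the same step recursively to block matrices is what yields the exponent log₂ 7.

```python
import math

def strassen_2x2(A, B):
    # Multiply two 2x2 matrices using 7 multiplications instead of 8.
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine the seven products using additions and subtractions only.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
print(math.log2(7))  # exponent obtained by recursing on 2x2 blocks
```

Because the formulas never rely on commutativity of the entries, the entries may themselves be matrix blocks, which is what makes the recursion valid.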


245 citations

Book•

Modern Computer Arithmetic


Richard P. Brent, Paul Zimmermann

27 Dec 2010

TL;DR: Brent and Zimmermann present arbitrary-precision arithmetic algorithms that are ready to implement in your favorite language, while keeping a high-level description and avoiding too low-level or machine-dependent details.


Abstract: Modern Computer Arithmetic focuses on arbitrary-precision algorithms for efficiently performing arithmetic operations such as addition, multiplication and division, and their connections to topics such as modular arithmetic, greatest common divisors, the Fast Fourier Transform (FFT), and the computation of elementary and special functions. Brent and Zimmermann present algorithms that are ready to implement in your favorite language, while keeping a high-level description and avoiding too low-level or machine-dependent details. The book is intended for anyone interested in the design and implementation of efficient high-precision algorithms for computer arithmetic, and more generally efficient multiple-precision numerical algorithms. It may also be used in a graduate course in mathematics or computer science, for which exercises are included. These vary considerably in difficulty, from easy to small research projects, and expand on topics discussed in the text. Solutions are available from the authors.


210 citations

Journal Article•10.5121/IJCNC.2010.2405•

Delay-power performance comparison of multipliers in VLSI circuit design


Sumit R. Vaidya, Deepak R. Dandekar

09 Jul 2010-International Journal of Computer Networks & Communications

TL;DR: A comparative study of different multipliers is carried out with respect to low power and high speed, and the “Urdhva Tiryakbhyam” algorithm of Ancient Indian Vedic Mathematics is described as a way to improve the speed and area parameters of multipliers.


Abstract: A typical central processing unit devotes a considerable amount of processing time to performing arithmetic operations, particularly multiplication operations. Multiplication is one of the basic arithmetic operations and it requires substantially more hardware resources and processing time than addition and subtraction. In fact, 8.72% of all the instructions in typical processing units are multiplications. In this paper, a comparative study of different multipliers is done for low power requirement and high speed. The paper gives information on the “Urdhva Tiryakbhyam” algorithm of Ancient Indian Vedic Mathematics, which is utilized for multiplication to improve the speed and area parameters of multipliers. Vedic Mathematics suggests one more formula for multiplication of large numbers, i.e. “Nikhilam Sutra”, which can increase the speed of the multiplier by reducing the number of iterations.
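
The "vertically and crosswise" idea behind Urdhva Tiryakbhyam can be sketched in software. This is a hedged decimal illustration only: the function name is hypothetical, and the hardware designs compared in the paper work on binary operands with the column sums generated in parallel, which a sequential Python loop does not capture.

```python
def urdhva_multiply(x, y):
    # Digits of the non-negative operands, least significant first.
    a = [int(d) for d in reversed(str(x))]
    b = [int(d) for d in reversed(str(y))]
    # Column k collects every cross product a[i] * b[j] with i + j == k.
    # The columns are independent of each other, so hardware can form them in parallel.
    columns = [0] * (len(a) + len(b) - 1)
    for i, da in enumerate(a):
        for j, db in enumerate(b):
            columns[i + j] += da * db
    # A single carry-propagation pass turns the column sums into digits.
    digits, carry = [], 0
    for s in columns:
        carry, digit = divmod(s + carry, 10)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, 10)
        digits.append(digit)
    return int("".join(str(d) for d in reversed(digits)))

print(urdhva_multiply(47, 76))  # 3572
```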


143 citations

Patent•

Multipliers with a reduced number of memory blocks


Colman Cheung¹•Institutions (1)

Altera¹

3 Mar 2010

TL;DR: Techniques for implementing multipliers using memory blocks in an integrated circuit (IC) are provided. The disclosed techniques may reduce the number of memory blocks required to implement various multiplication operations.


Abstract: Techniques for implementing multipliers using memory blocks in an integrated circuit (IC) are provided. The disclosed techniques may reduce the number of memory blocks required to implement various multiplication operations. A plurality of generated products is normalized. The normalized products are scaled to generate a plurality of scaled products. Scaled products with the least root mean square (RMS) error are identified. The scaled products with the least RMS error are then stored in a plurality of memory blocks in an IC. The scaled products may have a reduced number of bits compared to the plurality of generated products that have not been normalized and scaled.


122 citations

Journal Article•10.1016/J.COGNITION.2010.05.003•

Core Multiplication in Childhood


Koleen McCrink¹, Elizabeth S. Spelke²•Institutions (2)

Columbia University¹, Harvard University²

01 Aug 2010-Cognition

TL;DR: Evidence is provided for an untrained, intuitive process of calculating multiplicative numerical relationships, providing a further foundation for formal arithmetic instruction.


111 citations

Journal Article•10.3758/MC.38.3.322•

Strategy switch costs in arithmetic problem solving.


Patrick Lemaire¹, Mireille Lecacheur¹•Institutions (1)

University of Provence¹

01 Apr 2010-Memory & Cognition

TL;DR: Strategy switch costs (poorer performance when different strategies are used on two consecutive trials) were found when participants executed the easiest strategy and when they solved easy problems; the implications for models of strategy choices are discussed.


Abstract: Three experiments tested whether switching between strategies involves a cost. In three experiments, participants had to give approximate products to two-digit multiplication problems (e.g., 47 x 76). They were told which strategy to use (Experiments 1 and 2) or could choose among strategies (Experiment 3). The participants showed poorer performance when they used different strategies on two consecutive trials than when they used the same strategy. They also used the same strategy over two consecutive problems more often than they used different strategies. These effects, termed strategy switch costs, were found when the participants executed the easiest strategy and when they solved easy problems. We discuss possible processes underlying these strategy switch costs and the implications of these strategy switch costs for models of strategy choices.


106 citations

Journal Article•10.1109/TC.2009.167•

Improved Design of High-Performance Parallel Decimal Multipliers


Alvaro Vazquez¹, Elisardo Antelo², Paolo Montuschi•Institutions (2)

École normale supérieure de Lyon¹, University of Santiago de Compostela²

01 May 2010-IEEE Transactions on Computers

TL;DR: The proposed architectures of two parallel decimal multipliers have interesting area-delay figures compared to conventional Booth radix-4 and radix-8 parallel binary multipliers and outperform the figures of previous alternatives for decimal multiplication.


Abstract: The new generation of high-performance decimal floating-point units (DFUs) is demanding efficient implementations of parallel decimal multipliers. In this paper, we describe the architectures of two parallel decimal multipliers. The parallel generation of partial products is performed using signed-digit radix-10 or radix-5 recodings of the multiplier and a simplified set of multiplicand multiples. The reduction of partial products is implemented in a tree structure based on a decimal multioperand carry-save addition algorithm that uses unconventional (non BCD) decimal-coded number systems. We further detail these techniques and present the new improvements to reduce the latency of the previous designs, which include: optimized digit recoders for the generation of 2n-tuples (and 5-tuples), decimal carry-save adders (CSAs) combining different decimal-coded operands, and carry-free adders implemented by specially designed bit counters. Moreover, we detail a design methodology that combines all these techniques to obtain efficient reduction trees with different area and delay trade-offs for any number of partial products generated. Evaluation results for 16-digit operands show that the proposed architectures have interesting area-delay figures compared to conventional Booth radix-4 and radix-8 parallel binary multipliers and outperform the figures of previous alternatives for decimal multiplication.


99 citations

Journal Article•10.3934/AMC.2010.4.169•

Efficient implementation of elliptic curve cryptography in wireless sensors


Diego F. Aranha, Ricardo Dahab, Julio López, Leonardo B. Oliveira

01 May 2010-Advances in Mathematics of Communications

TL;DR: The results strongly indicate that binary curves are the most efficient alternative for the implementation of elliptic curve cryptography in the MICAz Mote, a popular sensor platform.


Abstract: The deployment of cryptography in sensor networks is a challenging task, given the limited computational power and the resource-constrained nature of the sensoring devices. This paper presents the implementation of elliptic curve cryptography in the MICAz Mote, a popular sensor platform. We present optimization techniques for arithmetic in binary fields, including squaring, multiplication and modular reduction at two different security levels. Our implementation of field multiplication and modular reduction algorithms focuses on the reduction of memory accesses and appears as the fastest result for this platform. Finite field arithmetic was implemented in C and Assembly and elliptic curve arithmetic was implemented in Koblitz and generic binary curves. We illustrate the performance of our implementation with timings for key agreement and digital signature protocols. In particular, a key agreement can be computed in 0.40 seconds and a digital signature can be computed and verified in 1 second at the 163-bit security level. Our results strongly indicate that binary curves are the most efficient alternative for the implementation of elliptic curve cryptography in this platform.
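
Multiplication with modular reduction in a binary field can be sketched with a simple shift-and-XOR loop. This is a generic textbook method, not the memory-access-optimized algorithm of the paper; the function name is illustrative, and the choice of the reduction polynomial x^163 + x^7 + x^6 + x^3 + 1 (the one used for the NIST 163-bit binary curves) is an assumption about which field is meant.

```python
def gf2m_mul(a, b, modulus):
    # Field elements are integers whose bits are polynomial coefficients over GF(2).
    # `modulus` encodes the irreducible polynomial; both inputs must have degree < m.
    m = modulus.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2)[x] is XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> m) & 1:
            a ^= modulus         # reduce as soon as the degree reaches m
    return result

# Assumed reduction polynomial for a 163-bit binary field.
F163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1

x = 0x1234567890ABCDEF
y = 0xFEDCBA0987654321
print(gf2m_mul(x, y, F163) == gf2m_mul(y, x, F163))  # True: multiplication commutes

# Small sanity check in GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1.
print(hex(gf2m_mul(0x57, 0x83, 0x11B)))  # 0xc1
```

Squaring is the special case `gf2m_mul(a, a, modulus)`, although dedicated squaring routines are much cheaper in binary fields because squaring merely spreads the bits of the operand before reduction.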


74 citations

Proceedings Article•10.1109/IJCNN.2010.5596736•

Commutative quaternion and multistate Hopfield neural networks


Teijiro Isokawa¹, Haruhiko Nishimura¹, Nobuyuki Matsui¹•Institutions (1)

University of Hyogo¹

18 Jul 2010

TL;DR: Two types of multistate Hopfield neural networks are explored, based on commutative quaternions, which are similar to Hamilton's quaternions but have commutative multiplication. The stability of both networks is investigated, i.e., their energies decrease monotonically with respect to changes of the network states.


Abstract: This paper explores two types of multistate Hopfield neural networks, based on commutative quaternions that are similar to Hamilton's quaternions but with commutative multiplication. In one type of the networks, the state of a neuron is represented by two kinds of phases and one real number. The other type of the networks adopts the decomposed form of commutative quaternion, i.e., the state of a neuron consists of a combination of two complex values. We have investigated the stabilities of these networks, i.e., the energies monotonically decrease with respect to the changes of the network states.
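
The decomposed form mentioned above, in which a commutative quaternion is held as two complex values, can be sketched as follows. The abstract does not state the multiplication rules, so the sketch assumes the usual convention for commutative quaternions (i² = k² = −1, j² = +1, ij = ji = k); the function name and the pair representation are illustrative.

```python
def cq_mul(p, q):
    # p = (z1, z2) stands for z1 + z2*j, where z1 and z2 are complex numbers
    # in the unit i, and j*j = +1 (assumed convention, not taken from the abstract).
    z1, z2 = p
    w1, w2 = q
    return (z1 * w1 + z2 * w2, z1 * w2 + z2 * w1)

# Basis elements in the pair representation; k is defined as i*j.
ONE, I, J, K = (1, 0), (1j, 0), (0, 1), (0, 1j)

print(cq_mul(I, J) == cq_mul(J, I) == K)  # True: the product does not depend on order
print(cq_mul(J, J) == ONE)                # True: j squares to +1, unlike Hamilton's j
```

The formula is symmetric in its two arguments, so commutativity holds for all elements and not only for the basis units.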


70 citations

Posted Content•

Memristor-based Circuits for Performing Basic Arithmetic Operations


Farnood Merrikh-Bayat¹, Saeed Bagheri Shouraki¹•Institutions (1)

Sharif University of Technology¹

20 Aug 2010-arXiv: Hardware Architecture

TL;DR: In this paper, signals are represented and stored through the memristance of the memristor instead of through voltages or currents, and a new circuit is designed for programming the memristance of a memristor with a predetermined analog value.


Abstract: In almost all of the currently working circuits, especially in analog circuits implementing signal processing applications, basic arithmetic operations such as multiplication, addition, subtraction and division are performed on values which are represented by voltages or currents. However, in this paper, we propose a new and simple method for performing analog arithmetic operations in which signals are represented and stored through the memristance of the newly found circuit element, i.e. the memristor, instead of voltage or current. Some of these operators, such as the divider and the multiplier, are much simpler and faster than their equivalent voltage-based circuits and they require less chip area. In addition, a new circuit is designed for programming the memristance of the memristor with a predetermined analog value. Presented simulation results demonstrate the effectiveness and the accuracy of the proposed circuits.


Journal Article•10.1016/J.CNSNS.2009.03.022•

A chaos secure communication scheme based on multiplication modulation


Kia Fallahi¹, Henry Leung¹•Institutions (1)

University of Calgary¹

01 Feb 2010-Communications in Nonlinear Science and Numerical Simulation

TL;DR: A secure spread spectrum communication scheme is presented that uses multiplication modulation, i.e., it multiplies the message by a chaotic signal. The scheme lends itself to cheap implementation and can therefore be used effectively for ensuring security and privacy in commercial consumer electronics products.


Proceedings Article•10.1109/ACSSC.2010.5757715•

Memristor-based arithmetic


K'andrea C. Bickerstaff, Earl E. Swartzlander¹•Institutions (1)

University of Texas at Austin¹

1 Dec 2010

TL;DR: An overview of both analog and digital approaches offered in the literature for addition and multiplication is given, and memristor-based designs of an adder and a multiplier are presented.


Abstract: This paper describes strategies for performing arithmetic operations in memristor-based structures. An overview of both analog and digital approaches offered in the literature for addition and multiplication will be described. Memristor-based designs of an adder and a multiplier are presented.


Book Chapter•10.1007/978-3-642-14712-8_9•

Efficient software implementation of binary field arithmetic using vector instruction sets


Diego F. Aranha¹, Julio López¹, Darrel Hankerson²•Institutions (2)

State University of Campinas¹, Auburn University²

8 Aug 2010

TL;DR: An efficient software implementation of characteristic 2 fields is described that makes extensive use of vector instruction sets commonly found in desktop processors and follows the trend of accelerating implementations of cryptography through PTLU-style instructions.


Abstract: In this paper we describe an efficient software implementation of characteristic 2 fields making extensive use of vector instruction sets commonly found in desktop processors. Field elements are represented in a split form so performance-critical field operations can be formulated in terms of simple operations over 4-bit sets. In particular, we detail techniques for implementing field multiplication, squaring, square root extraction and present a constant-memory lookup-based multiplication strategy. Our representation makes extensive use of the parallel table lookup (PTLU) instruction recently introduced in popular desktop platforms and follows the trend of accelerating implementations of cryptography through PTLU-style instructions. We present timings for several binary fields commonly employed for curve-based cryptography and illustrate the presented techniques with executions of the ECDH and ECDSA protocols over binary curves at the 128-bit and 256-bit security levels standardized by NIST. Our implementation results are compared with publicly available benchmarking data.


Journal Article•10.1112/JTOPOL/JTS032•

The Multiplication on BP


Maria Basterra, Michael A. Mandell¹•Institutions (1)

Indiana University¹

30 Dec 2010-arXiv: Algebraic Topology

TL;DR: BP is an E4 ring spectrum, and the E4 structure is unique up to automorphism.


Abstract: BP is an E4 ring spectrum. The E4 structure is unique up to automorphism.


Book•

Matters Computational: Ideas, Algorithms, Source Code


Jörg Arndt

20 Oct 2010

TL;DR: The book covers low-level algorithms (bit wizardry, permutations, sorting and searching, data structures), combinatorial generation, fast transforms, fast arithmetic, and algorithms for finite fields.


Abstract: Low level algorithms.- Bit wizardry.- Permutations and their operations.- Sorting and searching.- Data structures.- Combinatorial generation.- Conventions and considerations.- Combinations.- Compositions.- Subsets.- Mixed radix numbers.- Permutations.- Multisets.- Gray codes for string with restrictions.- Parenthesis strings.- Integer partitions.- Set partitions.- Necklaces and Lyndon words.- Hadamard and conference matrices.- Searching paths in directed graphs.- Fast transforms.- The Fourier transform.- Convolution, correlation, and more FFT algorithms.- The Walsh transform and its relatives.- The Haar transform.- The Hartley transform.- Number theoretic transforms (NTTs).- Fast wavelet transforms.- Fast arithmetic.- Fast multiplication and exponentiation.- Root extraction.- Iterations for the inversion of a function.- The AGM, elliptic integrals, and algorithms for computing.- Logarithm and exponential function.- Computing the elementary functions with limited resources.- Numerical evaluation of power series.- Cyclotomic polynomials, product forms, and continued fractions.- Synthetic Iterations.- Algorithms for finite fields.- Modular arithmetic and some number theory.- Binary polynomials.- Shift registers.- Binary finite fields.- The electronic version of the book.- Machine used for benchmarking.- The GP language.- Bibliography.- Index.


Journal Article•10.1016/J.JCO.2009.11.002•

On multiplication in finite fields


Murat Cenk¹, Ferruh Özbudak²•Institutions (2)

Çankaya University¹, Middle East Technical University²

01 Apr 2010-Journal of Complexity

TL;DR: The method generalizes some earlier methods and combines them with the recently introduced complexity notion $\hat{M}_q(\ell)$, which denotes the minimum number of multiplications needed to obtain the coefficients of the product of two arbitrary $\ell$-term polynomials modulo $x^{\ell}$ in $\mathbb{F}_q[x]$.


Proceedings Article•10.1109/ISCAS.2010.5537658•

Minimal Logic Depth adder tree optimization for Multiple Constant Multiplication


Mathias Faust¹, Chip-Hong Chang¹•Institutions (1)

Nanyang Technological University¹

3 Aug 2010

TL;DR: A minimal logic depth graph dependent (GD) algorithm is presented which requires no lookup table and, for all filters tested, consumes less switching power than the latest logic depth (LD) constrained GD methods, based on the Glitch Path Count and Glitch Path Score metrics.


Abstract: Research on optimization of fixed coefficient FIR filters modeled as Multiple Constant Multiplication (MCM) has been ongoing for two decades. An analysis of Minimal Signed Digit (MSD) reveals that potential good solutions are omitted by Common Subexpression Elimination (CSE) algorithms as they are hidden in the MSD representations. Some CSE algorithms ensure that all coefficients are implemented at minimal Logic Depth (LD), which is advantageous from a power saving perspective. Imposing this requirement on a graph dependent (GD) algorithm reduces the search space as well as the runtime. It also eliminates the long critical path of the GD algorithm. This paper presents a minimal logic depth GD algorithm which requires no lookup table. Simulation results show that it has a lower number of adders than CSE algorithms while having the minimal logic depth. For all filters tested, it consumes less switching power than the latest LD constrained GD methods based on the Glitch Path Count and Glitch Path Score metrics.
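
As background for the signed-digit representations mentioned above, the sketch below recodes a single constant into non-adjacent form (one particular minimal signed digit representation) and multiplies by it using only shifts, additions and subtractions. It illustrates the single-constant building block only; it is not the paper's GD algorithm and performs no sharing of subexpressions across several constants. Function names are illustrative.

```python
def signed_digits(c):
    # Non-adjacent form of a non-negative constant, least significant digit first.
    # Every digit is -1, 0 or +1, and no two adjacent digits are both non-zero.
    digits = []
    while c > 0:
        if c & 1:
            d = 2 - (c % 4)      # +1 if c = 1 (mod 4), -1 if c = 3 (mod 4)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits

def shift_add_multiply(x, c):
    # Multiply x by the constant c using one shifted add or subtract per non-zero digit.
    total = 0
    for i, d in enumerate(signed_digits(c)):
        if d == 1:
            total += x << i
        elif d == -1:
            total -= x << i
    return total

print(signed_digits(7))           # [-1, 0, 0, 1], i.e. 7 = 8 - 1
print(shift_add_multiply(5, 23))  # 115
```

The number of adders for one constant is the number of non-zero digits minus one, so 23 = 32 − 8 − 1 needs two adders here, versus three for its plain binary form 10111.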


Journal Article•10.1002/PIP.943•

A new analog MPPT technique: TEODI


Nicola Femia¹, Giovanni Petrone¹, Giovanni Spagnuolo¹, Massimo Vitelli•Institutions (1)

University of Salerno¹

01 Jan 2010-Progress in Photovoltaics

TL;DR: A numerical analysis, carried out by using the Perturb and Observe MPPT technique as a benchmark reference, confirms the validity of the proposed approach.


Abstract: A new analog maximum power point tracking (MPPT) technique is presented and discussed. Its main advantages are simplicity of implementation, absence of memory and multiplication operations, and the high MPPT efficiency obtainable under both stationary and time-varying atmospheric conditions. A numerical analysis, carried out by using the Perturb and Observe MPPT technique as a benchmark reference, confirms the validity of the proposed approach.


Journal Article•10.1364/AO.49.002352•

Carry-free vector-matrix multiplication on a dynamically reconfigurable optical platform


Xianchao Wang¹, Junjie Peng¹, Mei Li², Zhangyi Shen¹, Ouyang Shan¹ +1 more•Institutions (2)

Shanghai University¹, Northwestern Polytechnical University²

20 Apr 2010-Applied Optics

TL;DR: This work investigates a novel optical VMM (OVMM) using five logic operations with the modified signed-digit (MSD) number system and proposes a new implementation method that can be used to realize the MSD multiplication in parallel.


Abstract: Applying the parallelism of optical computing, we present a novel method of vector-matrix multiplication (VMM) based on a new optical computing platform, the ternary optical computer, which can reconfigure any two-input trivalued logic optical processor at runtime, according to the decrease-radix design principle. In this work, we investigate a novel optical VMM (OVMM) using five logic operations with the modified signed-digit (MSD) number system. To simplify the computation process, we realize a carry-free optical addition in three steps, which is independent of the length of the operands. And a new implementation method is proposed that can be used to realize the MSD multiplication in parallel. Based on the generation of partial products in parallel and the binary-addition-tree algorithm, the multiplication can be implemented with the MSD addition. Our initial experiments have been performed to verify the proposed OVMM method. The results show that the proposed method of OVMM is feasible and correct.


Book•

Finite Precision Number Systems and Arithmetic


Peter Kornerup¹, David W. Matula²•Institutions (2)

University of Southern Denmark¹, Southern Methodist University²

8 Nov 2010

TL;DR: This comprehensive reference provides researchers with the thorough understanding of number representations that is a necessary foundation for designing efficient arithmetic algorithms.


Abstract: Fundamental arithmetic operations support virtually all of the engineering, scientific, and financial computations required for practical applications, from cryptography, to financial planning, to rocket science. This comprehensive reference provides researchers with the thorough understanding of number representations that is a necessary foundation for designing efficient arithmetic algorithms. Using the elementary foundations of radix number systems as a basis for arithmetic, the authors develop and compare alternative algorithms for the fundamental operations of addition, multiplication, division, and square root with precisely defined roundings. Various finite precision number systems are investigated, with the focus on comparative analysis of practically efficient algorithms for closed arithmetic operations over these systems. Each chapter begins with an introduction to its contents and ends with bibliographic notes and an extensive bibliography. The book may also be used for graduate teaching: problems and exercises are scattered throughout the text and a solutions manual is available for instructors.


Book•

Global Church Planting: Biblical Principles and Best Practices for Multiplication


Craig Ott

1 Dec 2010

TL;DR: A review of a comprehensive single volume covering the various aspects of church planting, in which nearly everyone who has written about church planting in recent years is either quoted or alluded to.


Abstract: Wow! This book seems to have it all. Has there ever been so much written in one volume about the various aspects of church planting? Just holding the volume in my hand was a bit intimidating. Nearly everyone who has written about church planting in recent years is either quoted or alluded to in this work. OK so I was a little overwhelmed in the beginning. However, I took heart, began at the beginning and soldiered through. Below are my thoughts as I read.


Journal Article•10.4218/ETRIJ.10.0109.0232•

A Low-Complexity 128-Point Mixed-Radix FFT Processor for MB-OFDM UWB Systems


Sang-In Cho, Kyu-Min Kang

05 Feb 2010-Etri Journal

TL;DR: The implementation results show that the proposed 128-point mixed-radix FFT architecture significantly reduces the hardware cost and power consumption in comparison to existing 128-point FFT architectures.


Abstract: In this paper, we present a fast Fourier transform (FFT) processor with four parallel data paths for multiband orthogonal frequency-division multiplexing ultrawideband systems. The proposed 128-point FFT processor employs both a modified radix-2⁴ algorithm and a radix-2³ algorithm to significantly reduce the numbers of complex constant multipliers and complex Booth multipliers. It also employs substructure-sharing multiplication units instead of constant multipliers to efficiently conduct multiplication operations with only addition and shift operations. The proposed FFT processor is implemented and tested using 0.18 µm CMOS technology with a supply voltage of 1.8 V. The hardware-efficient 128-point FFT processor with four data streams can support a data processing rate of up to 1 Gsample/s while consuming 112 mW. The implementation results show that the proposed 128-point mixed-radix FFT architecture significantly reduces the hardware cost and power consumption in comparison to existing 128-point FFT architectures.


Proceedings Article•10.1109/IC-NC.2010.56•

An RSA Encryption Hardware Algorithm Using a Single DSP Block and a Single Block RAM on the FPGA


Bo Song¹, Kensuke Kawakami¹, Koji Nakano¹, Yasuaki Ito¹•Institutions (1)

Hiroshima University¹

17 Nov 2010

TL;DR: Since the circuit uses only one DSP48E1 block and one Block RAM, the implementation is close to optimal in the sense that it has only less than 3% overhead in multiplication and no further improvement is possible as long as Montgomery multiplication based algorithm is used.


Abstract: The main contribution of this paper is to present an efficient hardware algorithm for RSA encryption/decryption based on Montgomery multiplication. Modern FPGAs have a number of embedded DSP blocks (DSP48E1) and embedded memory blocks (BRAM). Our hardware algorithm supporting 2048-bit RSA encryption/decryption is designed to be implemented using one DSP48E1, one BRAM and few logic blocks (slices) in the Xilinx Virtex-6 family FPGA. The implementation results showed that our RSA module for 2048-bit RSA encryption/decryption runs in 277.26ms. Quite surprisingly, the multiplier in DSP48E1 used to compute Montgomery multiplication works in more than 97% clock cycles over all clock cycles. Hence, our implementation is close to optimal in the sense that it has only less than 3% overhead in multiplication and no further improvement is possible as long as Montgomery multiplication based algorithm is used. Also, since our circuit uses only one DSP48E1 block and one Block RAM, we can implement a number of RSA modules in an FPGA that can work in parallel to attain high throughput RSA encryption/decryption.
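
Montgomery multiplication, on which the hardware algorithm above is based, replaces division by the modulus with shifts and masks. The sketch below is a plain software version of the textbook REDC reduction with a simple square-and-multiply loop; it is not the paper's DSP48E1/BRAM design, the function names are illustrative, and the toy RSA key (n = 3233, e = 17, d = 2753) is an assumed example far too small for real use. It needs Python 3.8 or later for `pow(n, -1, r)`.

```python
def montgomery_setup(n):
    # n must be odd. Choose R = 2^k > n and precompute n' = -n^(-1) mod R.
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r
    return k, r, n_prime

def redc(t, n, k, n_prime):
    # Montgomery reduction: returns t * R^(-1) mod n for 0 <= t < R * n,
    # using only masking and shifting by k bits instead of division by n.
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k
    return u - n if u >= n else u

def mont_mul(a_bar, b_bar, n, k, n_prime):
    # Product of two values that are already in Montgomery form.
    return redc(a_bar * b_bar, n, k, n_prime)

def mont_pow(base, exp, n):
    # Left-to-right square-and-multiply carried out entirely in Montgomery form.
    k, r, n_prime = montgomery_setup(n)
    x_bar = r % n                  # Montgomery form of 1
    b_bar = (base * r) % n         # Montgomery form of the base
    for bit in bin(exp)[2:]:
        x_bar = mont_mul(x_bar, x_bar, n, k, n_prime)
        if bit == "1":
            x_bar = mont_mul(x_bar, b_bar, n, k, n_prime)
    return redc(x_bar, n, k, n_prime)  # convert back out of Montgomery form

n, e, d = 3233, 17, 2753           # toy RSA key, for illustration only
c = mont_pow(65, e, n)
print(c == pow(65, e, n))          # True: agrees with Python's built-in modular power
print(mont_pow(c, d, n))           # 65
```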


Proceedings Article•10.1145/1837210.1837224•

Exact sparse matrix-vector multiplication on GPU's and multicore architectures


Brice Boyer¹, Jean-Guillaume Dumas¹, Pascal Giorgi²•Institutions (2)

University of Grenoble¹, University of Montpellier²

21 Jul 2010

TL;DR: This work proposes different implementations of the sparse matrix-dense vector multiplication (SpMV) for finite fields and rings Z/mZ and uses this library and a new parallelisation of the sigma-basis algorithm in a parallel block Wiedemann rank implementation over finite fields.


Abstract: We propose different implementations of the sparse matrix-dense vector multiplication (SpMV) for finite fields and rings Z/mZ. We take advantage of graphic card processors (GPU) and multi-core architectures. Our aim is to improve the speed of SpMV in the LinBox library, and henceforth the speed of its black-box algorithms. Besides, we use this library and a new parallelisation of the sigma-basis algorithm in a parallel block Wiedemann rank implementation over finite fields.
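
An exact sparse matrix-dense vector product over Z/mZ can be sketched in a few lines. The compressed sparse row (CSR) layout and all names below are assumptions made for illustration; the sketch is sequential and does not reflect the storage formats or the GPU and multi-core kernels of the LinBox implementations.

```python
def spmv_mod(values, col_idx, row_ptr, x, m):
    # y = A * x over Z/mZ, with A stored in CSR form:
    # the non-zeros of row i are values[row_ptr[i]:row_ptr[i + 1]],
    # and col_idx gives the column of each stored value.
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        # Python integers are exact, so one reduction per row suffices here;
        # fixed-width arithmetic would have to reduce before the accumulator overflows.
        y.append(acc % m)
    return y

# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]  over Z/7Z
values = [1, 2, 3, 4, 5]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_mod(values, col_idx, row_ptr, [1, 2, 3], 7))  # [0, 6, 5]
```

Each row is independent of the others, which is the property that parallel implementations exploit.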

Journal Article•

A protocol for multiplication and restoration of Ceropegia fantastica Sedgw.: a critically endangered plant species.

A. N. Chandore, Mansingraj S. Nimbalkar, Rajaram V. Gurav, Vishwas A. Bapat, Shrirang Ramchandra Yadav

01 Jan 2010-Current Science

Journal Article•10.1016/J.AMC.2010.02.009•

A unified framework for interpolating and approximating univariate subdivision

Carolina Vittoria Beccari¹, Giulio Casciola¹, Lucia Romani²•Institutions (2)

University of Bologna¹, University of Milano-Bicocca²

01 Apr 2010-Applied Mathematics and Computation

TL;DR: This paper shows that the refinement rules of interpolating and approximating univariate subdivision schemes with odd-width masks of finite support can be derived from one another by simple operations on the mask coefficients, and provides a constructive method for the definition of novel refinement algorithms.

Journal Article•10.1007/S10207-009-0099-9•

Counting equations in algebraic attacks on block ciphers

Lars R. Knudsen¹, Charlotte V. Miolane¹•Institutions (1)

Technical University of Denmark¹

01 Apr 2010-International Journal of Information Security

TL;DR: It is shown that by splitting the equations defined over a block cipher (an SP-network) into two sets, one can determine the exact number of linearly independent equations which can be generated in algebraic attacks within each of these sets of a certain degree.

Abstract: This paper is about counting linearly independent equations for so-called algebraic attacks on block ciphers. The basic idea behind many of these approaches, e.g., XL, is to generate a large set of equations from an initial set of equations by multiplication of existing equations by the variables in the system. One of the most difficult tasks is to determine the exact number of linearly independent equations one obtains in the attacks. In this paper, it is shown that by splitting the equations defined over a block cipher (an SP-network) into two sets, one can determine the exact number of linearly independent equations which can be generated in algebraic attacks within each of these sets of a certain degree. While this does not give us a direct formula for the success of algebraic attacks on block ciphers, it gives some interesting bounds on the number of equations one can obtain from a given block cipher. Our results are applied to the AES and to a variant of the AES, and the exact numbers of linearly independent equations in the two sets that one can generate by multiplication of an initial set of equations are given. Our results also indicate, in a novel way, that the AES is not vulnerable to the algebraic attacks as defined here.
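
The generate-and-count step described in this abstract can be illustrated on a toy system. The sketch below is not the paper's counting method and has nothing to do with the AES equations. It assumes Boolean polynomials over GF(2) with x_i^2 = x_i, encodes a monomial as a bitmask of its variables (0 is the constant 1), and uses hypothetical helper names throughout.

```python
def multiply_by_variable(poly, var):
    """Multiply a GF(2) polynomial (a frozenset of monomial bitmasks) by x_var."""
    result = set()
    for mono in poly:
        # x_var * mono sets bit `var`; equal monomials cancel in pairs over GF(2).
        result ^= {mono | (1 << var)}
    return frozenset(result)


def xl_expand(polys, num_vars):
    """Return the initial equations plus each equation times each single variable."""
    expanded = set(polys)
    for p in polys:
        for v in range(num_vars):
            expanded.add(multiply_by_variable(p, v))
    expanded.discard(frozenset())  # drop products that vanished identically
    return list(expanded)


def gf2_rank(polys):
    """Count linearly independent polynomials, treating each monomial as a column."""
    monomials = sorted({m for p in polys for m in p})
    column = {m: j for j, m in enumerate(monomials)}
    pivots = {}  # leading column index -> reduced row stored as an integer bitset
    rank = 0
    for p in polys:
        row = 0
        for m in p:
            row |= 1 << column[m]
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                rank += 1
                break
            row ^= pivots[top]  # eliminate the leading term and keep reducing
    return rank


# Toy system in x0, x1, x2:  x0*x1 + x2 = 0  and  x0 + x1 + 1 = 0.
system = [frozenset({0b011, 0b100}), frozenset({0b001, 0b010, 0})]
expanded = xl_expand(system, 3)
print(len(expanded), gf2_rank(expanded))
```

Even in this toy case the number of generated equations exceeds the number that are linearly independent, which is the gap the paper sets out to quantify exactly.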

Proceedings Article•10.1109/ICETC.2010.5529724•

Optimizing sparse matrix-vector multiplication on CUDA

Zhuowei Wang¹, Xianbin Xu¹, Wuqing Zhao¹, Yuping Zhang¹, Shuibing He¹•Institutions (1)

Wuhan University¹

22 Jun 2010

TL;DR: The paper outlines three optimizations: (1) an optimized CSR storage format, (2) optimized thread mapping, and (3) avoiding divergence judgment.

Abstract: In recent years, GPUs have attracted the attention of many application developers as powerful massively parallel systems. CUDA, as a general-purpose parallel computing architecture, makes GPUs an appealing choice for solving many complex computational problems more efficiently. In this paper, we discuss implementing and optimizing sparse matrix-vector multiplication on NVIDIA GPUs using the CUDA programming model. We outline three optimizations: (1) an optimized CSR storage format, (2) optimized thread mapping, and (3) avoiding divergence judgment. We experimentally evaluate our optimizations on a GeForce 9600 GTX connected to a Windows XP 64-bit system. In comparison with NVIDIA's SpMV library and NVIDIA's CUDDPA library, the results show that our optimized sparse matrix-vector multiplication on CUDA achieves better performance than the other SpMV implementations.

Journal Article•

The Differential Analysis of S-functions

Nicky Mouha¹, Vesselin Velichkov¹, Christophe De Cannière¹, Bart Preneel¹•Institutions (1)

Katholieke Universiteit Leuven¹

01 Jan 2010-Lecture Notes in Computer Science

TL;DR: This paper introduces the concept of S-functions: an S-function calculates the i-th output bit using only the inputs of the i-th bit position and a finite state S[i].

Abstract: An increasing number of cryptographic primitives use operations such as addition modulo 2^n, multiplication by a constant and bitwise Boolean functions as a source of non-linearity. In NIST's SHA-3 competition, this applies to 6 out of the 14 second-round candidates. In this paper, we generalize such constructions by introducing the concept of S-functions. An S-function is a function that calculates the i-th output bit using only the inputs of the i-th bit position and a finite state S[i]. Although S-functions have been analyzed before, this paper is the first to present a fully general and efficient framework to determine their differential properties. A precursor of this framework was used in the cryptanalysis of SHA-1. We show how to calculate the probability that given input differences lead to given output differences, as well as how to count the number of output differences with non-zero probability. Our methods are rooted in graph theory, and the calculations can be efficiently performed using matrix multiplications.
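
The definition of an S-function given in this abstract can be illustrated with addition modulo 2^n, where the finite state S[i] is simply the carry into bit i. The sketch below only evaluates such a function bit by bit and does not implement the paper's differential framework; the function names and the (output bit, next state) interface are assumptions for illustration.

```python
def add_step(x_bit, y_bit, state):
    """One S-function step for addition: the state is the incoming carry (0 or 1)."""
    total = x_bit + y_bit + state
    return total & 1, total >> 1  # (i-th output bit, carry into position i + 1)


def evaluate_sfunction(step, x, y, n, initial_state=0):
    """Evaluate an n-bit S-function from the least significant bit upwards.

    Output bit i depends only on bit i of each input and on the state S[i].
    """
    state = initial_state
    out = 0
    for i in range(n):
        bit, state = step((x >> i) & 1, (y >> i) & 1, state)
        out |= bit << i
    return out


print(evaluate_sfunction(add_step, 11, 7, 4))  # prints 2, i.e. (11 + 7) mod 16
```

Because the state takes only finitely many values, each bit position can be described by a small transition structure, which is consistent with the abstract's remark that the calculations reduce to matrix multiplications.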
