TL;DR: A new TOPSIS approach for selecting plant location under linguistic environments is presented, where the ratings of various alternative locations under various criteria, and the weights of various criteria are assessed in linguistic terms represented by fuzzy numbers.
Abstract: The selection of plant location plays a very important role in minimizing cost and maximizing the use of resources for many companies. In this paper, a new TOPSIS approach for selecting plant location under linguistic environments is presented, where the ratings of various alternative locations under various criteria, and the weights of various criteria, are assessed in linguistic terms represented by fuzzy numbers. To avoid complicated fuzzy arithmetic operations, the linguistic variables, which are represented by triangular fuzzy numbers, are transformed into crisp numbers based on graded mean representation. The canonical representation of multiplication operations on triangular fuzzy numbers is used to obtain the “positive ideal solution” and the “negative ideal solution”. The closeness coefficient is defined to determine the ranking order of all alternatives by calculating the distance to both the “positive ideal solution” and the “negative ideal solution” simultaneously. Compared with existing fuzzy TOPSIS methods, the proposed method can deal with group decision-making problems in a more efficient manner. A numerical example of plant location selection is used to illustrate the efficiency of the proposed method.
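The steps in this abstract can be sketched roughly in Python. The sketch defuzzifies each triangular fuzzy number (a, b, c) with the graded mean (a + 4b + c)/6 and ranks alternatives by a closeness coefficient d⁻/(d⁺ + d⁻). The function names and the sample ratings are hypothetical. The ideal solutions are taken as the per-criterion maximum and minimum of the weighted crisp scores, with all criteria assumed to be benefit criteria; this is a simplifying assumption and not necessarily the paper's canonical-multiplication construction.

```python
import math

def graded_mean(tfn):
    """Graded mean representation of a triangular fuzzy number (a, b, c)."""
    a, b, c = tfn
    return (a + 4 * b + c) / 6

def closeness(ratings, weights):
    """Closeness coefficient per alternative.

    ratings[i][j] is the fuzzy rating of alternative i under criterion j;
    weights[j] is the fuzzy weight of criterion j (benefit criteria assumed).
    """
    w = [graded_mean(t) for t in weights]
    scores = [[graded_mean(r) * wj for r, wj in zip(row, w)] for row in ratings]
    columns = list(zip(*scores))
    positive_ideal = [max(col) for col in columns]   # assumed: best score per criterion
    negative_ideal = [min(col) for col in columns]   # assumed: worst score per criterion
    result = []
    for row in scores:
        d_pos = math.dist(row, positive_ideal)
        d_neg = math.dist(row, negative_ideal)
        # Assumes the alternatives are not all identical (d_pos + d_neg > 0).
        result.append(d_neg / (d_pos + d_neg))
    return result

# Hypothetical data: three locations, two criteria.
ratings = [
    [(7, 9, 10), (5, 7, 9)],
    [(3, 5, 7), (7, 9, 10)],
    [(1, 3, 5), (3, 5, 7)],
]
weights = [(0.7, 0.9, 1.0), (0.5, 0.7, 0.9)]
cc = closeness(ratings, weights)
ranking = sorted(range(len(cc)), key=lambda i: cc[i], reverse=True)
```

A larger coefficient means the alternative is closer to the positive ideal and farther from the negative ideal, so `ranking` lists the alternatives from most to least preferred.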
TL;DR: Table of contents of a book on arithmetic circuits, covering number representation, algorithms for addition, subtraction, multiplication, division, and other operations, finite field operations, hardware platforms (instruction set processors, ASICs, FPGAs), circuit synthesis, and floating-point units.
Abstract: Preface. About the Authors. 1. Introduction. 1.1 Number Representation. 1.2 Algorithms. 1.3 Hardware Platforms. 1.4 Hardware-Software Partitioning. 1.5 Software Generation. 1.6 Synthesis. 1.7 A First Example. 1.7.1 Specification. 1.7.2 Number Representation. 1.7.3 Algorithms. 1.7.4 Hardware Platform. 1.7.5 Hardware-Software Partitioning. 1.7.6 Program Generation. 1.7.7 Synthesis. 1.7.8 Prototype. 1.8 Bibliography. 2. Mathematical Background. 2.1 Number Theory. 2.1.1 Basic Definitions. 2.1.2 Euclidean Algorithms. 2.1.3 Congruences. 2.2 Algebra. 2.2.1 Groups. 2.2.2 Rings. 2.2.3 Fields. 2.2.4 Polynomial Rings. 2.2.5 Congruences of Polynomial. 2.3 Function Approximation. 2.4 Bibliography. 3. Number Representation. 3.1 Natural Numbers. 3.1.1 Weighted Systems. 3.1.2 Residue Number System. 3.2 Integers. 3.2.1 Sign-Magnitude Representation. 3.2.2 Excess-E Representation. 3.2.3 B's Complement Representation. 3.2.4 Booth's Encoding. 3.3 Real Numbers. 3.4 Bibliography. 4. Arithmetic Operations: Addition and Subtraction. 4.1 Addition of Natural Numbers. 4.1.1 Basic Algorithm. 4.1.2 Faster Algorithms. 4.1.3 Long-Operand Addition. 4.1.4 Multioperand Addition. 4.1.5 Long-Multioperand Addition. 4.2 Subtraction of Natural Numbers. 4.3 Integers. 4.3.1 B's Complement Addition. 4.3.2 B's Complement Sign Change. 4.3.3 B's Complement Subtraction. 4.3.4 B's Complement Overflow Detection. 4.3.5 Excess-E Addition and Subtraction. 4.3.6 Sign-Magnitude Addition and Subtraction. 4.4 Bibliography. 5. Arithmetic Operations: Multiplication. 5.1 Natural Numbers Multiplication. 5.1.1 Introduction. 5.1.2 Shift and Add Algorithms. 5.1.2.1 Shift and Add 1. 5.1.2.2 Shift and Add 2. 5.1.2.3 Extended Shift and Add Algorithm: XY + C + D. 5.1.2.4 Cellular Shift and Add. 5.1.3 Long-Operand Algorithm. 5.2 Integers. 5.2.1 B's Complement Multiplication. 5.2.1.1 Mod B^(n+m) B's Complement Multiplication. 5.2.1.2 Signed Shift and Add. 5.2.1.3 Postcorrection B's Complement Multiplication. 
5.2.2 Postcorrection 2's Complement Multiplication. 5.2.3 Booth Multiplication for Binary Numbers. 5.2.3.1 Booth-r Algorithms. 5.2.3.2 Per Gelosia Signed-Digit Algorithm. 5.2.4 Booth Multiplication for Base-B Numbers (Booth-r Algorithm in Base B). 5.3 Squaring. 5.3.1 Base-B Squaring. 5.3.1.1 Cellular Carry-Save Squaring Algorithm. 5.3.2 Base-2 Squaring. 5.4 Bibliography. 6 Arithmetic Operations: Division. 6.1 Natural Numbers. 6.2 Integers. 6.2.1 General Algorithm. 6.2.2 Restoring Division Algorithm. 6.2.3 Base-2 Nonrestoring Division Algorithm. 6.2.4 SRT Radix-2 Division. 6.2.5 SRT Radix-2 Division with Stored-Carry Encoding. 6.2.6 P-D Diagram. 6.2.7 SRT-4 Division. 6.2.8 Base-B Nonrestoring Division Algorithm. 6.3 Convergence (Functional Iteration) Algorithms. 6.3.1 Introduction. 6.3.2 Newton-Raphson Iteration Technique. 6.3.3 MacLaurin Expansion-Goldschmidt's Algorithm. 6.4 Bibliography. 7. Other Arithmetic Operations. 7.1 Base Conversion. 7.2 Residue Number System Conversion. 7.2.1 Introduction. 7.2.2 Base-B to RNS Conversion. 7.2.3 RNS to Base-B Conversion. 7.3 Logarithmic, Exponential, and Trigonometric Functions. 7.3.1 Taylor-MacLaurin Series. 7.3.2 Polynomial Approximation. 7.3.3 Logarithm and Exponential Functions Approximation by Convergence Methods. 7.3.3.1 Logarithm Function Approximation by Multiplicative Normalization. 7.3.3.2 Exponential Function Approximation by Additive Normalization. 7.3.4 Trigonometric Functions-CORDIC Algorithms. 7.4 Square Rooting. 7.4.1 Digit Recurrence Algorithm-Base-B Integers. 7.4.2 Restoring Binary Shift-and-Subtract Square Rooting Algorithm. 7.4.3 Nonrestoring Binary Add-and-Subtract Square Rooting Algorithm. 7.4.4 Convergence Method-Newton-Raphson. 7.5 Bibliography. 8. Finite Field Operations. 8.1 Operations in Zm. 8.1.1 Addition. 8.1.2 Subtraction. 8.1.3 Multiplication. 8.1.3.1 Multiply and Reduce. 8.1.3.2 Modified Shift-and-Add Algorithm. 8.1.3.3 Montgomery Multiplication. 8.1.3.4 Specific Ring. 8.1.4 Exponentiation. 
8.2 Operations in GF(p). 8.3 Operations in Zp[x]/f(x). 8.3.1 Addition and Subtraction. 8.3.2 Multiplication. 8.4 Operations in GF(p^n). 8.5 Bibliography. Appendix 8.1 Computation of fki. 9 Hardware Platforms. 9.1 Design Methods for Electronic Systems. 9.1.1 Basic Blocks of Integrated Systems. 9.1.2 Recurring Topics in Electronic Design. 9.1.2.1 Design Challenge: Optimizing Design Metrics. 9.1.2.2 Cost in Integrated Circuits. 9.1.2.3 Moore's Law. 9.1.2.4 Time-to-Market. 9.1.2.5 Performance Metric. 9.1.2.6 The Power Dimension. 9.2 Instruction Set Processors. 9.2.1 Microprocessors. 9.2.2 Microcontrollers. 9.2.3 Embedded Processors Everywhere. 9.2.4 Digital Signal Processors. 9.2.5 Application-Specific Instruction Set Processors. 9.2.6 Programming Instruction Set Processors. 9.3 ASIC Designs. 9.3.1 Full-Custom ASIC. 9.3.2 Semicustom ASIC. 9.3.2.1 Gate-Array ASIC. 9.3.2.2 Standard-Cell-Based ASIC. 9.3.3 Design Flow in ASIC. 9.4 Programmable Logic. 9.4.1 Programmable Logic Devices (PLDs). 9.4.2 Field Programmable Gate Array (FPGA). 9.4.2.1 Why FPGA? A Short Historical Survey. 9.4.2.2 Basic FPGA Concepts. 9.4.3 Xilinx Specifics. 9.4.3.1 Configurable Logic Blocks (CLBs). 9.4.3.2 Input/Output Blocks (IOBs). 9.4.3.3 RAM Blocks. 9.4.3.4 Programmable Routing. 9.4.3.5 Arithmetic Resources in Xilinx FPGAs. 9.4.4 FPGA Generic Design Flow. 9.5 Hardware Description Languages (HDLs). 9.5.1 Today's and Tomorrow's HDLs. 9.6 Further Readings. 9.7 Bibliography. 10. Circuit Synthesis: General Principles. 10.1 Resources. 10.2 Precedence Relation and Scheduling. 10.3 Pipeline. 10.4 Self-Timed Circuits. 10.5 Bibliography. 11 Adders and Subtractors. 11.1 Natural Numbers. 11.1.1 Basic Adder (Ripple-Carry Adder). 11.1.2 Carry-Chain Adder. 11.1.3 Carry-Skip Adder. 11.1.4 Optimization of Carry-Skip Adders. 11.1.5 Base-Bs Adder. 11.1.6 Carry-Select Adder. 11.1.7 Optimization of Carry-Select Adders. 11.1.8 Carry-Lookahead Adders (CLAs). 11.1.9 Prefix Adders. 
11.1.10 FPGA Implementation of Adders. 11.1.10.1 Carry-Chain Adders. 11.1.10.2 Carry-Skip Adders. 11.1.10.3 Experimental Results. 11.1.11 Long-Operand Adders. 11.1.12 Multioperand Adders. 11.1.12.1 Sequential Multioperand Adders. 11.1.12.2 Combinational Multioperand Adders. 11.1.12.3 Carry-Save Adders. 11.1.12.4 Parallel Counters. 11.1.13 Subtractors and Adder-Subtractors. 11.1.14 Termination Detection. 11.1.15 FPGA Implementation of the Termination Detection. 11.2 Integers. 11.2.1 B's Complement Adders and Subtractors. 11.2.2 Excess-E Adders and Subtractors. 11.2.3 Sign-Magnitude Adders and Subtractors. 11.3 Bibliography. 12 Multipliers. 12.1 Natural Numbers. 12.1.1 Basic Multiplier. 12.1.2 Sequential Multipliers. 12.1.3 Cellular Multiplier Arrays. 12.1.3.1 Ripple-Carry Multiplier. 12.1.3.2 Carry-Save Multiplier. 12.1.3.3 Figures of Merit. 12.1.4 Multipliers Based on Dissymmetric Br Bs Cells. 12.1.5 Multipliers Based on Multioperand Adders. 12.1.6 Per Gelosia Multiplication Arrays. 12.1.6.1 Introduction. 12.1.6.2 Adding Tree for Base-B Partial Products. 12.1.7 FPGA Implementation of Multipliers. 12.2 Integers. 12.2.1 B's Complement Multipliers. 12.2.2 Booth Multipliers. 12.2.2.1 Booth-1 Multiplier. 12.2.2.2 Booth-2 Multiplier. 12.2.2.3 Signed-Digit Multiplier. 12.2.3 FPGA Implementation of the Booth-1 Multiplier. 12.3 Bibliography. 13. Dividers. 13.1 Natural Numbers. 13.2 Integers. 13.2.1 Base-2 Nonrestoring Divider. 13.2.2 Base-B Nonrestoring Divider. 13.2.3 SRT Dividers. 13.2.3.1 SRT-2 Divider. 13.2.3.2 SRT-2 Divider with Carry-Save Computation of the Remainder. 13.2.3.3 FPGA Implementation of the Carry-Save SRT-2 Divider. 13.2.4 SRT-4 Divider. 13.2.5 Convergence Dividers. 13.2.5.1 Newton-Raphson Divider. 13.2.5.2 Goldschmidt Divider. 13.2.5.3 Comparative Data Between Newton-Raphson (NR) and Goldschmidt (G) Implementations. 13.3 Bibliography. 14 Other Arithmetic Operators. 14.1 Base Conversion. 14.1.1 General Base Conversion. 14.1.2 BCD to Binary Converter. 
14.1.2.1 Nonrestoring 2p Subtracting Implementation. 14.1.2.2 Shift-and-Add BCD to Binary Converter. 14.1.3 Binary to BCD Converter. 14.1.4 Base-B to RNS Converter. 14.1.5 CRT RNS to Base-B Converter. 14.1.6 RNS to Mixed-Radix System Converter. 14.2 Polynomial Computation Circuits. 14.3 Logarithm Operator. 14.4 Exponential Operator. 14.5 Sine and Cosine Operators. 14.6 Square Rooters. 14.6.1 Restoring Shift-and-Subtract Square Rooter (Naturals). 14.6.2 Nonrestoring Shift-and-Subtract Square Rooter (Naturals). 14.6.3 Newton-Raphson Square Rooter (Naturals). 14.7 Bibliography. 15. Circuits for Finite Field Operations. 15.1 Operations in Zm. 15.1.1 Adders and Subtractors. 15.1.2 Multiplication. 15.1.2.1 Multiply and Reduce. 15.1.2.2 Shift and Add. 15.1.2.3 Montgomery Multiplication. 15.1.2.4 Modulo (B^k − c) Reduction. 15.1.2.5 Exponentiation. 15.2 Inversion in GF(p). 15.3 Operations in Zp[x]/f(x). 15.4 Inversion in GF(p^n). 15.5 Bibliography. 16. Floating-Point Unit. 16.1 Floating-Point System Definition. 16.2 Arithmetic Operations. 16.2.1 Addition of Positive Numbers. 16.2.2 Difference of Positive Numbers. 16.2.3 Addition and Subtraction. 16.2.4 Multiplication. 16.2.5 Division. 16.2.6 Square Root. 16.3 Rounding Schemes. 16.4 Guard Digits. 16.5 Adder-Subtractor. 16.5.1 Alignment. 16.5.2 Additions. 16.5.3 Normalization. 16.5.4 Rounding. 16.6 Multiplier. 16.7 Divider. 16.8 Square Root. 16.9 Comments. 16.10 Bibliography. Index.
TL;DR: It is shown that probabilistic arithmetic can be used to compute the FFT in an extremely energy-efficient manner, yielding energy savings of over 5.6X in the context of the widely used synthetic aperture radar (SAR) application.
Abstract: Probabilistic arithmetic, where the ith output bit of addition and multiplication is correct with a probability p_i, is shown to be a vehicle for realizing extremely energy-efficient embedded computing. Specifically, probabilistic adders and multipliers, realized using elements such as gates that are in turn probabilistic, are shown to form a natural basis for primitives in the digital signal processing (DSP) domain. In this paper, we show that probabilistic arithmetic can be used to compute the FFT in an extremely energy-efficient manner, yielding energy savings of over 5.6X in the context of the widely used synthetic aperture radar (SAR) application [1]. Our results are derived using novel probabilistic CMOS (PCMOS) technology, characterized and applied in the past to realize ultra-efficient architectures for probabilistic applications [2, 3, 4]. When applied to the DSP domain, the resulting error in the output of a probabilistic arithmetic primitive, such as an adder, manifests as degradation in the signal-to-noise ratio (SNR) of the SAR image that is reconstructed through the FFT algorithm. In return for this degradation that is enabled by our probabilistic arithmetic primitives (degradation visually indistinguishable from an image reconstructed using conventional deterministic approaches), significant energy savings and performance gains are shown to be possible per unit of SNR degradation. These savings stem from a novel method of voltage scaling, which we refer to as biased voltage scaling (or BIVOS), that is the major technical innovation on which our probabilistic designs are based.
TL;DR: This paper presents two vector-level software algorithms which essentially eliminate bit-wise inner product operations for Gaussian normal bases and shows that the software implementation of the proposed algorithm is faster than previously reported normal basis multiplication algorithms.
Abstract: Recently, implementations of normal basis multiplication over the extended binary field GF(2^m) have received considerable attention. A class of low complexity normal bases called Gaussian normal bases has been included in a number of standards, such as IEEE and NIST for an elliptic curve digital signature algorithm. The multiplication algorithms presented there are slow in software since they rely on bit-wise inner product operations. In this paper, we present two vector-level software algorithms which essentially eliminate such bit-wise operations for Gaussian normal bases. Our analysis and timing results show that the software implementation of the proposed algorithm is faster than previously reported normal basis multiplication algorithms. The proposed algorithm is also more memory efficient compared with its look-up table-based counterpart. Moreover, two new digit-level multiplier architectures are proposed and it is shown that they outperform the existing normal basis multiplier structures. As compared with similar digit-level normal basis multipliers, the proposed multiplier with serial output requires the fewest XOR gates and the one with parallel output is the fastest multiplier.
TL;DR: An algorithm to achieve fast multiplication in two's complement representation is presented, which results in a true diamond-shape for the partial product tree, which is more efficient in terms of implementation.
Abstract: The performance of multiplication is crucial for multimedia applications such as 3D graphics and signal processing systems, which depend on the execution of large numbers of multiplications. Previously reported algorithms mainly focused on rapidly reducing the partial product rows down to final sums and carries used for the final accumulation. These techniques mostly rely on circuit optimization and minimization of the critical paths. In this paper, an algorithm to achieve fast multiplication in two's complement representation is presented. Rather than focusing on reducing the partial product rows down to final sums and carries, our approach strives to generate fewer partial product rows. In turn, this influences the speed of the multiplication, even before applying partial product reduction techniques. Fewer partial product rows are produced, thereby lowering the overall operation time. In addition to the speed improvement, our algorithm results in a true diamond shape for the partial product tree, which is more efficient in terms of implementation. The synthesis results of our multiplication algorithm using the Artisan TSMC 0.13 μm 1.2-volt standard-cell library show 13 percent improvement in speed and 14 percent improvement in power savings for 8-bit times 8-bit multiplications (10 percent and 3 percent, respectively, for 16-bit times 16-bit multiplications) when compared to conventional multiplication algorithms.
TL;DR: Finite word-length simulations demonstrate the viability and excellent performance of NEDA, a new DA architecture aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in digital signal processing (DSP) applications.
Abstract: Conventional distributed arithmetic (DA) is popular in application-specific integrated circuit (ASIC) design, and it features on-chip ROM to achieve high speed and regularity. In this paper, a new DA architecture called NEDA is proposed, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in digital signal processing (DSP) applications. Mathematical analysis proves that DA can implement inner product of vectors in the form of two's complement numbers using only additions, followed by a small number of shifts at the final stage. Comparative studies show that NEDA outperforms widely used approaches such as multiply/accumulate (MAC) and DA in many aspects. Being a high-speed architecture free of ROM, multiplication, and subtraction, NEDA can also expose the redundancy existing in the adder array consisting of entries of 0 and 1. A hardware compression scheme is introduced to generate a butterfly structure with a minimum number of additions. NEDA-based architectures for an 8 × 8 discrete cosine transform (DCT) core are presented as an example. Savings exceeding 88% are achieved when the compression scheme is applied along with NEDA. Finite word-length simulations demonstrate the viability and excellent performance of NEDA.
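To make the idea of a multiplier-free inner product concrete, here is a minimal Python sketch of the conventional, ROM-style DA identity that this abstract uses as its baseline. It is not NEDA's adder array or its compression scheme, which the abstract does not specify. The function name and the test vectors are hypothetical; the lookup table plays the role of the on-chip ROM, and the inputs are assumed to fit in `bits`-bit two's complement.

```python
def da_inner_product(coeffs, xs, bits):
    """Compute sum(c * x) bit-plane by bit-plane, without multiplying c by x.

    Uses x = -x[bits-1] * 2**(bits-1) + sum_b x[b] * 2**b for two's complement x.
    """
    n = len(coeffs)
    # Table a ROM would hold: for each bit pattern, the sum of the selected coefficients.
    table = [
        sum(c for k, c in enumerate(coeffs) if (pattern >> k) & 1)
        for pattern in range(1 << n)
    ]
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Gather bit b of every input into one table address.
        pattern = 0
        for k, x in enumerate(xs):
            pattern |= (((x & mask) >> b) & 1) << k
        term = table[pattern] << b
        # The sign bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc

result = da_inner_product([3, -5, 7], [10, -4, -128], bits=8)
```

The only operations on the data path are table lookups, additions, and shifts; the single subtraction for the sign bit is what a ROM-free, subtraction-free design such as the one described above would have to handle differently.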
TL;DR: New architectures to detect erroneous outputs caused by certain types of faults in bit-parallel and bit-serial polynomial basis multipliers over finite fields of characteristic two are proposed.
Abstract: In many cryptographic schemes, the most time-consuming basic arithmetic operation is the finite field multiplication, and its hardware implementation for bit-parallel operation may require millions of logic gates. Some of these gates may become faulty in the field due to natural causes or malicious attacks, which may lead to the generation of erroneous outputs by the multiplier. In this paper, we propose new architectures to detect erroneous outputs caused by certain types of faults in bit-parallel and bit-serial polynomial basis multipliers over finite fields of characteristic two. In particular, parity prediction schemes are developed for detecting errors due to single and certain multiple stuck-at faults. Although the issue of detecting soft errors in registers is not considered, the proposed schemes have the advantage that they can be used with any irreducible binary polynomial chosen to define the finite field.
TL;DR: Part I Approaching, Organizing and Designing Instruction General Introduction to Part II Number Words and Numerals Assessment Task Groups Forward Number Word Sequences Number Word After Backward Number Word Sequences Number Word Before Numeral Identification Numeral Recognition Sequencing Numerals Ordering Numerals Locating Numbers in the Range 1 To 100 Instructional Activities.
Abstract: PART I Approaching, Organizing and Designing Instruction General Introduction to Part II PART II Number Words and Numerals Assessment Task Groups Forward Number Word Sequences Number Word After Backward Number Word Sequences Number Word Before Numeral Identification Numeral Recognition Sequencing Numerals Ordering Numerals Locating Numbers in the Range 1 To 100 Instructional Activities Count Around Numbers on the Line Stand In Line Secret Numbers Can You See Me? Make and Break Numbers The Joke Is On You Counting Choir Take Your Place What Comes Next? Early Counting and Addition Assessment Task Groups Comparing Small Collections Increase and Decrease in the Range 1 To 6 Establishing the Numerosity of a Collection Establishing a Collection of Specified Numerosity Establishing the Numerosity of Two Collections Additive Tasks Involving Two Screened Collections Counting and Copying Temporal Sequences and Temporal Patterns Instructional Activities Domino Addition Addition Dice Counters in a Row On the Mat Where Do I Go? 
Toy Box Teddy Bear Walk Chains Give Me Five Pass It On Structuring Numbers 1 To 10 Assessment Task Groups Making Finger Patterns for Numbers in the Range 1 To 5 Making Finger Patterns for Numbers in the Range 6 To 10 Naming and Visualizing Domino Patterns 1 To 6 Naming And Visualizing Pair-Wise Patterns On a Ten-Frame Naming And Visualizing Five-Wise Patterns On a Ten-Frame Partitions of 5 and 10 Addition and Subtraction in the Range 1 to 10 Instructional Activities Bunny Ears The Great Race Quick Dots Make Five Concentration Five and Ten Frame Flashes Memory Game Domino Flashes Domino Fish Domino Snap Make Ten Fish Advanced Counting, Addition and Subtraction Assessment Task Groups Additive Tasks Involving Two Screened Collections Missing Addend Task Involving Two Screened Collections Removed Items Task Involving a Screened Collection Missing Subtrahend Task Involving a Screened Collection Comparative Subtraction Involving Two Screened Collections Subtraction with Bare Numbers Instructional Activities Calculator Counting Class Count On and Count-Back One Hundred Square Activities Activities on A Bead Bar or Bead String Bucket Count-On Bucket Count-Back Number Line Count-On Numeral Track Activities Under the Cloth Structuring Numbers 1 To 20 Assessment Task Groups Naming and Visualizing Pair-Wise Patterns For 1 To 10 Naming and Visualizing Five-Wise Patterns for 1-10 Naming and Visualizing Pair-Wise Patterns For 11 To 20 Naming and Visualizing Five-Wise and Ten-Wise Patterns For 11 To 14 Naming and Visualizing Ten-Wise Patterns For 15 To 20 Addition Using Doubles, Fives and Tens - Addends Less Than 11 Subtraction Using Doubles, Fives and Tens - Subtrahend and Difference Less Than 11 Addition Using Doubles, Fives and Tens - One Addend Greater Than 10 Subtraction Using Doubles, Fives and Tens - Subtrahend or Difference Greater Than 10 Instructional Activities Double Decker Bus Flashes Getting On and Off the Bus Bus Snap Make Combinations to Twenty Fish Using Ten Plus 
Combinations Five and Ten Game Chocolate Boxes Double Ten Frame Facts Bead Board Two-Digit Addition and Subtraction: Jump Strategies Assessment Task Groups Forward and Backward Number Word Sequences by 10s, On and Off the Decade Adding From a Decade and Subtracting To a Decade Adding To a Decade and Subtracting From a Decade Incrementing and Decrementing by 10s on and Off the Decade Incrementing Flexibly By10s and Ones Adding 10s to a 2-Digit Number and Subtracting 10s from a 2-Digit Number Adding Two 2-Digit Numbers Without and With Regrouping Subtraction Involving Two 2-Digit Numbers Without and With Regrouping Addition and Subtraction Using Transforming, Compensating and Other Strategies Instructional Activities Leap Frog Bead String with Ten Catcher Add or Subtract 11 Add To or Subtract From 49 Calculator Challenge Jump To 100 Jump From 100 Target Number Walk-About Sequences Non Standard Measurement Plan Two-Digit Addition and Subtraction: Split Strategies Assessment Task Groups Higher Decade Addition and Subtraction Without and With Bridging the Decade Partitioning and Combining Involving2-Digit Numbers Combining and Partitioning Involving Non-Canonical Forms Addition Involving Two 2-Digit Numbers Without and With Regrouping Subtraction Involving Two 2-Digit Numbers Without and With Regrouping Instructional Activities Follow the Pattern Ten More or Ten Less Counting By Tens Add or Subtract Tens Adding Tens and Ones Using Money Screened Subtraction Task Split the Subtrahend (Multiples of 10) Early Multiplication and Division Assessment Task Groups Counting By 2s, 5s, 10s And 3s Repeated Equal Groups -Visible Repeated Equal Groups -Items Screened and Groups Visible Repeated Equal Groups -Groups Screened and Items Screened Multiplication and Division Using Arrays Word Problems Relational Thinking Using Bare Number Problems Instructional Activities Count around - Multiples Trios for Multiples Quick Draw Multiples Rolling Groups Lemonade Stand Array Flip Duelling 
Arrays Mini Melton Four's A Winner I Have... Who Has...? PART III The Teacher as a Learner Glossary
TL;DR: A scheme for robust multi-precision arithmetic over the positive integers, protected by a novel family of non-linear arithmetic residue codes, providing an upper bound on the number of undetectable errors is presented.
Abstract: We present a scheme for robust multi-precision arithmetic over the positive integers, protected by a novel family of non-linear arithmetic residue codes. These codes have a very high probability of detecting arbitrary errors of any weight. Our scheme lends itself well for straightforward implementation of standard modular multiplication techniques, i.e. Montgomery or Barrett Multiplication, secure against active fault injection attacks. Due to the non-linearity of the code the probability of detecting an error does not only depend on the error pattern, but also on the data. Since the latter is not usually known to the adversary a priori, a successful injection of an undetected error is highly unlikely. We give a proof of the robustness of these codes by providing an upper bound on the number of undetectable errors.
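For context, the following Python sketch shows a plain linear arithmetic residue check on a multiplication. This is the classical technique that non-linear codes aim to improve on, not the scheme proposed in this abstract. The modulus 251, the function name, and the additive error model are assumptions chosen for illustration.

```python
def checked_mul(a, b, p=251, error=0):
    """Multiply a and b, optionally inject an additive error, and verify with a mod-p residue.

    Returns (possibly faulty product, whether the residue check passed).
    """
    product = a * b + error                      # `error` models an injected fault
    expected_residue = ((a % p) * (b % p)) % p   # computed on the small check channel
    return product, product % p == expected_residue

_, ok_clean = checked_mul(12345, 6789)
_, ok_fault = checked_mul(12345, 6789, error=1)
_, ok_missed = checked_mul(12345, 6789, error=251)
```

With a linear check, any error that is a multiple of the modulus slips through regardless of the operands, so an attacker who knows the code can choose such an error. The abstract's point is that with a non-linear code, detection also depends on the data, which the adversary usually does not know.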
TL;DR: The resulting scalar multiplication method is compared to standard methods for Koblitz curves, and is faster than any given method with similar storage requirements already on the curve K-163, with larger improvements as the size of the curve increases.
Abstract: It has been recently acknowledged [4,6,9] that the use of double-base representations of scalars $n$, that is, expressions of the form $n = \sum_{e,s,t} (-1)^e A^s B^t$, can significantly speed up scalar multiplication on those elliptic curves where multiplication by one base (say $B$) is fast. This is the case in particular for Koblitz curves and supersingular curves, where scalar multiplication can now be achieved in $o(\log n)$ curve additions.
Previous literature dealt basically with supersingular curves (in characteristic 3, although the methods can be easily extended to arbitrary characteristic), where $A, B \in \mathbb{N}$. Only [4] attempted to provide a similar method for Koblitz curves, where at least one base must be non-real, although their method does not seem practical for cryptographic sizes (it is only asymptotic), since the constants involved are too large.
We provide here a unifying theory by proposing an alternate recoding algorithm which works in all cases with optimal constants. Furthermore, it can also solve the until now untreatable case where both $A$ and $B$ are non-real. The resulting scalar multiplication method is then compared to standard methods for Koblitz curves. It runs in less than $\log n / \log\log n$ elliptic curve additions, and is faster than any given method with similar storage requirements already on the curve K-163, with larger improvements as the size of the curve increases, surpassing 50% with respect to the τ-NAF for the curves K-409 and K-571. With respect to windowed methods, which can approach our speed but require $O(\log n / \log\log n)$ precomputations for optimal parameters, we offer the advantage of a fixed, small memory footprint, as we need storage for at most two additional points.
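As a small illustration of what a double-base representation looks like in the simplest setting (A = 2, B = 3, all signs positive), the Python sketch below uses a greedy expansion: repeatedly subtract the largest 2^s·3^t not exceeding the remainder. This is a well-known textbook-style greedy method for integers, swapped in for illustration; it is not the recoding algorithm of this paper, which also handles non-real bases. The function names are hypothetical.

```python
def best_term(n):
    """Return (value, s, t) with value = 2**s * 3**t the largest such number <= n (n >= 1)."""
    best = (1, 0, 0)
    p3, t = 1, 0
    while p3 <= n:
        # Largest power of two that keeps 2**s * 3**t at or below n.
        s = (n // p3).bit_length() - 1
        value = p3 << s
        if value > best[0]:
            best = (value, s, t)
        p3 *= 3
        t += 1
    return best

def double_base(n):
    """Greedy expansion of n > 0 as a sum of terms 2**s * 3**t, returned as (s, t) pairs."""
    terms = []
    while n > 0:
        value, s, t = best_term(n)
        terms.append((s, t))
        n -= value
    return terms

terms_127 = double_base(127)   # 127 = 2^2*3^3 + 2^1*3^2 + 2^0*3^0
```

Here 127 needs three terms (108 + 18 + 1), compared with seven nonzero digits in binary, which gives a feel for why fewer terms translate into fewer curve additions.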
TL;DR: This paper describes, analyzes, and compares various $GF(2^m)$ multipliers, investigating the standard modular multiplication, the Montgomery multiplication, and the matrix–vector multiplication techniques.
Abstract: In this paper, we describe, analyze and compare various \(GF(2^m)\) multipliers. Particularly, we investigate the standard modular multiplication, the Montgomery multiplication, and the matrix–vector multiplication techniques.
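A minimal Python sketch of the first of these techniques, standard modular multiplication in polynomial basis done as interleaved shift-and-add with reduction, may help fix ideas. The function name and the choice of field are assumptions for illustration; the test values use the AES field GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).

```python
def gf2m_mul(a, b, poly, m):
    """Multiply a and b in GF(2^m), polynomial basis.

    a and b are integers below 2**m whose bits are polynomial coefficients;
    poly is the irreducible polynomial including its x**m term.
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a          # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> m) & 1:
            a ^= poly            # reduce modulo the field polynomial
    return result

product = gf2m_mul(0x57, 0x83, poly=0x11B, m=8)
```

Each iteration consumes one bit of `b`, so the loop corresponds to a bit-serial multiplier; Montgomery and matrix–vector formulations reorganize the same arithmetic differently.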
TL;DR: The polar representation of complex numbers is extended to complex polar intervals or sectors; detailed algorithms are derived for performing basic arithmetic operations on sectors and it is shown that in many applications the polar representation is more advisable.
Abstract: In this paper, the polar representation of complex numbers is extended to complex polar intervals or sectors; detailed algorithms are derived for performing basic arithmetic operations on sectors. While multiplication and division are exactly defined, addition and subtraction are not, and we seek to minimize the pessimism introduced by these operations. Addition is studied as an optimization problem which is analytically solved. The complex interval arithmetic thus defined is illustrated with some numerical examples which show that in many applications, the polar representation is more advisable.
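The exactly defined operations can be sketched briefly in Python. A sector is modeled here as a modulus interval and an argument interval; the class and field names are hypothetical, moduli are assumed non-negative (strictly positive for a divisor), and angles are left unwrapped. Addition and subtraction, which the paper treats as an optimization problem, are deliberately not attempted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sector:
    """Complex polar interval: r in [r_lo, r_hi], angle in [th_lo, th_hi] (radians)."""
    r_lo: float
    r_hi: float
    th_lo: float
    th_hi: float

    def __mul__(self, other):
        # Moduli multiply, arguments add; both bounds are attained, so the result is exact.
        return Sector(self.r_lo * other.r_lo, self.r_hi * other.r_hi,
                      self.th_lo + other.th_lo, self.th_hi + other.th_hi)

    def __truediv__(self, other):
        # Moduli divide, arguments subtract; assumes other.r_lo > 0.
        return Sector(self.r_lo / other.r_hi, self.r_hi / other.r_lo,
                      self.th_lo - other.th_hi, self.th_hi - other.th_lo)

a = Sector(1.0, 2.0, 0.0, 0.5)
b = Sector(2.0, 3.0, 0.25, 0.5)
prod = a * b
quot = prod / b
```

Note that `quot` is wider than `a`: dividing the product by `b` does not recover the original sector, the usual dependency effect of interval arithmetic.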
TL;DR: A case study of HW design with the proposed architecture shows that EC point multiplication over GF(p) and GF(2^m) can be improved by a factor of 1.6 compared to the case of using a single processing element.
Abstract: We propose a parallel processing crypto-processor for Elliptic Curve Cryptography (ECC) to speed up EC point multiplication. The processor consists of a controller that dynamically checks instruction-level parallelism (ILP) and multiple sets of modular arithmetic logic units accelerating modular operations. A case study of HW design with the proposed architecture shows that EC point multiplication over GF(p) and GF(2^m) can be improved by a factor of 1.6 compared to the case of using a single processing element.
TL;DR: This report collects the flop count expressions for both real and complex kernels and also presents brief outlines of the derivations for the flop count expressions.
Abstract: In the course of designing or evaluating signal processing algorithms, one often must determine the computational workload needed to implement the algorithms on a digital computer. The floating-point operation (flop) counts for real versions of the most common signal processing kernels are well documented. However, the flop counts for kernels operating on complex inputs are not as readily found. This report collects the flop count expressions for both real and complex kernels and also presents brief outlines of the derivations for the flop count expressions. Specifically, the following computational kernels are addressed: (1) the dimensions of the two multiplicands (m x n and n x p) for the matrix-matrix multiplication; (2) the length of the vector n for the fast Fourier transform; (3) the size of the triangular system n for forward and back substitutions; (4) the dimensions of the input matrix m x n for the Householder QR decomposition, eigenvalue decomposition, and singular value decomposition.
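As an example of the kind of derivation this abstract refers to, the Python sketch below counts flops for a straightforward matrix-matrix product. It uses the common convention that a complex multiplication costs 6 real flops (4 multiplications and 2 additions) and a complex addition costs 2. The report's own expressions are not reproduced in this listing, so the formula and the function name should be read as assumptions.

```python
def matmul_flops(m, n, p, complex_inputs=False):
    """Flop count for multiplying an m x n matrix by an n x p matrix (naive algorithm).

    Each of the m*p outputs needs n multiplications and n - 1 additions.
    """
    mul_cost, add_cost = (6, 2) if complex_inputs else (1, 1)
    return m * p * (n * mul_cost + (n - 1) * add_cost)

real_count = matmul_flops(2, 3, 4)                          # 2*4*(3 + 2)
complex_count = matmul_flops(2, 3, 4, complex_inputs=True)  # 2*4*(18 + 4)
```

For large n the complex count approaches four times the real count (8mnp versus 2mnp), which is the usual rule of thumb when sizing workloads for complex data.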
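TL;DR: In this paper, the authors propose a double-base recoding algorithm for scalar multiplication on supersingular and Koblitz curves that runs in fewer than log n / log log n curve additions and needs storage for at most two additional points, whereas windowed methods require O(log n / log log n) precomputations.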
Abstract: It has been recently acknowledged [4,6,9] that the use of double-base representations of scalars $n$, that is, expressions of the form $n = \sum_{e,s,t} (-1)^e A^s B^t$, can significantly speed up scalar multiplication on those elliptic curves where multiplication by one base (say $B$) is fast. This is the case in particular for Koblitz curves and supersingular curves, where scalar multiplication can now be achieved in $o(\log n)$ curve additions. Previous literature dealt basically with supersingular curves (in characteristic 3, although the methods can be easily extended to arbitrary characteristic), where $A, B \in \mathbb{N}$. Only [4] attempted to provide a similar method for Koblitz curves, where at least one base must be non-real, although their method does not seem practical for cryptographic sizes (it is only asymptotic), since the constants involved are too large. We provide here a unifying theory by proposing an alternate recoding algorithm which works in all cases with optimal constants. Furthermore, it can also solve the until now untreatable case where both $A$ and $B$ are non-real. The resulting scalar multiplication method is then compared to standard methods for Koblitz curves. It runs in less than $\log n / \log\log n$ elliptic curve additions, and is faster than any given method with similar storage requirements already on the curve K-163, with larger improvements as the size of the curve increases, surpassing 50% with respect to the τ-NAF for the curves K-409 and K-571. With respect to windowed methods, which can approach our speed but require $O(\log n / \log\log n)$ precomputations for optimal parameters, we offer the advantage of a fixed, small memory footprint, as we need storage for at most two additional points.
TL;DR: In this paper, the authors describe algorithms for point multiplication on Koblitz curves using multiple-base expansions of the form k = Σ ±τ^a (τ − 1)^b and k = Σ ±τ^a (τ − 1)^b (τ^2 − τ − 1)^c, and prove that the number of terms in the second type is sublinear in the bit length of k.
Abstract: We describe algorithms for point multiplication on Koblitz curves using multiple-base expansions of the form k = Σ ±τ^a (τ − 1)^b and k = Σ ±τ^a (τ − 1)^b (τ^2 − τ − 1)^c. We prove that the number of terms in the second type is sublinear in the bit length of k, which leads to the first provably sublinear point multiplication algorithm on Koblitz curves. For the first type, we conjecture that the number of terms is sublinear and provide numerical evidence demonstrating that the number of terms is significantly less than that of τ-adic non-adjacent form expansions. We present details of an innovative FPGA implementation of our algorithm and performance data demonstrating the efficiency of our method.
TL;DR: It is proved that the number of terms in the second type is sublinear in the bit length of k, which leads to the first provably sublinear point multiplication algorithm on Koblitz curves.
Abstract: We describe algorithms for point multiplication on Koblitz curves using multiple-base expansions of the form k = Σ ±τ^a (τ − 1)^b and k = Σ ±τ^a (τ − 1)^b (τ^2 − τ − 1)^c. We prove that the number of terms in the second type is sublinear in the bit length of k, which leads to the first provably sublinear point multiplication algorithm on Koblitz curves. For the first type, we conjecture that the number of terms is sublinear and provide numerical evidence demonstrating that the number of terms is significantly less than that of τ-adic non-adjacent form expansions. We present details of an innovative FPGA implementation of our algorithm and performance data demonstrating the efficiency of our method.
TL;DR: A low-power, area-efficient four-way 32-bit multifunction arithmetic unit has been developed for programmable shaders for handheld 3D graphics systems and unified into a single arithmetic platform with maximum four-cycle latency and single-cycle throughput.
Abstract: A low-power, area-efficient 128-bit multifunction arithmetic unit has been developed for programmable shaders for handheld 3-D graphics systems. It adopts the logarithmic number system (LNS) at the arithmetic core for single-cycle throughput and the small-size, low-power unification of various complex arithmetic operations such as power, logarithm, trigonometric functions, vector multiplication, division, square root and inner product. An uneven 24-piecewise logarithmic conversion scheme is proposed with a maximum conversion error of 0.8%. A 93K-gate test chip is fabricated with 0.18-μm CMOS technology. It operates at 210 MHz with 15.3 mW power consumption at 1.8 V.
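Illustrative sketch (not the chip's datapath): in a logarithmic number system a value is stored as its base-2 logarithm, so multiplication, division, square root and power reduce to addition, subtraction, halving and scaling of the stored exponent. All function names below are hypothetical, only positive values are handled, and the exact `math.log2` stands in for the paper's 24-piece approximate converter, whose details the abstract does not give.

```python
import math


def to_lns(x):
    """Encode a positive real as its base-2 logarithm (sign handling omitted)."""
    if x <= 0:
        raise ValueError("this sketch only encodes positive values")
    return math.log2(x)


def from_lns(e):
    """Decode a log-domain value back to an ordinary real."""
    return 2.0 ** e


def lns_mul(a, b):
    return a + b      # log2(x * y) = log2(x) + log2(y)


def lns_div(a, b):
    return a - b      # log2(x / y) = log2(x) - log2(y)


def lns_sqrt(a):
    return a / 2      # log2(sqrt(x)) = log2(x) / 2


def lns_pow(a, p):
    return a * p      # log2(x ** p) = p * log2(x), p an ordinary real exponent


if __name__ == "__main__":
    x, y = to_lns(3.0), to_lns(5.0)
    print(from_lns(lns_mul(x, y)))   # approximately 15
    print(from_lns(lns_div(x, y)))   # approximately 0.6
```

Addition and subtraction of two LNS values are the expensive operations in this representation and are not shown here.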
TL;DR: There is some evidence that the traditional third-grade curriculum and instruction emphasizing memorization of multiplication facts produces much less understanding of the basic concepts of multiplication than a standards-based curriculum and instruction emphasizing construction of number sense and meaning for operations.
Abstract: This article summarizes the basic concepts of multiplication and provides some evidence that the traditional third-grade curriculum and instruction emphasizing memorization of multiplication facts produces much less understanding of the basic concepts of multiplication than a standards-based curriculum and instruction emphasizing construction of number sense and meaning for operations. This study also describes a collection of assessment tasks that provided meaningful evidence of children's understandings of basic multiplication concepts, including understandings of the relationships between multiplication and addition.
TL;DR: This work investigates the area, speed, and power trade-offs for implementation of FIR filters using MCM and digit-serial arithmetic, and introduces an algorithm for reducing both the number of adders and subtracters as well as the number of shifts.
Abstract: Multiple constant multiplication (MCM) is an efficient way of implementing several constant multiplications with the same input data. The coefficients are expressed using shifts, adders, and subtracters. By utilizing redundancy between the coefficients the number of adders and subtracters is reduced resulting in a low complexity implementation. However, for digit-serial arithmetic a shift requires a flip-flop, and, hence, the number of shifts should be taken into consideration as well. In this work we investigate the area, speed, power trade-offs for implementation of FIR filters using MCM and digit-serial arithmetic. We also introduce an algorithm for reducing both the number of adders and subtracters as well as the number of shifts.
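Illustrative sketch (not the paper's algorithm): a single constant multiplication can be written with shifts, adders, and subtracters by recoding the constant into canonical signed-digit (CSD) form. The helper names are hypothetical. The sketch handles one positive constant at a time and does not share subexpressions between coefficients, which is the part MCM algorithms add.

```python
def csd_digits(c):
    """Canonical signed-digit recoding of a positive integer.

    Returns digits in {-1, 0, 1}, least significant first, with no two
    adjacent nonzero digits.
    """
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)   # +1 when c mod 4 == 1, -1 when c mod 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute c * x using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


def adder_count(c):
    """Adders/subtracters needed for one constant: nonzero digits minus one."""
    nonzero = sum(1 for d in csd_digits(c) if d != 0)
    return max(nonzero - 1, 0)


if __name__ == "__main__":
    print(csd_digits(7))              # 7 = 8 - 1
    print(shift_add_multiply(13, 45))
    print(adder_count(7))
```

In a digit-serial realization the largest shift amount also matters, since each shift costs a flip-flop; here that is the index of the highest nonzero digit.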
TL;DR: The proposed truncated multiplication scheme has been synthesized on an FPGA platform and gives a better accuracy over area ratio than previous well-known schemes such as the constant correcting and variable correcting truncation schemes (CCT and VCT).
Abstract: This paper presents an error compensation method for truncated multiplication. From two n-bit operands, the operator produces an n-bit product with small error compared to the 2n-bit exact product. The method is based on a logical computation followed by a simplification process. The filtering parameter used in the simplification process helps to control the trade-off between hardware cost and accuracy. The proposed truncated multiplication scheme has been synthesized on an FPGA platform. It gives a better accuracy over area ratio than previous well-known schemes such as the constant correcting and variable correcting truncation schemes (CCT and VCT).
TL;DR: A new fast and secure point multiplication algorithm based on a particular kind of addition chains involving only additions (no doubling), providing a natural protection against side channel attacks is proposed.
Abstract: In this paper, we propose a new fast and secure point multiplication algorithm. It is based on a particular kind of addition chains involving only additions (no doubling), providing a natural protection against side channel attacks. Moreover, we propose new addition formulae that take into account the specific structure of those chains making point multiplication very efficient.
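Illustrative sketch (not the paper's chain construction or addition formulae, which the abstract does not give): one way to multiply using only a generic addition is to write the scalar as a sum of non-consecutive Fibonacci numbers (its Zeckendorf representation) and walk the digits from the top. The function names are hypothetical, and the group operation is passed in as `add`. Apart from the very first step, which adds P to itself, every operation adds two different elements.

```python
def zeckendorf_digits(k):
    """Zeckendorf digits of k >= 1, index 0 matching F_2 = 1, index 1 matching F_3 = 2, ..."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= k:
        fibs.append(fibs[-1] + fibs[-2])
    digits = [0] * len(fibs)
    rem = k
    for i in range(len(fibs) - 1, -1, -1):   # greedy: take the largest Fibonacci that fits
        if fibs[i] <= rem:
            digits[i] = 1
            rem -= fibs[i]
    while digits[-1] == 0:                   # drop unused high positions
        digits.pop()
    return digits


def fibonacci_and_add(k, P, add):
    """Return k * P using only calls to add(a, b)."""
    digits = zeckendorf_digits(k)
    U, V = P, P                              # top digit is 1: U holds F_2 * P, V holds F_1 * P
    for d in reversed(digits[:-1]):
        if d:
            # Fibonacci step plus the contribution of a set digit
            U, V = add(add(U, V), P), add(U, P)
        else:
            # Plain Fibonacci step: (U, V) -> (U + V, U)
            U, V = add(U, V), U
    return U


if __name__ == "__main__":
    # Stand-in group: integers modulo a prime under addition.
    mod_add = lambda a, b: (a + b) % 1009
    print(fibonacci_and_add(100, 5, mod_add))   # equals (100 * 5) % 1009
```

Because every step has the same add-only shape, the operation sequence does not alternate between distinct doubling and addition formulae, which is the property the abstract ties to side-channel resistance.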
TL;DR: In this paper, a system and methods configured for recoding an odd integer and elliptic curve point multiplication are disclosed, having general utility and also specific application to ECC and cryptosystems.
Abstract: Systems and methods configured for recoding an odd integer and elliptic curve point multiplication are disclosed, having general utility and also specific application to elliptic curve point multiplication and cryptosystems. In one implementation, the recoding is performed by converting an odd integer k into a binary representation. The binary representation could be, for example, coefficients for powers of two representing the odd integer. The binary representation is then configured as comb bit-columns, wherein every bit-column is a signed odd integer. Another implementation applies this recoding method and discloses a variation of comb methods that computes elliptic curve point multiplication more efficiently and with fewer saved points than known comb methods. The disclosed point multiplication methods are then modified to be Simple Power Analysis (SPA)-resistant.
TL;DR: New software algorithms for efficient multiplication over F_{2^m} that use a Gaussian normal basis representation are presented and it is concluded that the penalty in multiplication is still sufficiently large to discourage the use of normal bases in software implementations of elliptic curve systems.
Abstract: Fast algorithms for multiplication in finite fields are required for several cryptographic applications, in particular for implementing elliptic curve operations over binary fields F_{2^m}. In this paper, we present new software algorithms for efficient multiplication over F_{2^m} that use a Gaussian normal basis representation. Two approaches are presented, direct normal basis multiplication and a method that exploits a mapping to a ring where fast polynomial-based techniques can be employed. Our analysis, including experimental results on an Intel Pentium family processor, shows that the new algorithms are faster and can use memory more efficiently than previous methods. Despite significant improvements, we conclude that the penalty in multiplication is still sufficiently large to discourage the use of normal bases in software implementations of elliptic curve systems.
TL;DR: The Method of Four Russians for Inversion (M4RI) is an algorithm for solving a dense linear system of boolean equations, the final step of several cryptanalytic attacks, that is much faster in practice than both Gaussian elimination and Strassen's algorithm.
Abstract: Solving a dense linear system of boolean equations is the final step of several cryptanalytic attacks. Examples include stream cipher cryptanalysis via XL and related algorithms, integer factorization, and attacks on the HFE public-key cryptosystem. While both Gaussian Elimination and Strassen’s Algorithm have been proposed as methods, this paper specifies an algorithm that is much faster than both in practice. Performance is formally modeled, and experimental running times are provided, including for the optimal setting of the algorithm’s parameter. The consequences for published attacks on systems are also provided. The algorithm is named Method of Four Russians for Inversion (M4RI), in honor of the matrix multiplication algorithm from which it emerged, the Method of Four Russians Multiplication (M4RM).
TL;DR: In this paper, an efficient architecture for an FPGA symmetry FIR filter that employs M-bit parallel-distributed arithmetic (M-bit PDA) is proposed, where the partial product is pre-calculated and saved into the distributed RAM.
Abstract: An efficient architecture for an FPGA symmetry FIR filter is proposed that employs M-bit parallel-distributed arithmetic (M-bit PDA). The partial product is pre-calculated and saved into the distributed RAM. This eliminates the large amount of logic needed to compute multiplication results. The proposed architecture consumes less area and offers higher speed operation because the multiplier is omitted. Altera APEX20KE is used as a target device. Thus, the proposed architecture has high processing speed and small area.
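Illustrative sketch (not the proposed M-bit PDA architecture): distributed arithmetic replaces the multipliers of an FIR inner product with a table of precomputed coefficient sums, addressed by one bit taken from each input sample. The function names are hypothetical. For simplicity the sketch processes one bit position per iteration, assumes unsigned samples, and does not exploit coefficient symmetry.

```python
def build_lut(coeffs):
    """LUT[addr] = sum of coeffs[k] over every bit k that is set in addr."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(coeffs, samples, width):
    """Compute sum(coeffs[k] * samples[k]) without any multiplication.

    Assumption: samples are unsigned integers below 2**width.
    """
    lut = build_lut(coeffs)                 # precomputed once, like the RAM contents
    acc = 0
    for bit in range(width):
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << k   # gather bit `bit` of every sample
        acc += lut[addr] << bit             # weight the looked-up sum by 2**bit
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 4, 2]
    samples = [5, 7, 1, 12]
    print(da_inner_product(coeffs, samples, width=4))
    print(sum(c * x for c, x in zip(coeffs, samples)))   # reference value
```

The table has 2**len(coeffs) entries, so practical designs split long filters into several smaller tables; processing M bit positions at once trades extra tables for fewer cycles.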
TL;DR: This paper presents power models for multiplication and addition components on FPGAs which can be used at a high-level design description stage to estimate their logic and intra-component routing power consumption.
Abstract: This paper presents power models for multiplication and addition components on FPGAs which can be used at a high-level design description stage to estimate their logic and intra-component routing power consumption. The models presented are parameterized by the word-length of the component and the word-level statistics of its input signals. A key feature of these power models is the ability to handle both zero mean and non-zero mean signals. A method for measuring intra-component routing power consumption is presented, enabling the power models to account for both logic and routing power in components. The resulting models are equations which can be used to estimate the power consumed in an arithmetic component in a fraction of a second at the pre-placement stage of the design flow. The models have a mean relative error of 7.2% compared to bit-level power simulation of the placed-and-routed design.
TL;DR: In this article, the authors give several properties of invertible submodules of multiplication modules, generalizing those of invertible ideals, and characterize faithful multiplication Dedekind modules and derive a number of their properties.
Abstract: All rings are commutative with identity and all modules are unital. We give several properties of invertible submodules of multiplication modules generalizing those of invertible ideals. We give characterizations of faithful multiplication Dedekind modules and derive a number of their properties.
TL;DR: Novel word-level algorithms and implementations for the underlying GF(2^m) multiplication and squaring arithmetic, which enable improved flexibility versus performance tradeoffs, are presented and employed in the design of an efficient flexible ECP architecture.
Abstract: The design of flexible elliptic curve cryptography processors (ECP) is considered in this paper. Novel word-level algorithms and implementations for the underlying GF(2^m) multiplication and squaring arithmetic, which enable improved flexibility versus performance tradeoffs, are presented and employed in the design of an efficient flexible ECP architecture; corresponding field-programmable gate-array (FPGA) prototyping results for two different processor word lengths are also included for evaluation.
TL;DR: The trade-off between adders/subtracters and registers is discussed, and implementation results for area, speed, and power for different realizations are presented.
Abstract: Multiple constant multiplication (MCM), i.e., realizing a number of constant multiplications using a minimum number of adders and subtracters, has been an active research area for the last decade. Almost all work has been focused on single-rate FIR filters. However, for polyphase interpolation and decimation FIR filters there are two different implementation alternatives. For interpolation, direct form subfilters lead to fewer registers as they can be shared among the subfilters. The arithmetic part corresponds to a matrix-vector multiplication. Using transposed direct form subfilters, the registers cannot be shared, while the arithmetic part has the same input to all coefficients, and, hence, the redundancy between the coefficients is expected to be higher. For decimation filters the opposite holds for direct form and transposed direct form subfilters. In this work we discuss the trade-off between adders/subtracters and registers, and present implementation results for area, speed, and power for different realizations.