TL;DR: In this article, the relative proportion of plastic work dissipated in fracture and friction is a simple function of stress ratio, and the normality principle is applied to generate a new family of yield loci.
TL;DR: It is concluded that the new rounding algorithm is the fastest rounding algorithm, provided that an injection can be added in during the reduction of the partial products into a carry-save encoded digit string.
Abstract: A new IEEE compliant floating-point rounding algorithm for computing the rounded product from a carry-save representation of the product is presented. The new rounding algorithm is compared with the rounding algorithms of Yu and Zyner (1995) and of Quach et al. (1991). For each rounding algorithm, a logical description and a block diagram are given, the correctness is proven, and the latency is analyzed. We conclude that the new rounding algorithm is the fastest rounding algorithm, provided that an injection (which depends only on the rounding mode and the sign) can be added in during the reduction of the partial products into a carry-save encoded digit string. In double precision format, the latency of the new rounding algorithm is 12 logic levels compared to 14 logic levels in the algorithm of Quach et al. and 16 logic levels in the algorithm of Yu and Zyner.
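The injection idea mentioned in this abstract can be illustrated with a small Python sketch: a mode-dependent constant is added first, so that the final step is a plain truncation. The function name `round_with_injection`, the mode labels, and the non-negative integer model are assumptions made for illustration; this is not the paper's carry-save hardware algorithm.

```python
def round_with_injection(x: int, drop: int, mode: str) -> int:
    """Discard the low `drop` bits of a non-negative integer x.

    Hypothetical mode labels (not taken from the paper):
      "zero"         - truncate
      "up"           - round toward +infinity (x is assumed non-negative)
      "nearest_even" - round to nearest, ties to even
    """
    if x < 0 or drop < 1:
        raise ValueError("expects x >= 0 and drop >= 1")
    ulp = 1 << drop          # weight of the last kept bit
    half = ulp >> 1          # half of that weight
    if mode == "zero":
        injection = 0
    elif mode == "up":
        injection = ulp - 1  # any nonzero discarded bit carries into the kept part
    elif mode == "nearest_even":
        injection = half     # behaves like round-half-up before the tie fix
    else:
        raise ValueError(f"unknown mode: {mode}")
    q = (x + injection) >> drop  # after the injection, truncation is enough
    if mode == "nearest_even" and (x & (ulp - 1)) == half:
        q &= ~1              # exact tie: force the last kept bit to zero
    return q
```

For example, `round_with_injection(5, 1, "nearest_even")` treats 5 as 2.5 in units of the kept bit and returns 2.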
TL;DR: This study reports an investigation of ten-year-old children's strategy use in computational estimation (i.e., give an approximate answer like 400 to an arithmetic problem like 224 + 213) and concluded that children chose strategies in an adaptive way so as to obtain fast and accurate performance.
Abstract: This study reports an investigation of ten-year-old children's strategy use in computational estimation (i.e., give an approximate answer like 400 to an arithmetic problem like 224 + 213). Children used four strategies: rounding with decomposition, rounding without decomposition, truncation, and compensation. Strategies appeared to differ in frequency and effectiveness. Finally, children chose strategies in an adaptive way so as to obtain fast and accurate performance. Implications of these findings for understanding children's computational estimation performance and strategies in numerical cognition in general are discussed.
TL;DR: In this paper, a method for generating a number audio element for playing a desired number in an audio system is presented. The method stores a plurality of audio elements representing a subset of the range of numbers and defines match types used to determine whether one or more matching audio elements exist in that subset; if no exact match exists, the desired number is rounded and an accuracy prefix conveys the rounding error.
Abstract: A method and apparatus for generating a number audio element for playing a desired number in an audio system. Specifically, the method sets forth the steps of storing a plurality of audio elements used to represent a subset of the range of numbers; defining a plurality of match types used to determine if one or more matching audio element exists in the subset of the range of numbers; defining a plurality of accuracy prefixes representative of the error associated with any rounding of the desired number to be played; setting the accuracy prefix to a value representing an exact match between the desired number and a number audio element in the stored subset of audio elements representative of the range of numbers; filtering the audio elements to determine if an exact match exists; if an exact match does not exist, rounding the desired number to a pre-determined level of precision to create an estimated desired number; setting the accuracy prefix to a value representing the error associated with any rounding of the desired number to be played; filtering the audio elements to determine if an exact match exists between the estimated desired number and any of the plurality of audio elements used to represent a subset of the range of numbers; and repeating the steps of filtering until such time as an exact match has been determined between the estimated desired number and any of the plurality of audio elements used to represent a subset of the range of numbers. Once an exact match is determined, the number audio element is transmitted to a remote user. The number audio element may be a stock quote or an announcement of the time. Further, the number audio element may be transmitted in telephone systems, automated teller machines, or other audio systems.
TL;DR: In this paper, the Lanczos basis is used for the solution of symmetric indefinite linear systems, by solving a reduced system in one way or another, and it is shown that the method of solution may lead, under certain circumstances, to large additional errors, which are not corrected by continuing the iteration process.
Abstract: The three-term Lanczos process for a symmetric matrix leads to bases for Krylov subspaces of increasing dimension. The Lanczos basis, together with the recurrence coefficients, can be used for the solution of symmetric indefinite linear systems, by solving a reduced system in one way or another. This leads to well-known methods: MINRES (minimal residual), GMRES (generalized minimal residual), and SYMMLQ (symmetric LQ). We will discuss in what way and to what extent these approaches differ in their sensitivity to rounding errors.
In our analysis we will assume that the Lanczos basis is generated in exactly the same way for the different methods, and we will not consider the errors in the Lanczos process itself. We will show that the method of solution may lead, under certain circumstances, to large additional errors, which are not corrected by continuing the iteration process.
Our findings are supported and illustrated by numerical examples.
TL;DR: In this article, a functional unit in a digital system is provided with a rounding multiplication instruction, wherein a most significant product of a first pair of elements is combined with a least significant product of a second pair of elements, the combined product is rounded, and the final result is stored in a destination.
Abstract: A functional unit in a digital system is provided with a rounding Multiplication instruction, wherein a most significant product of first pair of elements is combined with a least significant product of a second pair of elements, the combined product is rounded, and the final result is stored in a destination. Rounding is performed by adding a rounding value to form an intermediate result, and then shifting the intermediate result right. A combined result is rounded to a fixed length shorter than the combined product.
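The rounding step described in this abstract (add a rounding value, then shift right) can be sketched in Python on plain integers. The function name `rounding_multiply` and the default shift of 15 bits (a Q15-style fixed-point format) are assumptions for illustration; combining products from two element pairs, as the instruction does, is not modeled.

```python
def rounding_multiply(a: int, b: int, shift: int = 15) -> int:
    """Multiply two fixed-point integers and round the product to a shorter length.

    `shift` is the number of low-order product bits to discard (assumed >= 1).
    """
    if shift < 1:
        raise ValueError("shift must be at least 1")
    product = a * b
    rounding_value = 1 << (shift - 1)  # half of the last kept bit
    intermediate = product + rounding_value
    # Python's >> on int is an arithmetic (floor) shift, so ties round toward +infinity.
    return intermediate >> shift
```

With the default shift, `rounding_multiply(16384, 16384)` multiplies 0.5 by 0.5 in Q15 and returns 8192, which is 0.25 in Q15.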
TL;DR: The packet forwarding format and the proposed algorithms constitute a new paradigm for handling data hazards in deeply pipelined floating-point pipelines, and the effective latency of the proposed design is two cycles for successive dependent operations while preserving IEEE 754 binary floating-point compatibility.
Abstract: This paper presents a floating-point addition algorithm and adder pipeline design employing a packet forwarding pipeline paradigm. The packet forwarding format and the proposed algorithms constitute a new paradigm for handling data hazards in deeply pipelined floating-point pipelines. The addition and rounding algorithms employ a four stage execution phase pipeline with each stage suitable for implementation in a short clock period, assuming about 15 logic levels per cycle. The first two cycles are related to addition proper and are the focus of this paper. The last two cycles perform the rounding and have been covered in a paper by D.W. Matula and A.M. Nielsen (1997). The addition algorithm accepts one operand in a standard binary floating-point format at the start of cycle one. The second operand is represented in the packet forwarding floating-point format, which divides it into four parts: the sign bit, the exponent string, the principal part of the significand, and the carry-round packet. The first three parts of the second operand are input at the start of cycle one and the carry-round packet is input at the start of cycle two. The result is output in two formats that both represent the rounded result as required by the IEEE 754 standard. The result is output in the packet forwarding floating-point format at the end of cycles two and three to allow forwarding with an effective latency of two cycles. The result is also output in a standard binary floating-point format at the end of cycle four for retirement to a register. The packet forwarding result is thus available with an effective two cycle latency for forwarding to the start of the adder pipeline or to a cooperating multiplier pipeline accepting a packet forwarding operand. The effective latency of the proposed design is two cycles for successive dependent operations while preserving IEEE 754 binary floating-point compatibility.
TL;DR: This work analyzes several accurate summation methods and finds that two methods are particularly effective to improve reproducibility and stability of large scale scientific simulations, especially climate modeling, on distributed memory parallel computers: Kahan's self-compensated summation and Bailey's double-double precision summation.
Abstract: Numerical reproducibility and stability of large scale scientific simulations, especially climate modeling, on distributed memory parallel computers are becoming critical issues. In particular, global summation of distributed arrays is most susceptible to rounding errors, and their propagation and accumulation cause uncertainty in final simulation results. We analyzed several accurate summation methods and found that two methods are particularly effective to improve (ensure) reproducibility and stability: Kahan's self-compensated summation and Bailey's double-double precision summation. We provide an MPI operator MPI_SUMDD to work with MPI collective operations to ensure a scalable implementation on a large number of processors. The final methods are particularly simple to adopt in practical codes.
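Kahan's self-compensated summation, one of the two methods named in this abstract, can be sketched in a few lines of Python. This is a sequential illustration only; the function name `kahan_sum` is an assumption, and the MPI operator and double-double arithmetic from the paper are not reproduced here.

```python
def kahan_sum(values):
    """Sum floats while carrying a running compensation for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # estimate of the rounding error accumulated so far
    for v in values:
        y = v - compensation            # apply the correction from the previous step
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost (algebraically zero)
        total = t
    return total
```

A naive left-to-right loop over `[1.0, 1e-16, 1e-16]` returns exactly 1.0 because each small term is rounded away, whereas `kahan_sum` on the same input returns a value slightly above 1.0.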
TL;DR: Shortest path rounding trades accuracy for space and time and eliminates the exponential cost introduced by cascading in geometric rounding, and yields a practical solution to numerical issues in computational geometry.
Abstract: Exact implementations of algorithms of computational geometry are subject to exponential growth in running time and space. In particular, coordinate bit-complexity can grow exponentially when algorithms are cascaded: the output of one algorithm becomes the input to the next. Cascading is a significant problem in practice. We propose a geometric rounding technique: shortest path rounding. Shortest path rounding trades accuracy for space and time and eliminates the exponential cost introduced by cascading. It can be applied to all algorithms which operate on planar polygonal regions, for example, set operations, transformations, convex hull, triangulation, and Minkowski sum. Unlike other geometric rounding techniques, shortest path rounding can round vertices to arbitrary lattices, even in polar coordinates, as long as the rounding cells are connected. (Other rounding techniques can only round to the integer grid.) On the integer grid, shortest path rounding introduces less combinatorial change and geometric error than the other rounding methods. Three algorithms are given for shortest path rounding, one of which we have used in industrial application software since 1992. In combination with recent advances in exact floating point evaluation of numerical primitives, shortest path geometric rounding yields a practical solution to numerical issues in computational geometry. Geometric algorithms can be implemented exactly on floating point input coordinates; the exact output coordinates can be rounded to accurate floating point approximations; and the cost of each arithmetic operation is only a little more than if it were implemented as a single hardware floating point operation.
TL;DR: The usefulness of injection-based rounding is demonstrated in a design of an IEEE floating-point multiplier capable of performing either a double-precision multiplication or a single-precision multiplication.
TL;DR: It is proved that it is NP-hard to compute an approximate solution with approximation ratio smaller than 2 with respect to the unweighted l∞-distance associated with the family W2 of all 2 × 2 square regions.
Abstract: In this paper, we discuss the problem of computing an optimal rounding of a real sequence (resp. matrix) into an integral sequence (resp. matrix). Our criterion of the optimality is to minimize the weighted l∞-distance Dist_∞^{F,w}(A, B) between an input sequence (resp. matrix) A and the output B. The distance is dependent on a family F of intervals (resp. rectangular regions) for the sequence rounding (resp. matrix rounding) and positive-valued weight function w on the family. We give efficient polynomial-time algorithms for the sequence-rounding problem for weighted l∞-distance with respect to any weight function w and any family F of intervals. For the matrix-rounding problem, we prove that it is NP-hard to compute an approximate solution with approximation ratio smaller than 2 with respect to the unweighted l∞-distance associated with the family W2 of all 2 × 2 square regions.
TL;DR: In this paper, an implementation method for a single-chip 2048 complex point FFT in terms of sequential data processing is proposed and the convergent block floating point (CBFP) algorithm is used for the effective internal bit rounding.
Abstract: In this paper, we propose an implementation method for a single-chip 2048 complex point FFT in terms of sequential data processing. In order to reduce the required chip area for the sequential processing of 2 K complex data, a DRAM-like pipelined commutator architecture is used. The 16-point FFT is a basic building block of the entire FFT chip, and the 2048-point FFT consists of cascaded blocks with five stages of radix-4 and one stage of radix-2. Since each stage requires rounding of the resulting bits while maintaining the proper S/N ratio, the convergent block floating point (CBFP) algorithm is used for the effective internal bit rounding.
TL;DR: In this paper, a linear programming based approach for solving the data association problem (DAP) in multiple target tracking is presented, which uses an iterated K-scan sliding window technique to solve practical instances of the DAP.
Abstract: In this paper we present a linear programming (LP) based approach for solving the data association problem (DAP) in multiple target tracking. It is well-known that the DAP can be formulated as an integer program. We present a compact formulation of the DAP. To solve practical instances of the DAP we propose an algorithm that uses an iterated K-scan sliding window technique. In each iteration we solve the linear programming relaxation of an integer program and next apply a greedy rounding procedure. Computational experiments indicate that the quality of the solutions found is quite satisfactory.
TL;DR: To the best of the authors' knowledge, this design is the first publication that deals with detecting exceptions and trapped overflow and underflow exceptions as an integral part of the rounding unit in a floating point unit.
Abstract: Engineering design methodology recommends designing a system as follows: Start with an unambiguous specification, partition the system into blocks, specify the functionality of each block, design each block separately, and glue the blocks together. Verifying the correctness of an implementation then reduces to a local verification procedure. We apply this methodology for designing a provably correct IEEE rounding unit that can be used for various operations, such as addition and multiplication. First, we provide a mathematical and, hopefully, unambiguous definition of the IEEE Standard which specifies the functionality. We give explicit and concise rules for gluing the rounding unit with a floating-point adder and multiplier. We then present floating-point addition and multiplication algorithms that use the rounding unit. To the best of our knowledge, our design is the first publication that deals with detecting exceptions and trapped overflow and underflow exceptions as an integral part of the rounding unit in a floating point unit. Our abstraction level avoids bit-level representations and arguments to help clarify the functionality of the algorithm.
TL;DR: An off-the-shelf package is commonly used to generate an approximating polynomial from partial sensor data, where the floating-point coefficients are rounded to the target architecture's size.
Abstract: An off-the-shelf package is commonly used to generate an approximating polynomial from partial sensor data. The floating-point coefficients are rounded to the target architecture's size. Rounding errors can actually be due to this solution space translation. A genetic algorithm can be used to find the optimal coefficient set in the restricted target space.
TL;DR: A very-high radix algorithm and implementation for CORDIC rotation in circular and hyperbolic coordinates is presented and it is shown that this assures convergence from the second iteration on.
Abstract: A very-high radix algorithm and implementation for CORDIC rotation in circular and hyperbolic coordinates is presented. The selection function consists of rounding the residual. It is shown that this assures convergence from the second iteration on. For the first iteration, the selection is done by table, using a lower radix than for the remaining iterations. The compensation of the variable scale factor is done by computing the logarithm of the scale factor and performing the compensation by an exponential. Estimations of the delay for 32-bit and 64-bit precision show a substantial speed up when compared to low radix implementations. The proposed algorithm is also compared with previously proposed very-high radix ones, and significant advantages are identified.
TL;DR: A novel technique is presented for the design of lattice structure PRQMF banks subject to discrete coefficient value constraints, where the coefficient values are quantized sequentially one at a time.
Abstract: The lattice structure quadrature mirror filter (QMF) bank structurally guarantees the perfect reconstruction (PR) property. Thus, it is eminently suitable for hardware realization even under the severe coefficient quantization condition. Nevertheless, its frequency response is still adversely affected by coefficient quantization. In this paper, a novel technique is presented for the design of lattice structure PRQMF banks subject to discrete coefficient value constraints. In our technique, the coefficient values are quantized sequentially one at a time. After each coefficient is quantized, the remaining unquantized coefficient values are reoptimized to partially compensate for the frequency response deviation caused by the quantization of that coefficient value. The order of selection of the coefficients for quantization is based on a coefficient sensitivity measure. Coefficients with higher sensitivity measures are quantized earlier than coefficients with lower sensitivity measures. The improvement in the frequency response ripple magnitude achieved by our algorithm over that by simple rounding of coefficient values differs widely from example to example, ranging from a fraction of a dB to over 10 dB.
TL;DR: This paper presents a topologically correct and efficient version of the algorithm by Guibas and Stolfi (Algorithmica 7 (1992), pp. 381-413) for the exact computation of Delaunay and power triangulations in two dimensions.
Abstract: In this paper we present a topologically correct and efficient version of the algorithm by Guibas and Stolfi (Algorithmica 7 (1992), pp. 381-413) for the exact computation of Delaunay and power triangulations in two dimensions. The algorithm avoids numerical errors and degeneracies caused by the accumulation of rounding errors in fixed length floating point arithmetic when constructing these triangulations. Most methods for computing Delaunay and power triangulations involve the calculation of two basic primitives: the INCIRCLE test and the CCW orientation test. Both primitives require the computation of the sign of a determinant. The key to our method is the exact computation of this sign and is based on an algorithm for determining the sign of the sum of a finite set of normalized floating point numbers of fixed mantissa length (machine numbers) exactly. The exact computation of the primitives allows the construction of the correct Delaunay and power triangulations. The method has been implemented and tested for the incremental construction of Delaunay and power triangulations. Tests have been conducted for different distributions of points for which non-exact algorithms may encounter difficulties, for example, slightly perturbed points on a grid or on a circle. Experimental results show that the performance of our implementation is comparable with that of a simple implementation of the incremental algorithm in single precision floating point arithmetic. For random distribution of points the exact algorithm is only 4 times slower than the inexact implementation. The algorithm is easy to implement, robust and portable as long as the input data to the algorithm remains exact.
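To make the role of the exact sign computation concrete, here is a small Python sketch of the CCW orientation test evaluated without rounding error. It swaps in exact rational arithmetic from the standard library, which is a different and slower technique than the paper's sign-of-sum algorithm on machine numbers; the function name `ccw_sign` is an assumption.

```python
from fractions import Fraction

def ccw_sign(a, b, c) -> int:
    """Return +1 if a, b, c turn counterclockwise, -1 if clockwise, 0 if collinear.

    Each point is an (x, y) pair of ints or floats. Fraction converts a float
    to its exact binary value, so the determinant below is computed exactly.
    """
    ax, ay = Fraction(a[0]), Fraction(a[1])
    bx, by = Fraction(b[0]), Fraction(b[1])
    cx, cy = Fraction(c[0]), Fraction(c[1])
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (det > 0) - (det < 0)
```

Because the inputs are taken at their exact floating point values, the result is the true sign for the given data, which is the property the abstract relies on when it requires the input to remain exact.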
TL;DR: A general rounding theorem is proved, which allows to construct in polynomial time 1-job approximations to the optimum, i.e. schedules with an absolute bound equal to the largest job processing time.
Abstract: We consider the problem of scheduling n independent jobs on m unrelated machines to minimize max(t1, t2, ..., tm), ti being the completion time of machine i. In [1], a polynomial 2-approximation algorithm was suggested for this problem. It was also proved that there can exist no polynomial 1.5-approximation algorithm unless P = NP. Here we improve this earlier performance bound from 2 to 2 - 1/m. In [1], a general rounding theorem is also proved, which allows one to construct in polynomial time 1-job approximations to the optimum, i.e., schedules with an absolute bound equal to the largest job processing time. We also improve this result and obtain a (1 - 1/m)-job approximation to the optimum.
TL;DR: In this paper, the authors present a method and system for scheduling the egress of processed information units (or frames) from a network processing unit according to service based on minimum bandwidth specifications, where position in the queue is adjusted after each service according to the minimum bandwidth specification and the length of the frame.
Abstract: A system and method of moving information units from a network processor toward a data transmission network in a prioritized sequence which accommodates several different levels of service. The present invention includes a method and system for scheduling the egress of processed information units (or frames) from a network processing unit according to service based on minimum bandwidth specifications, where position in the queue is adjusted after each service based on the minimum bandwidth specification and the length of the frame, a process which is subject to rounding errors. To avoid the accumulation of rounding errors inequitably influencing the position of some in the queue, a system to adjust for the rounding errors adds an increased measure of fairness to the system.
TL;DR: It thus appears that one can retrieve low-to-medium degree non-linearities whose contribution to the quantizer's distortion in the relevant frequency range is of the same order as or well below the distortion due to rounding before wobbling.
TL;DR: In this article, a processor representation of a floating-point data item is converted to a representation of a truncated integer item without changing the rounding mode of the processor, even when the current rounding mode is unknown.
Abstract: A processor representation of a floating-point data item is converted to a representation of a truncated integer item, without changing the rounding mode of a processor. When the current rounding mode is unknown, the floating-point item is converted to an integer representation in whatever mode the processor happens to be in. One of multiple correction values is applied, in response to the sign of the original data, a difference between the integer and the original data, and whether the item is an integer. When the current rounding mode is known, the processor produces two integer representations, and selects one or the other of them as an output integer data item, in response to the sign of the original item and the relative sizes of the two representations.
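The unknown-mode case in this abstract can be sketched in Python by converting with whatever rounding happens to be in effect and then correcting toward zero. The function name `truncate_via_any_rounding` and the `convert` callback that stands in for the processor's current rounding mode are assumptions for illustration; the patent's actual correction values are not reproduced.

```python
import math

def truncate_via_any_rounding(x: float, convert) -> int:
    """Return x truncated toward zero, using a conversion of unknown rounding mode.

    `convert` is a hypothetical stand-in for the processor's float-to-integer
    conversion; it must return an integer within 1 of the finite value x.
    """
    r = convert(x)
    if x >= 0 and r > x:
        return r - 1   # rounded up past a non-negative value: step back down
    if x < 0 and r < x:
        return r + 1   # rounded down past a negative value: step back up
    return r           # already an exact integer or already rounded toward zero

# The same answer results whichever stand-in mode is used.
for mode in (math.floor, math.ceil, round):
    assert truncate_via_any_rounding(-2.5, mode) == -2
```

The correction depends only on the sign of the original value and on how the converted integer compares with it, which mirrors the inputs the abstract lists for choosing a correction value.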
TL;DR: In this article, a method and apparatus that performs anticipatory rounding of intermediate results in a floating point arithmetic system while the intermediate results are being normalized is disclosed, which includes four logic levels, implemented in N-NARY logic.
Abstract: A method and apparatus that performs anticipatory rounding of intermediate results in a floating point arithmetic system while the intermediate results are being normalized is disclosed. One embodiment of the present invention includes four logic levels, implemented in N-NARY logic. In the first three logic levels, propagation information is gathered for preselected bit groups from the coarse and medium shift output of the normalizer as those results become available. In the fourth level, an incremented, normalized intermediate single-precision or double-precision mantissa result is produced by combining fine shift output bit values with propagation information for the appropriate top bit group, middle bit group, and bottom bit group. The appropriate bit groups are determined by examining the value of the fine shift select signal.
TL;DR: In this article, the authors present a system and method to efficiently round real numbers, which includes a rounding apparatus to accept an input value that is a real number represented in floating-point format, and to perform a rounding operation on the input value to generate an output value representing an integer.
Abstract: The present invention provides a system and method to efficiently round real numbers. The system includes a rounding apparatus to accept an input value that is a real number represented in floating-point format, and to perform a rounding operation on the input value to generate an output value that is an integer represented in floating-point format. The system also includes a memory to store a computer program that utilizes the rounding apparatus. The system further includes a central processing unit (CPU) to execute the computer program. The CPU is cooperatively connected to the rounding apparatus and the memory.
TL;DR: In this paper, compressed data are decompressed using a first one directional (1D) IDCT followed by a second 1D IDCT, with intermediate products maintained at no more than 16 bits utilizing a round near positive (RNP) rounding scheme.
Abstract: Compressed data are decompressed using an inverse discrete cosine transform (IDCT). A first one directional (1D) IDCT is performed resulting in a plurality of first 1D IDCT coefficients followed by a second 1D IDCT resulting in a plurality of second 1D IDCT coefficients. In performing the first 1D IDCT and the second 1D IDCT a first plurality of intermediate butterfly computations are performed which include performing a plurality of intermediate multiplications resulting in a plurality of initial products and performing a plurality of intermediate additions resulting in intermediate products which are maintained at no more than 16 bits utilizing a round near positive (RNP) rounding scheme. Following the second 1D IDCT a rounding and shifting of the plurality of second 1D IDCT coefficients is performed utilizing a round away from zero (RAZ) rounding scheme resulting in a plurality of output coefficients which comply with the IEEE 1180 standard.
TL;DR: An algorithm that computes a matrix rounding with an error at most 1.75 with respect to the unweighted l∞ distance associated with the family W2 of all 2 × 2 square regions is given, whereas it is proved that it is NP-hard to compute an approximate solution to the matrix-rounding problem with an approximation ratio smaller than 2 for the same distance.
Abstract: In this paper, we discuss the problem of computing an optimal rounding of a real sequence (resp. matrix) into an integral sequence (resp. matrix). Our criterion of the optimality is to minimize the weighted l∞ distance Dist_∞^{F,w}(A, B) between an input sequence (resp. matrix) A and the output B. The distance is dependent on a family F of intervals (resp. rectangular regions) for the sequence rounding (resp. matrix rounding) and positive valued weight function w on the family. We give efficient polynomial time algorithms for the sequence-rounding problem, one for the weighted l∞ distance, and the other for any weight function w, for any family F of intervals. We give an algorithm that computes a matrix rounding with an error at most 1.75 with respect to the unweighted l∞ distance associated with the family W2 of all 2 × 2 square regions, whereas we prove that it is NP-hard to compute an approximate solution to the matrix-rounding problem with an approximation ratio smaller than 2 for the same distance.
TL;DR: Main emphasis is on the length and the rank of cutting plane proofs based on the Gomory-Chvatal rounding principle.
Abstract: Cutting planes were introduced in 1958 by Gomory in order to solve integer linear optimization problems. Since then, they have received a lot of interest, not only in mathematical optimization, but also in logic and complexity theory. In this paper, we present some recent results on cutting planes at the interface of logic and optimization. Main emphasis is on the length and the rank of cutting plane proofs based on the Gomory-Chvatal rounding principle.
TL;DR: In this article, a rounding DOT product instruction is provided, wherein a product of a first pair of elements is combined with a product of a second pair of elements, the combined product is rounded, and the final result is stored in a destination.
Abstract: A functional unit in a digital system is provided with a rounding DOT product instruction, wherein a product of first pair of elements is combined with a product of second pair of elements, the combined product is rounded, and the final result is stored in a destination. Rounding is performed by adding a rounding value to form an intermediate result, and then shifting the intermediate result right. A combined result is rounded to a fixed length shorter than the combined product. The products are combined by either addition or subtraction. An overflow resulting from the combination or from rounding is not reported.
TL;DR: In this paper, a floating point to fixed point converter for determining values for an n-bit frame buffer of a graphics adapter is described, where the converter includes a floating point unit that receives a floating point input value and calculates a floating point adjusted input value from the received value.
Abstract: A floating point to fixed point converter suitable for determining values for an n-bit frame buffer of a graphics adapter is disclosed. The converter includes a floating point unit that receives a floating point input value and calculates a floating point adjusted input value from the received value. Comparator circuitry is configured to compare a fixed point portion of the adjusted input value to a fixed point comparison value and to generate a fixed point output value responsive to the result of the comparison. The floating point unit may add a floating point constant to the received input to calculate the adjusted input value. The floating point constant may include a rounding component and a range component. The range component adjusts received values into a range defined by a single floating point exponent value such as the range from 1.0 to 2.0. In one embodiment, the rounding component shifts received values that are within a range of x/(2^n − 1) ± 1/(2*2^n − 1) into a range bounded by x/(2^n − 1) and (x+1)/(2^n − 1) to take advantage of special properties of the adjusted range's boundary values to simplify the comparator circuitry. The comparator circuitry may include a comparator configured to compare the exponent field of the adjusted input value to 0x7F and a comparator that compares the value of a first portion of the adjusted input value mantissa to a second portion of the mantissa. The converter may produce a frame buffer value of 0x00 if the exponent field is less than 0x7F and a frame buffer value of 0xFF if the exponent field is greater than 0x7F. The converter may generate a frame buffer equal to the value of the first portion of the adjusted input value mantissa if the first portion is greater than the second portion and a frame buffer equal to the value of the first portion decremented by one if the first portion is less than the second portion.
TL;DR: A new architecture for low power floating point multiply-accumulate fusion is presented, which minimises power consumption through transition activity scaling and data path simplifications and speculative rounding.
Abstract: With the proliferation of floating point computing applications, the demand for high performance, low power floating point hardware has increased. A new architecture for low power floating point multiply-accumulate fusion is presented. The proposed architecture minimises power consumption through transition activity scaling and data path simplifications. The switching activity function of the proposed MAF is represented as a four-state FSM. During any given operation cycle, only a limited set of functional subunits are active, during which time, the logic assertion status of the circuit nodes of the unused functional units are maintained at their previous states. Critical path delay and latency are reduced by incorporating data path simplifications and speculative rounding. The scheme offers a worst case power reduction of more than 49%.