Top 144 papers published in the topic of Rounding in 2007

Showing papers on "Rounding" published in 2007

Journal Article•10.1145/1236463.1236468•

MPFR: A multiple-precision binary floating-point library with correct rounding

Laurent Fousse, Guillaume Hanrot, Vincent Lefèvre, Patrick Pélissier, Paul Zimmermann

01 Jun 2007-ACM Transactions on Mathematical Software

TL;DR: This article presents a multiple-precision binary floating-point library, written in ISO C and based on the GNU MP library, that extends ideas from the IEEE 754 standard to arbitrary precision by providing correct rounding and exceptions.

Abstract: This article presents a multiple-precision binary floating-point library, written in the ISO C language, and based on the GNU MP library. Its particularity is to extend ideas from the IEEE 754 standard to arbitrary precision by providing correct rounding and exceptions. We demonstrate how these strong semantics are achieved, with no significant slowdown with respect to other arbitrary-precision tools, and discuss a few applications where such a library can be useful.
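
Correct rounding at a chosen precision can be illustrated without MPFR itself. The sketch below is an assumption-laden stand-in: it uses Python's standard `decimal` module (decimal rather than binary arithmetic, and not MPFR's API), and the helper name `divide_rounded` is hypothetical. It rounds the exact quotient 1/3 once to 30 digits under three rounding modes.

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN

def divide_rounded(a, b, precision, rounding):
    """Divide a by b; the exact quotient is rounded once to `precision` digits."""
    ctx = Context(prec=precision, rounding=rounding)
    return ctx.divide(Decimal(a), Decimal(b))

# The same operation under three rounding modes at 30 significant digits.
down = divide_rounded(1, 3, 30, ROUND_FLOOR)          # largest 30-digit value <= 1/3
up = divide_rounded(1, 3, 30, ROUND_CEILING)          # smallest 30-digit value >= 1/3
nearest = divide_rounded(1, 3, 30, ROUND_HALF_EVEN)   # closest 30-digit value to 1/3
```

The directed results bracket the exact value and differ by one unit in the last place, which is the property that makes correctly rounded results reproducible across implementations.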

1,112 citations

Journal Article•10.1007/S10107-006-0086-0•

Valid inequalities for mixed integer linear programs

Gérard Cornuéjols¹•Institutions (1)

Carnegie Mellon University¹

19 Jul 2007-Mathematical Programming

TL;DR: This tutorial introduces the necessary tools from polyhedral theory and gives a geometric understanding of several classical families of valid inequalities such as lift-and-project cuts, Gomory mixed integer cuts, mixed integer rounding cuts, split cuts and intersection cuts, and it reveals the relationships between these families.

Abstract: This tutorial presents a theory of valid inequalities for mixed integer linear sets. It introduces the necessary tools from polyhedral theory and gives a geometric understanding of several classical families of valid inequalities such as lift-and-project cuts, Gomory mixed integer cuts, mixed integer rounding cuts, split cuts and intersection cuts, and it reveals the relationships between these families. The tutorial also discusses computational aspects of generating the cuts and their strength.

258 citations

Journal Article•10.1016/J.DAM.2007.02.013•

On Khachiyan's algorithm for the computation of minimum-volume enclosing ellipsoids

Michael J. Todd¹, E. Alper Yıldırım²•Institutions (2)

Cornell University¹, Bilkent University²

01 Aug 2007-Discrete Applied Mathematics

TL;DR: The algorithm is a modification of the algorithm of Kumar and Yildirim, which combines Khachiyan's BCD method with a simple initialization scheme to achieve a slightly improved polynomial complexity result, and which returns a small "core set."

254 citations

Journal Article•10.1137/050641983•

Approximating K-means-type Clustering via Semidefinite Programming

Jiming Peng¹, Yu Wei¹•Institutions (1)

McMaster University¹

01 Feb 2007-Siam Journal on Optimization

TL;DR: This paper first models minimal sum-of-squared-distances clustering (MSSC) as a so-called 0-1 semidefinite programming (SDP) problem, and shows that this model provides a unified framework for several clustering approaches such as normalized k-cut and spectral clustering.

Abstract: One of the fundamental clustering problems is to assign $n$ points into $k$ clusters based on minimal sum-of-squared distances (MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite programming (SDP) problem. We show that our 0-1 SDP model provides a unified framework for several clustering approaches such as normalized k-cut and spectral clustering. Moreover, the 0-1 SDP model allows us to solve the underlying problem approximately via the linear programming and SDP relaxations. Second, we consider the issue of how to extract a feasible solution of the original 0-1 SDP model from the optimal solution of the relaxed SDP problem. By using principal component analysis, we develop a rounding procedure to construct a feasible partitioning from a solution of the relaxed problem. In our rounding procedure, we need to solve a K-means clustering problem in $\Re^{k-1}$, which can be done in $O(n^{k^2-2k+2})$ time. In case of biclustering, the running time of our rounding procedure can be reduced to $O(n\log n)$. We show that our algorithm provides a 2-approximate solution to the original problem. Promising numerical results for biclustering based on our new method are reported.

246 citations

Journal Article•10.1002/SIM.2619•

Robustness of a multivariate normal approximation for imputation of incomplete binary data.

Coen Bernaards¹, Thomas R. Belin², Joseph L. Schafer³•Institutions (3)

Genentech¹, University of California, Los Angeles², Pennsylvania State University³

15 Mar 2007-Statistics in Medicine

TL;DR: Three alternative methods for converting a multivariate normal imputation value into a binary imputed value are explored, finding that adaptive rounding provided the best performance.

Abstract: Multiple imputation has become easier to perform with the advent of several software packages that provide imputations under a multivariate normal model, but imputation of missing binary data remains an important practical problem. Here, we explore three alternative methods for converting a multivariate normal imputed value into a binary imputed value: (1) simple rounding of the imputed value to the nearer of 0 or 1, (2) a Bernoulli draw based on a 'coin flip' where an imputed value between 0 and 1 is treated as the probability of drawing a 1, and (3) an adaptive rounding scheme where the cut-off value for determining whether to round to 0 or 1 is based on a normal approximation to the binomial distribution, making use of the marginal proportions of 0's and 1's on the variable. We perform simulation studies on a data set of 206,802 respondents to the California Healthy Kids Survey, where the fully observed data on 198,262 individuals defines the population, from which we repeatedly draw samples with missing data, impute, calculate statistics and confidence intervals, and compare bias and coverage against the true values. Frequently, we found satisfactory bias and coverage properties, suggesting that approaches such as these that are based on statistical approximations are preferable in applied research to either avoiding settings where missing data occur or relying on complete-case analyses. Considering both the occurrence and extent of deficits in coverage, we found that adaptive rounding provided the best performance. Copyright © 2006 John Wiley & Sons, Ltd.
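
The three conversions described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code. The abstract gives no formula for the adaptive cut-off, so the one used here (mean minus the standard normal quantile at the mean, times the binomial standard deviation) is an assumption recalled from the literature and should be checked against the paper. Clipping values outside [0, 1] in the coin-flip method is also an assumption.

```python
import random
from statistics import NormalDist, fmean

def simple_round(value):
    """Method 1: round the imputed value to the nearer of 0 or 1."""
    return 1 if value >= 0.5 else 0

def coin_flip_round(value, rng):
    """Method 2: Bernoulli draw with the imputed value as P(1).
    Values outside [0, 1] are clipped (an assumption of this sketch)."""
    p = min(1.0, max(0.0, value))
    return 1 if rng.random() < p else 0

def adaptive_threshold(imputed_values):
    """Method 3 cut-off (assumed form): m - Phi^{-1}(m) * sqrt(m * (1 - m)),
    where m is the mean of the imputed values for the variable."""
    m = fmean(imputed_values)
    m = min(max(m, 1e-9), 1 - 1e-9)  # keep the normal quantile finite
    return m - NormalDist().inv_cdf(m) * (m * (1 - m)) ** 0.5

def adaptive_round(value, threshold):
    """Method 3: round to 1 when the imputed value reaches the adaptive cut-off."""
    return 1 if value >= threshold else 0
```

Under this assumed formula the cut-off equals 0.5 when the marginal proportion is 0.5 and moves above 0.5 when 1's are rare, so fewer borderline values are rounded up for a rare outcome.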

214 citations

Proceedings Article•10.1145/1250790.1250808•

An approximation algorithm for max-min fair allocation of indivisible goods

Arash Asadpour¹, Amin Saberi¹•Institutions (1)

Stanford University¹

11 Jun 2007

TL;DR: This paper gives the first approximation algorithm for the problem of max-min fair allocation of indivisible goods and designs an iterative method for rounding a fractional matching on a tree which might be of independent interest.

Abstract: In this paper we give the first approximation algorithm for the problem of max-min fair allocation of indivisible goods. The approximation ratio of our algorithm is $\Omega\left(\frac{1}{\sqrt{k}\,\log^3 k}\right)$. As a part of our algorithm, we design an iterative method for rounding a fractional matching on a tree which might be of independent interest.

197 citations

Journal Article•10.1007/S10732-007-9009-3•

Local search heuristics for Quadratic Unconstrained Binary Optimization (QUBO)

Endre Boros¹, Peter L. Hammer¹, Gabriel Tavares¹•Institutions (1)

Rutgers University¹

01 Apr 2007-Journal of Heuristics

TL;DR: A family of local-search-based heuristics for Quadratic Unconstrained Binary Optimization (QUBO), all of which start with a (possibly fractional) initial point, sequentially improving its quality by rounding or switching the value of one variable, until arriving at a local optimum.

Abstract: We present a family of local-search-based heuristics for Quadratic Unconstrained Binary Optimization (QUBO), all of which start with a (possibly fractional) initial point, sequentially improving its quality by rounding or switching the value of one variable, until arriving at a local optimum. The effects of various parameters on the efficiency of these methods are analyzed through computational experiments carried out on thousands of randomly generated problems having 20 to 2500 variables. Tested on numerous benchmark problems, the performance of the most competitive variant (ACSIOM) was shown to compare favorably with that of other published procedures.
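
A one-flip local search of the kind this abstract describes can be sketched as below. It is a simplified illustration and not any of the paper's variants: it assumes a maximization objective x^T Q x, rounds the whole starting point at once rather than one variable at a time, and recomputes the objective from scratch after each flip. The function names are hypothetical.

```python
def qubo_value(Q, x):
    """Objective x^T Q x for a 0/1 vector x and a square matrix Q (list of lists)."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def one_flip_local_search(Q, start):
    """Round a possibly fractional start point, then flip single variables
    for as long as a flip strictly improves the (maximized) objective."""
    x = [1 if v >= 0.5 else 0 for v in start]
    best = qubo_value(Q, x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] ^= 1                      # tentatively switch variable i
            value = qubo_value(Q, x)
            if value > best:
                best, improved = value, True
            else:
                x[i] ^= 1                  # no gain: undo the switch
    return x, best
```

For `Q = [[1, -2], [-2, 1]]` and start `[0.6, 0.7]`, rounding gives `[1, 1]` with value -2, and a single flip reaches the local optimum `[0, 1]` with value 1.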

187 citations

Patent•

Resampling and picture resizing operations for multi-resolution video coding and decoding

Gary J. Sullivan¹•Institutions (1)

Microsoft¹

8 Jan 2007

TL;DR: Techniques and tools are described for high-accuracy position calculation for picture resizing in applications such as spatially-scalable video coding and decoding, in which resampling of a video picture is performed according to a resampling scale factor.

Abstract: Techniques and tools for high accuracy position calculation for picture resizing in applications such as spatially-scalable video coding and decoding are described. In one aspect, resampling of a video picture is performed according to a resampling scale factor. The resampling comprises computation of a sample value at a position i, j in a resampled array. The computation includes computing a derived horizontal or vertical sub-sample position x or y in a manner that involves approximating a value in part by multiplying a $2^n$ value by an inverse (approximate or exact) of the upsampling scale factor. The approximating can be a rounding or some other kind of approximating, such as a ceiling or floor function that approximates to a nearby integer. The sample value is interpolated using a filter.
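
The general idea of scaling an inverse scale factor by a power of two and rounding it to an integer can be sketched as follows. This is a generic fixed-point illustration, not the formula claimed in the patent. The function names, the choice of round-to-nearest, and the omission of any phase offset are all assumptions.

```python
def inverse_scale_fixed(in_size, out_size, n_bits):
    """Integer approximation of 2**n_bits / scale, with scale = out_size / in_size.
    Rounds to nearest; a floor or ceiling could be substituted here."""
    return ((in_size << n_bits) + out_size // 2) // out_size

def source_position(i, inv_scale, n_bits):
    """Map output index i to (integer sample index, fractional part in 1/2**n_bits units)."""
    pos = i * inv_scale
    return pos >> n_bits, pos & ((1 << n_bits) - 1)
```

With `in_size=2`, `out_size=4` and `n_bits=16`, the inverse scale is 32768 (0.5 in 16-bit fixed point), and output index 3 maps to source sample 1 plus a half-sample offset. The fractional part would then select the interpolation filter phase.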

142 citations

Journal Article•10.1016/J.IPL.2006.07.009•

An improved approximation ratio for the minimum linear arrangement problem

Uriel Feige¹, James R. Lee²•Institutions (2)

Microsoft¹, Institute for Advanced Study²

16 Jan 2007-Information Processing Letters

TL;DR: It is observed that combining the techniques of Arora, Rao, and Vazirani with the rounding algorithm of Rao and Richa yields an $O(\sqrt{\log n}\,\log\log n)$-approximation for the minimum linear arrangement problem.

107 citations

Journal Article•10.1198/106186007X180336•

A Finite Smoothing Algorithm for Quantile Regression

Colin Chen

01 Mar 2007-Journal of Computational and Graphical Statistics

TL;DR: Numerical comparison shows that the finite smoothing algorithm outperforms the simplex algorithm in computing speed; compared with the interior point algorithm it is competitive overall, and significantly faster when the design matrix in quantile regression has a large number of covariates.

Abstract: This article introduces a new method for computing regression quantile functions. This method applies a finite smoothing algorithm based on smoothing the nondifferentiable quantile regression objective function $\rho_\tau$. The smoothing can be done for all $\tau \in (0, 1)$, and the convergence is finite for any finite number of $\tau_i \in (0, 1)$, $i = 1, \ldots, N$. Numerical comparison shows that the finite smoothing algorithm outperforms the simplex algorithm in computing speed. Compared with the powerful interior point algorithm, which was introduced in an earlier article, it is competitive overall; however, it is significantly faster than the interior point algorithm when the design matrix in quantile regression has a large number of covariates. Additionally, the new algorithm provides the same accuracy as the simplex algorithm. In contrast, the interior point algorithm gives only the approximate solutions in theory, and rounding may be necessary to improve the accuracy of these solutions in practice.

97 citations

Journal Article•10.1016/J.TCS.2007.02.056•

A faster combinatorial approximation algorithm for scheduling unrelated parallel machines

Martin Gairing¹, Burkhard Monien¹, Andreas Woclaw¹•Institutions (1)

University of Paderborn¹

20 Jul 2007-Theoretical Computer Science

TL;DR: This paper replaces the classical technique of solving the LP-relaxation and rounding afterwards by a completely integral approach; it is the first time that a combinatorial algorithm always beats the interior point approach for this problem.

Journal Article•10.1016/J.EJOR.2005.10.053•

On the existence of solutions to the quadratic mixed-integer mean–variance portfolio selection problem

Marco Corazza, Daniela Favaretto

01 Feb 2007-European Journal of Operational Research

TL;DR: This work deals with a suitably defined quadratic mixed-integer programming problem, and presents some rounding procedures for finding a feasible mixed-integer solution which is better than the one detected by the necessary and sufficient conditions in terms of the value assumed by the portfolio variance.

Journal Article•10.3150/07-BEJ6067•

Are volatility estimators robust with respect to modeling assumptions?

Yingying Li¹, Per A. Mykland¹•Institutions (1)

University of Chicago¹

04 Sep 2007-arXiv: Statistical Finance

TL;DR: In this article, the authors consider microstructure as an arbitrary contamination of the underlying latent securities price, through a Markov kernel, and show that, subject to smoothness conditions, the two scales realized volatility is robust to the form of contamination.

Abstract: We consider microstructure as an arbitrary contamination of the underlying latent securities price, through a Markov kernel $Q$. Special cases include additive error, rounding and combinations thereof. Our main result is that, subject to smoothness conditions, the two scales realized volatility is robust to the form of contamination $Q$. To push the limits of our result, we show what happens for some models that involve rounding (which is not, of course, smooth) and see in this situation how the robustness deteriorates with decreasing smoothness. Our conclusion is that under reasonable smoothness, one does not need to consider too closely how the microstructure is formed, while if severe non-smoothness is suspected, one needs to pay attention to the precise structure and also the use to which the estimator of volatility will be put.

Journal Article•10.1287/MOOR.1060.0237•

LP Rounding Approximation Algorithms for Stochastic Network Design

Anupam Gupta¹, R. Ravi¹, Amitabh Sinha²•Institutions (2)

Carnegie Mellon University¹, University of Michigan²

01 May 2007-Mathematics of Operations Research

TL;DR: A novel combination of the primal-dual method truncated based on optimal LP relaxation values, followed by a tree-rounding stage is used to give constant-factor approximation algorithms for the stochastic Steiner tree and single sink network design problems in generalized models.

Abstract: We study the Steiner tree problem and the single-cable single-sink network design problem under a two-stage stochastic model with recourse and finitely many scenarios. In these models, some edges are purchased in a first stage when only probabilistic information about the second stage is available. In the second stage, one of a finite number of specified scenarios is realized, which results in the set of terminals becoming known and the opportunity to purchase additional edges (under an inflated cost function) to augment the first-stage solution. We provide constant factor approximation algorithms for these problems by rounding the linear relaxation of IP formulations of the problems. Our algorithms involve solving the linear relaxation first, followed by a primal-dual routine that is guided by the LP solution. We also show that because our bounds are local (the cost of each component is bounded by its cost in the LP solution), we are able to obtain bounds that guard against a form of downside risk.

Proceedings Article•10.1109/ARITH.2007.26•

P6 Binary Floating-Point Unit

Son Dao Trong¹, Martin S. Schmookler¹, Eric M. Schwarz¹, Michael Kroener¹•Institutions (1)

IBM¹

25 Jun 2007

TL;DR: Division and square root algorithms are also described which take advantage of high-precision linear approximation hardware for obtaining a reciprocal or reciprocal square root approximation.

Abstract: The floating point unit of the next generation PowerPC is detailed. It has been tested at over 5 GHz. The design supports an extremely aggressive cycle time of 13 FO4 using a technology independent measure. For most dependent instructions, its fused multiply-add dataflow has only 6 effective pipeline stages. This is nearly equivalent to its predecessor, the Power 5, even though its technology independent frequency has increased over 70%. Overall the frequency has improved over 100%. It achieves this high performance through aggressive feedback paths, circuit design and layout. The pipeline has 7 stages but data may be fed back to dependent operations prior to rounding and complete normalization. Division and square root algorithms are also described which take advantage of high-precision linear approximation hardware for obtaining a reciprocal or reciprocal square root approximation.

Proceedings Article•10.1109/FOCS.2007.73•

Towards Sharp Inapproximability For Any 2-CSP

Per Austrin¹•Institutions (1)

Royal Institute of Technology¹

21 Oct 2007

TL;DR: It is shown how to reduce the search for a good inapproximability result to a certain numeric minimization problem, and it is conjectured that the restricted type required for the hardness result is in fact no restriction, which would imply that these upper and lower bounds match exactly.

Abstract: We continue the recent line of work on the connection between semidefinite programming-based approximation algorithms and the Unique Games Conjecture. Given any boolean 2-CSP (or more generally, any nonnegative objective function on two boolean variables), we show how to reduce the search for a good inapproximability result to a certain numeric minimization problem. The key objects in our analysis are the vector triples arising when doing clause-by-clause analysis of algorithms based on semidefinite programming. Given a weighted set of such triples of a certain restricted type, which are "hard" to round in a certain sense, we obtain a Unique Games-based inapproximability matching this "hardness" of rounding the set of vector triples. Conversely, any instance together with an SDP solution can be viewed as a set of vector triples, and we show that we can always find an assignment to the instance which is at least as good as the "hardness" of rounding the corresponding set of vector triples. We conjecture that the restricted type required for the hardness result is in fact no restriction, which would imply that these upper and lower bounds match exactly. This conjecture is supported by all existing results for specific 2-CSPs. As an application, we show that Max 2-AND is hard to approximate within 0.87435. This improves upon the best previous hardness of $\alpha_{GW} + \epsilon \approx 0.87856$, and comes very close to matching the approximation ratio of the best algorithm known, 0.87401. It also establishes that balanced instances of Max 2-AND, i.e., instances in which each variable occurs positively and negatively equally often, are not the hardest to approximate, as these can be approximated within a factor $\alpha_{GW}$.

Journal Article•10.1016/J.TCS.2007.03.006•

Approximation schemes for a class of subset selection problems

Kirk Pruhs¹, Gerhard J. Woeginger²•Institutions (2)

University of Pittsburgh¹, Eindhoven University of Technology²

20 Aug 2007-Theoretical Computer Science

TL;DR: An easily applicable algorithmic technique for developing approximation schemes for certain types of combinatorial optimization problems is presented and used to derive the existence of an FPTAS for the scheduling problem of minimizing the weighted number of late jobs under release dates and preemption on a single machine.

Patent•

Reduction of errors during computation of inverse discrete cosine transform

Harinath Garudadri¹, Yuriy Reznik¹•Institutions (1)

Qualcomm¹

25 Jun 2007

TL;DR: Techniques are described to reduce rounding errors during computation of an inverse discrete cosine transform using fixed-point calculations, in which a matrix of scaled coefficients is calculated by multiplying coefficients in a matrix by scale factors.

Abstract: Techniques are described to reduce rounding errors during computation of an inverse discrete cosine transform using fixed-point calculations. According to these techniques, a matrix of scaled coefficients is calculated by multiplying coefficients in a matrix of coefficients by scale factors. Next, a midpoint bias value and a supplemental bias value are added to a DC coefficient of the matrix of scaled coefficients. Next, an inverse discrete cosine transform is applied to the resulting matrix of scaled coefficients. Values in the resulting matrix are then right-shifted in order to derive a matrix of pixel component values. As described herein, the addition of the supplemental bias value to the DC coefficient reduces rounding errors attributable to this right-shifting. As a result, a final version of a digital media file decompressed using these techniques may more closely resemble an original version of a digital media file.
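
The effect of a midpoint bias on a right shift can be shown with plain integers. The sketch below only illustrates that one ingredient: the supplemental bias from the abstract is not modeled, no transform is computed, and the function names are hypothetical.

```python
def shift_truncate(value, shift):
    """Plain arithmetic right shift: always rounds toward minus infinity."""
    return value >> shift

def shift_with_midpoint_bias(value, shift):
    """Add half of the divisor before shifting, so the result rounds to nearest."""
    return (value + (1 << (shift - 1))) >> shift

def mean_error(round_fn, shift):
    """Average of (rounded - exact) over one full period of low-order bits."""
    values = range(1 << shift)
    return sum(round_fn(v, shift) - v / (1 << shift) for v in values) / len(values)
```

For a shift of 3, truncation is biased by -0.4375 on average. The midpoint bias reduces this to +0.0625, so a small residual bias remains after the midpoint correction alone.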

Journal Article•10.1007/S11265-007-0058-5•

A Decimal Floating-Point Divider Using Newton–Raphson Iteration

Liang-Kai Wang¹, Michael J. Schulte¹•Institutions (1)

University of Wisconsin-Madison¹

1 Oct 2007

TL;DR: An efficient arithmetic algorithm and hardware design for decimal floating-point division using an efficient piecewise linear approximation, a modified Newton–Raphson iteration, a specialized rounding technique, and a simplified decimal incrementer and decrementer is presented.

Abstract: Increasing chip densities and transistor counts provide more room for designers to add functionality for important application domains into future microprocessors. As a result of rapid growth in financial, commercial, and Internet-based applications, hardware support for decimal floating-point arithmetic is now being considered by various computer manufacturers, and specifications for decimal floating-point arithmetic have been added to the draft revision of the IEEE-754 Standard for Floating-Point Arithmetic (IEEE P754). In this paper, we present an efficient arithmetic algorithm and hardware design for decimal floating-point division. The design uses an efficient piecewise linear approximation, a modified Newton–Raphson iteration, a specialized rounding technique, and a simplified decimal incrementer and decrementer. Synthesis results show that a 64-bit (16-digit) implementation of the decimal divider, which is compliant with the current version of IEEE P754, has an estimated critical path delay of 0.69 ns (around 13 FO4 inverter delays) when implemented using LSI Logic's 0.11 micron Gflx-P standard cell library.
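
The textbook Newton–Raphson reciprocal iteration x ← x(2 − d·x) can be sketched in decimal arithmetic as below. This is not the paper's modified iteration or its piecewise linear approximation. The seed table indexed by leading digit, the iteration count, and the working precision are all hypothetical choices, and the final correctly rounded quotient (the paper's specialized rounding step) is omitted.

```python
from decimal import Decimal, localcontext

# Hypothetical seed table: roughly 2 / (2k + 1) for leading digit k = 1..9,
# which keeps the initial relative error below about 1/3.
SEEDS = {1: "0.667", 2: "0.400", 3: "0.286", 4: "0.222", 5: "0.182",
         6: "0.154", 7: "0.133", 8: "0.118", 9: "0.105"}

def reciprocal(d, iterations=6, work_digits=40):
    """Approximate 1/d for a Decimal d in [1, 10) by Newton-Raphson iteration."""
    with localcontext() as ctx:
        ctx.prec = work_digits
        x = Decimal(SEEDS[int(d)])       # int() truncates, giving the leading digit
        for _ in range(iterations):
            x = x * (2 - d * x)          # relative error e becomes e**2 each step
        return x
```

Because the error squares at every step, six iterations from a seed error of about 1/3 exceed the 16 digits of a decimal64 significand. A hardware design would then multiply by the dividend and apply a rounding step to obtain the final quotient.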

Journal Article•

A new approximation algorithm for the multilevel facility location problem

Adriana F. Gabor¹, Jan-Kees van Ommeren²•Institutions (2)

Erasmus University Rotterdam¹, University of Twente²

01 Dec 2007-Memorandum (institute of Pacific Relations, American Council)

TL;DR: A new integer programming formulation for the multilevel facility location problem and a novel 3-approximation algorithm based on LP rounding; the linear program has a polynomial number of variables and constraints, making it more efficient than the one commonly used in approximation algorithms for this type of problem.

Abstract: In this paper we propose a new integer programming formulation for the multi-level facility location problem and a novel 3-approximation algorithm based on LP rounding. The linear program we are using has a polynomial number of variables and constraints, and is thus more efficient than the one commonly used in the approximation algorithms for this type of problem.

Proceedings Article•10.1109/ASAP.2007.4429967•

Hardware Design of a Binary Integer Decimal-based IEEE P754 Rounding Unit

Charles Tsen¹, Michael J. Schulte¹, Sonia Gonzalez-Navarro²•Institutions (2)

University of Wisconsin-Madison¹, University of Málaga²

9 Jul 2007

TL;DR: This paper presents a hardware design for a rounding unit for 64-bit DFP numbers (decimal64) that use the IEEE P754 binary encoding of DFP numbers, which is widely known as the Binary Integer Decimal (BID) encoding.

Abstract: Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it were recently added to the draft revision of the IEEE 754 Standard (IEEE P754). In this paper, we present a hardware design for a rounding unit for 64-bit DFP numbers (decimal 64) that use the IEEE P754 binary encoding of DFP numbers, which is widely known as the Binary Integer Decimal (BID) encoding. We summarize the technique used for rounding, present the theory and design of the BID rounding unit, and evaluate its critical path delay, latency, and area for combinational and pipelined designs. Over 86% of the rounding unit's area is due to a 55-bit by 54-bit binary multiplier, which can be shared with a double-precision binary floating-point multiplier. To our knowledge, this is the first hardware design for rounding IEEE P754 BID-encoded DFP numbers.
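
What rounding a BID-style value involves can be sketched in software: the coefficient is held as a binary integer, so trimming it to 16 decimal digits requires counting its decimal digits and dividing by a power of ten. The sketch below is not the paper's hardware algorithm. It assumes a nonnegative coefficient, round-half-even only, and uses `divmod` where a hardware unit would need some other mechanism. The function name is hypothetical.

```python
def round_bid_coefficient(coefficient, exponent, digits=16):
    """Round coefficient * 10**exponent to at most `digits` decimal digits
    using round-half-even. Returns the new (coefficient, exponent)."""
    excess = len(str(coefficient)) - digits   # decimal digits beyond the limit
    if excess <= 0:
        return coefficient, exponent          # already fits, nothing to round
    scale = 10 ** excess
    quotient, remainder = divmod(coefficient, scale)
    half = scale // 2
    if remainder > half or (remainder == half and quotient % 2 == 1):
        quotient += 1                         # round up (ties go to even)
    if quotient == 10 ** digits:              # carry-out, e.g. 99...9 -> 100...0
        quotient //= 10
        exponent += 1
    return quotient, exponent + excess
```

For example, the 17-digit coefficient 12345678901234565 is a tie and rounds to the even neighbor 1234567890123456 with the exponent raised by one.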

Book Chapter•10.1007/978-3-540-74171-8_96•

Optimal components selection for analog active filters using clonal selection algorithms

Min Jiang¹, Zhenkun Yang¹, Zhaohui Gan¹•Institutions (1)

Wuhan University of Science and Technology¹

21 Aug 2007

TL;DR: The clonal selection algorithm (CSA) is applied to searching for optimal components for 4th-order Butterworth filter design, and simulation results demonstrate that the proposed method is much superior to the conventional means.

Abstract: In the design and realization of analog electronic circuits, we usually use preferred-value components, and the performance of practical circuits often deviates from the ideal design target because the calculated component values are rounded to preferred ones. A best combination of preferred-value components exists in general, but the search space of all combinations of preferred-value components is huge. The clonal selection algorithm (CSA) is a widely used approach for handling optimization problems. In this paper, CSA is applied to searching for optimal components for 4th-order Butterworth filter design. Simulation results demonstrate that the proposed method is much superior to the conventional means. This method can also be applied to other types of filter design.

Proceedings Article•10.1109/ICCD.2007.4601914•

A radix-10 SRT divider based on alternative BCD codings

Alvaro Vazquez¹, Elisardo Antelo¹, Paolo Montuschi²•Institutions (2)

University of Santiago de Compostela¹, Polytechnic University of Turin²

1 Oct 2007

TL;DR: The rough area-delay estimations performed show that the proposed radix-10 floating-point divider has a similar latency but less hardware complexity than a recently published high performance digit-by-digit implementation.

Abstract: In this paper we present the algorithm and architecture of a radix-10 floating-point divider based on an SRT non-restoring digit-by-digit algorithm. The algorithm uses conventional techniques developed to speed up radix-$2^k$ division such as signed-digit (SD) redundant quotient and digit selection by constant comparison using a carry-save estimate of the partial remainder. To optimize area and latency for decimal, we include novel features such as the use of alternative BCD codings to represent decimal operands, estimates by truncation at any binary position inside a decimal digit, a single customized fast carry propagate decimal adder for partial remainder computation, initial odd multiple generation and final normalization with rounding, and register placement to exploit advanced high fanin mux-latch circuits. The rough area-delay estimations performed show that the proposed divider has a similar latency but less hardware complexity (1.3 area ratio) than a recently published high performance digit-by-digit implementation.

Journal Article•10.1016/J.CAM.2005.07.038•

Super-fast validated solution of linear systems

Siegfried M. Rump¹, Takeshi Ogita²•Institutions (2)

Hamburg University of Technology¹, Waseda University²

15 Feb 2007-Journal of Computational and Applied Mathematics

TL;DR: This paper presents a super-fast validation algorithm for linear systems with a symmetric positive definite matrix, where "super-fast" means that the entire computing time for the validation algorithm, including computation of an approximate solution, is the same as for a standard numerical algorithm.

Proceedings Article•10.1109/ICCD.2007.4601915•

Hardware design of a Binary Integer Decimal-based floating-point adder

Charles Tsen¹, Sonia Gonzalez-Navarro², Michael J. Schulte¹•Institutions (2)

University of Wisconsin-Madison¹, University of Málaga²

1 Oct 2007

TL;DR: This paper presents a novel algorithm and hardware design for a DFP adder that performs addition and subtraction on 64-bit operands that use the IEEE P754 binary encoding of DFP numbers, widely known as the binary integer decimal (BID) encoding.

Abstract: Because of the growing importance of decimal floating-point (DFP) arithmetic, specifications for it are included in the IEEE Draft Standard for Floating-point Arithmetic (IEEE P754). In this paper, we present a novel algorithm and hardware design for a DFP adder. The adder performs addition and subtraction on 64-bit operands that use the IEEE P754 binary encoding of DFP numbers, widely known as the binary integer decimal (BID) encoding. The BID adder uses a novel hardware component for decimal digit counting and an enhanced version of a previously published BID rounding unit. By adding more sophisticated control, operations are performed with variable latency to optimize for common cases. We show that a BID-based DFP adder design can be achieved with a modest area increase compared to a single 2-stage pipelined 64-bit fixed-point multiplier. Over 70% of the BID adder's area is due to the 64-bit fixed-point multiplier, which can be shared with a binary floating-point multiplier and hardware for other DFP operations. To our knowledge, this is the first hardware design for adding and subtracting IEEE P754 BID-encoded DFP numbers.

An FPGA implementation of pipelined multiplicative division with

Ronen Goldberg, Guy Even

1 Jan 2007

TL;DR: An FPGA implementation of double precision floating-point division with IEEE rounding with a total latency that is 2.6 times smaller than the latency of the fastest previous implementation on FPGAs is reported.

Abstract: We report the results of an FPGA implementation of double precision floating-point division with IEEE rounding. We achieve a total latency (i.e., cycles times clock period) that is 2.6 times smaller than the latency of the fastest previous implementation on FPGAs. The amount of hardware, on the other hand, is comparable to commercial cores. The division circuit is based on Goldschmidt's algorithm. All IEEE rounding modes are supported and are implemented using dewpoint rounding. The precision of the initial approximation of the reciprocal is 14 bits. To save hardware and reduce the critical path, a half-sized 62×30 Booth radix-8 multiplier is used. This multiplier can receive both the multiplicand and the multiplier in carry-save representation. The division circuit is partitioned into four pipeline stages, has a latency of 11 cycles, and may restart a new double precision division operation after 8 cycles. Synthesis results of an implementation (not including the computation of the initial approximation of the reciprocal and the exponent path) guarantee a clock frequency of 131 MHz on an Altera Stratix II using 3592 ALMs. The implementation was successfully tested with over 10 million random vectors as well as over a million hard-to-round vectors.
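
Goldschmidt's algorithm, on which the abstract says the circuit is based, multiplies numerator and denominator by the same correction factor until the denominator converges to 1. The sketch below uses ordinary Python floats and omits the final IEEE rounding step. The function names are hypothetical, and the 14-bit seed is computed here with a division as a stand-in for whatever approximation hardware would supply.

```python
def initial_reciprocal(d, bits=14):
    """Stand-in for a reciprocal approximation: 1/d rounded to `bits` fractional bits."""
    return round((1 << bits) / d) / (1 << bits)

def goldschmidt_divide(n, d, x0, iterations=3):
    """Approximate n / d given a seed x0 close to 1/d.
    Invariant: num / den == n / d, while den is driven towards 1."""
    num = n * x0
    den = d * x0
    for _ in range(iterations):
        factor = 2.0 - den        # if den = 1 - e, then den * factor = 1 - e**2
        num *= factor
        den *= factor
    return num
```

With a seed accurate to about 14 bits, the error in `den` roughly squares at each step, so a few iterations are enough to exhaust double precision before the final rounding.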

Journal Article•10.1007/S10479-007-0171-7•

Lifting, superadditivity, mixed integer rounding and single node flow sets revisited

Quentin Louveaux¹, Laurence A. Wolsey¹•Institutions (1)

Université catholique de Louvain¹

05 May 2007-Annals of Operations Research

TL;DR: This survey gives a unified presentation of a variety of results on the lifting of valid inequalities, together with a standard procedure that combines mixed integer rounding with lifting to develop strong valid inequalities for knapsack and single node flow sets.

Abstract: In this survey we attempt to give a unified presentation of a variety of results on the lifting of valid inequalities, as well as a standard procedure combining mixed integer rounding with lifting for the development of strong valid inequalities for knapsack and single node flow sets. Our hope is that the latter can be used in practice to generate cutting planes for mixed integer programs. The survey contains essentially two parts. In the first we present lifting in a very general way, emphasizing superadditive lifting which allows one to lift simultaneously different sets of variables. In the second, our procedure for generating strong valid inequalities consists of reduction to a knapsack set with a single continuous variable, construction of a mixed integer rounding inequality, and superadditive lifting. It is applied to several generalizations of the 0–1 single node flow set.
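
To make the "mixed integer rounding inequality" step concrete, the textbook basic form can be written out as follows. This is the standard two-variable case, given here as an illustration under the assumption that it is the building block meant; the survey's own notation and the lifted inequalities it derives may differ.

```latex
% Basic mixed integer rounding (MIR) inequality, textbook form.
% Set: one nonnegative continuous variable s, one integer variable y.
\[
  X = \{ (s, y) \in \mathbb{R}_{+} \times \mathbb{Z} : s + y \ge b \},
  \qquad f = b - \lfloor b \rfloor > 0 .
\]
% The MIR inequality below is valid for X:
\[
  s \;\ge\; f \,\bigl( \lceil b \rceil - y \bigr).
\]
% Check: if y >= ceil(b), the right-hand side is <= 0 <= s.
% If y <= floor(b), then s >= b - y = f + (floor(b) - y)
%                          >= f + f (floor(b) - y) = f (ceil(b) - y),
% because 0 < f < 1 and floor(b) - y >= 0.
%
% Example: b = 2.5 gives f = 0.5 and the cut s >= 0.5 (3 - y).
% The fractional point (s, y) = (0, 2.5) satisfies s + y >= b
% but violates the cut, since 0 < 0.25.
```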

Journal Article•10.1109/TCSVT.2007.896620•

Reference Frame Optimization for Multiple-Path Video Streaming With Complexity Scaling

Gene Cheung¹, Wai-Tian Tan¹, C. Chan¹•Institutions (1)

Hewlett-Packard¹

01 Jun 2007-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: This paper presents two algorithms for jointly selecting the best set of reference frames and their associated transport QoS levels in a multipath streaming setting: one that is globally optimal with high complexity, and one that is locally optimal with lower complexity.

Abstract: Recent video coding standards such as H.264 offer the flexibility to select reference frames during motion estimation for predicted frames. In this paper, we study the optimization problem of jointly selecting the best set of reference frames and their associated transport QoS levels in a multipath streaming setting. The application of traditional Lagrangian techniques to this optimization problem suffers from either bounded worst case error but high complexity or low complexity but undetermined worst case error. Instead, we present two optimization algorithms that solve the problem globally optimally with high complexity and locally optimally with lower complexity. We then present rounding methods to further reduce the computational complexity of the second dynamic programming-based algorithm at the expense of degrading solution quality. Results show that our low-complexity dynamic programming algorithm achieves results comparable to the optimal but high-complexity algorithm, and that a gradual tradeoff between complexity and optimization quality can be achieved by our rounding techniques.

More Instruction Level Parallelism Explains the Actual Efficiency of Compensated Algorithms

Philippe Langlois, Nicolas Louvet

24 Jul 2007

TL;DR: This paper reports numerical experiments showing that the compensated Horner algorithm runs at least twice as fast as the double-double one on modern processors, and explains this efficiency by identifying more instruction level parallelism in the compensated implementation.

Abstract: The compensated Horner algorithm and the Horner algorithm with double-double arithmetic improve the accuracy of polynomial evaluation in IEEE-754 floating point arithmetic. Both yield a polynomial evaluation as accurate as if it was computed with the classic Horner algorithm in twice the working precision. Both algorithms also share the same low-level computation of the floating point rounding errors and cost a similar number of floating point operations. We report numerical experiments showing that the compensated algorithm runs at least twice as fast as the double-double one on modern processors. We propose to explain this efficiency by identifying more instruction level parallelism in the compensated implementation. This property also applies to other compensated algorithms for summation, dot product and triangular linear system solving. More generally, this paper illustrates how this kind of performance analysis may be useful to highlight the actual efficiency of numerical algorithms.
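
The compensated Horner scheme described in the abstract above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are chosen here, and the exact product error is obtained with Dekker's splitting (an assumption of this sketch; implementations with a fused multiply-add would typically use that instead). Coefficients are given from the highest degree down.

```python
def two_sum(a: float, b: float) -> tuple[float, float]:
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e


def split(a: float) -> tuple[float, float]:
    """Dekker/Veltkamp splitting of a double into two non-overlapping halves."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_prod(a: float, b: float) -> tuple[float, float]:
    """Return (p, e) with p = fl(a * b) and a * b = p + e (barring over/underflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e


def horner(coeffs: list[float], x: float) -> float:
    """Classic Horner evaluation in working precision."""
    r = coeffs[0]
    for c in coeffs[1:]:
        r = r * x + c
    return r


def compensated_horner(coeffs: list[float], x: float) -> float:
    """Horner evaluation plus a running correction built from rounding errors."""
    r = coeffs[0]
    comp = 0.0
    for c in coeffs[1:]:
        p, prod_err = two_prod(r, x)  # rounding error of the multiplication
        r, sum_err = two_sum(p, c)    # rounding error of the addition
        comp = comp * x + (prod_err + sum_err)
    return r + comp


if __name__ == "__main__":
    # (x - 1)^5 expanded: ill-conditioned near x = 1.
    coeffs = [1.0, -5.0, 10.0, -10.0, 5.0, -1.0]
    print(horner(coeffs, 1.001), compensated_horner(coeffs, 1.001))
```

Within one loop step, the correction update `comp = comp * x + ...` does not feed back into the main recurrence on `r`, so the two chains can overlap in the processor. This is a plausible reading of the instruction level parallelism the abstract refers to; the double-double variant instead renormalizes its result at every step, which serializes the work.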

Book Chapter•10.1007/978-0-387-71607-7_15•

Integer DEA Models

Sebastián Lozano, Gabriel Villa

1 Jan 2007

TL;DR: In this chapter, a general framework to handle integer inputs and outputs is presented, a number of integer DEA models are reviewed, and the working of the proposed approach is illustrated on a problem from the literature.

Abstract: Conventional Data Envelopment Analysis (DEA) models consider that inputs and outputs are continuous (i.e., real-valued) amounts. However, there are many applications in which one or more inputs and/or outputs are necessarily integer quantities. Commonly, in these situations, the non-integer targets are rounded off. However, rounding off may easily lead to an infeasible target (i.e., out of the Production Possibility Set) or to a dominated operation point. In this chapter, a general framework to handle integer inputs and outputs is presented and a number of integer DEA models are reviewed. We illustrate the working of the proposed approach on a problem from the literature.
