Open Access · Posted Content
Accelerating Implicit Finite Difference Schemes Using a Hardware Optimized Tridiagonal Solver for FPGAs
TL;DR: This paper presents the Thomas Core, a design and implementation of the Thomas algorithm optimized for hardware acceleration on an FPGA. The core provides an efficient and scalable accelerator for many numerical computations, and the authors investigate the use and limitations of fixed-point arithmetic in the algorithm.
Abstract: We present a design and implementation of the Thomas algorithm optimized for hardware acceleration on an FPGA, the Thomas Core. The hardware-based algorithm, combined with the custom data flow and low-level parallelism available in an FPGA, reduces the overall complexity from 8N to 5N serial arithmetic operations and almost halves the overall latency by parallelizing the two costly divisions. Combining this with a data streaming interface, we reduce memory overheads to two vectors of length N per tridiagonal system of size N to be solved. The Thomas Core allows multiple independent tridiagonal systems to be solved continuously in parallel, providing an efficient and scalable accelerator for many numerical computations. Finally, we present applications to derivatives pricing problems using implicit finite difference schemes on an FPGA-accelerated system, and we investigate the use and limitations of fixed-point arithmetic in our algorithm.
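For orientation, here is a minimal software sketch of the standard (textbook) Thomas algorithm that the abstract takes as its starting point. It is not the Thomas Core design itself. The function name `thomas_solve` and the argument layout are assumptions made for illustration. Counting the arithmetic in the two loops gives 6 operations per row in the forward sweep and 2 in the back substitution, which appears to correspond to the 8N figure quoted above. The restructuring that brings this down to 5N serial operations is hardware-specific and is not reproduced here.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the textbook Thomas algorithm.

    Hypothetical layout (an assumption, not taken from the paper):
      a: sub-diagonal, a[0] is unused
      b: main diagonal
      c: super-diagonal, c[-1] is unused (set it to 0)
      d: right-hand side
    All four are sequences of length N. No pivoting is performed, so the
    system is assumed to be well conditioned (e.g. diagonally dominant).
    """
    n = len(d)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = c[0] / b[0]
    d_prime[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * c_prime[i - 1]
        # These two divisions share a denominator and do not depend on
        # each other, so hardware can evaluate them side by side.
        c_prime[i] = c[i] / denom
        d_prime[i] = (d[i] - a[i] * d_prime[i - 1]) / denom

    # Back substitution: recover the solution from the last row upward.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Usage: a 4x4 system with 2 on the diagonal and -1 off the diagonal,
# built so that the exact solution is [1, 2, 3, 4].
solution = thomas_solve(
    a=[0.0, -1.0, -1.0, -1.0],
    b=[2.0, 2.0, 2.0, 2.0],
    c=[-1.0, -1.0, -1.0, 0.0],
    d=[0.0, 0.0, 0.0, 5.0],
)
```

In this sketch `solution` comes out close to `[1.0, 2.0, 3.0, 4.0]`, up to floating-point rounding. The comment in the forward sweep marks the two divisions that the abstract describes as being parallelized.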
Figures

Table 6.2 gives the expected rounding error for the number of fractional bits used in the fixed-point design. The expected rounding error e_rnd(x, f) for rounding a floating-point number x to a fixed-point representation with f fractional bits is given by: 
Table 3 The average time (ms) for computing the solution to tridiagonal systems (N=100) on a desktop CPU and the implemented FPGA Thomas solver. 
Table 1 FPGA resources used for each design and percentages of resources used on the Xilinx Zynq7020. 
Table 2 Clock cycle latency for each of the components of the Thomas solver core. 
Fig. 1 Data dependency graph for the forward iteration of the Thomas algorithm 
Fig. 6 Average absolute error over 5000 tridiagonal systems of the fixed-point results using 14 fractional bits with respect to floating-point results.
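The captions above refer to rounding floating-point values to a fixed-point representation with a given number of fractional bits (14 in Fig. 6). As a rough illustration only, the sketch below quantizes a value to f fractional bits with round-to-nearest and measures the resulting error. The helper names `to_fixed` and `rounding_error` are hypothetical, the integer word length and overflow are ignored, and the half-LSB bound mentioned in the comments is a general property of round-to-nearest, not necessarily the expression used in the paper.

```python
def to_fixed(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits.

    Hypothetical helper: models only the fractional precision of a
    fixed-point format, not its integer range or overflow behavior.
    """
    scale = 1 << frac_bits  # 2**frac_bits
    return round(x * scale) / scale


def rounding_error(x, frac_bits):
    """Absolute error introduced by quantizing x to frac_bits fractional bits."""
    return abs(to_fixed(x, frac_bits) - x)


# With round-to-nearest the error never exceeds half of the least
# significant bit, i.e. 2**-(frac_bits + 1).
FRAC_BITS = 14  # the value quoted in the Fig. 6 caption
half_lsb = 2.0 ** -(FRAC_BITS + 1)
samples = [0.1, 1.0 / 3.0, 3.14159, -2.71828]
errors = [rounding_error(v, FRAC_BITS) for v in samples]
```

Each entry of `errors` is at most `half_lsb`. Values that are already exact multiples of 2**-14, such as 0.5, are reproduced without error.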
Citations
Performance Optimization of Tridiagonal Matrix Algorithm [TDMA] on Multicore Architectures: Computational Framework and Mathematical Modelling
Anishchandran Chathalingath, Arun Manoharan +1 more
- 01 Oct 2019
TL;DR: This paper presents a meta-modelling architecture suitable for multi-core, single-core and mixed-core computing using the TDMA/TDMA/SIMD architecture.