Proceedings Article, DOI: 10.1109/ICPADS.2015.86
Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU
Jiarui Fang, Haohuan Fu, He Zhang, Wei Wu, Nanxun Dai, Lin Gan, Guangwen Yang +6 more
- 14 Dec 2015
- pp 641-648
4
TL;DR: The paper presents a set of new optimization strategies for ETE stencils tailored to the memory hierarchy of NVIDIA GPUs and proposes a stencil decomposition method to reduce uncoalesced global memory access.
Abstract: The Explicit Time Evolution (ETE) method is an innovative Finite-Difference (FD) type method for simulating wave propagation in acoustic media with higher spatial and temporal accuracy. However, unlike FD, it is difficult to implement efficiently on a GPU because of the poor memory access patterns caused by the off-axis points and spatially-variant coefficients. In this paper, we present a set of new optimization strategies for ETE stencils tailored to the memory hierarchy of NVIDIA GPUs. To handle the problem caused by the complexity of the stencil shapes, we design a one-to-multi updating scheme for shared memory usage. To alleviate the performance degradation resulting from the poor memory access pattern of reading spatially-variant coefficients, we propose a stencil decomposition method to reduce uncoalesced global memory access. On a state-of-the-art GPU architecture, and in combination with existing spatial and temporal stencil blocking schemes, we achieve 9.6x and 9.9x speedups over a well-tuned 12-core CPU version for 37-point and 73-point ETE stencils, respectively. Compared with a well-tuned MIC version, the best speedups for the two stencil types are 3.7x and 4.7x. Our design leads to an ETE method that is 31.2x faster than the conventional CPU-FD method and makes it a practical seismic imaging technology.
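To make the memory-access issue concrete, here is a minimal sketch of a spatially-variant coefficient stencil. It is not the authors' implementation and does not reproduce their stencil decomposition or one-to-multi updating scheme. The 1D five-point shape, the names `OFFSETS`, `apply_variant_stencil`, `coeff_index_point_major`, and `coeff_index_coeff_major`, and the two coefficient layouts are all assumptions chosen for illustration; the stencils in the paper are 3D with 37 or 73 points.

```python
# Hypothetical 1D stand-in for a spatially-variant coefficient stencil.
# Assumption: a 5-point shape; the paper's ETE stencils are 3D (37 or 73 points).
OFFSETS = (-2, -1, 0, 1, 2)


def apply_variant_stencil(u, coeff):
    """Apply a stencil whose weights differ at every grid point.

    coeff[k][i] is the weight for offset OFFSETS[k] at grid point i.
    Boundary points (closer than the stencil radius to an edge) are left at 0.0.
    """
    n = len(u)
    radius = max(abs(off) for off in OFFSETS)
    out = [0.0] * n
    for i in range(radius, n - radius):
        acc = 0.0
        for k, off in enumerate(OFFSETS):
            # One coefficient read per stencil point per grid point:
            # this is the extra memory traffic a constant-coefficient FD stencil avoids.
            acc += coeff[k][i] * u[i + off]
        out[i] = acc
    return out


def coeff_index_point_major(i, k, n_coeff):
    """Flat index when all coefficients of one grid point are stored together."""
    return i * n_coeff + k


def coeff_index_coeff_major(i, k, n_points):
    """Flat index when each coefficient is stored as its own contiguous array."""
    return k * n_points + i


if __name__ == "__main__":
    n = 8
    u = [1.0] * n
    coeff = [[0.2] * n for _ in OFFSETS]
    print(apply_variant_stencil(u, coeff))
    # Distance in memory between the same coefficient at neighbouring grid points:
    print(coeff_index_point_major(4, 2, len(OFFSETS)) - coeff_index_point_major(3, 2, len(OFFSETS)))
    print(coeff_index_coeff_major(4, 2, n) - coeff_index_coeff_major(3, 2, n))
```

In the point-major layout, neighbouring grid points read the same coefficient at a stride equal to the number of stencil points, whereas the coefficient-major layout gives unit stride. On a GPU, where adjacent threads typically handle adjacent grid points, unit-stride reads are the ones that coalesce, which is the general kind of access-pattern concern the abstract raises.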
Citations
Optimization Techniques for GPU Programming
TL;DR: This survey discusses optimization techniques found in 450 articles published in the last 14 years and analyzes them from different perspectives, showing that the various optimizations are highly interrelated, which explains the need for techniques such as auto-tuning.
54
Solutions to numerical dispersion error of time FD in RTM
Nanxun Dai, Huafeng Liu, Wei Wu +2 more
TL;DR: In this paper, the authors describe two different methods to resolve the time numerical dispersion issue: 1) a wave propagation algorithm called Explicit Time Evolution (ETE), which employs optimized stencils and coefficients to cost-effectively achieve improved accuracy, and 2) a time varying phase shift (TVPS) method applied to the input seismic traces before RTM to compensate for the dispersion from time FD.
18
A highly efficient time-space-domain optimized method with Lax-Wendroff type time discretization for the scalar wave equation
TL;DR: An efficient time-space-domain optimized (OptTS) finite difference scheme to model 2D and 3D scalar wave propagation that adopts piecewise constant interpolation coefficients for several consecutive Courant number ranges, providing a powerful tool for large-scale modeling and high-resolution imaging.
11
References
An overview of full-waveform inversion in exploration geophysics
Jean Virieux, Stéphane Operto +1 more
TL;DR: This review attempts to illuminate the state of the art of FWI by building accurate starting models with automatic procedures and/or recording low frequencies, and improving computational efficiency by data-compression techniques to make 3D elastic FWI feasible.
Reverse time migration
TL;DR: In this article, the authors examined the alternative of carrying out the migration through a reverse time extrapolation, which may offer improvements over existing migration methods, especially in cases of steeply dipping structures with strong velocity contrasts.
A Portable Programming Interface for Performance Evaluation on Modern Processors
Shirley Browne, Jack Dongarra, N. Garner, G. Ho, Philip J. Mucci +4 more
- 01 Aug 2000
TL;DR: The purpose of the PAPI project is to specify a standard application programming interface for accessing hardware performance counters available on most modern microprocessors, which exist as a small set of registers that count events.
The finite-difference time-domain method for modeling of seismic wave propagation
TL;DR: In this article, a review of the recent development in finite-difference time-domain modeling of seismic wave propagation and earthquake motion is presented, which is a robust numerical method applicable to structurally complex media.
355
High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster
TL;DR: A high-order finite-element application, which performs the numerical simulation of seismic wave propagation resulting from earthquakes at the scale of a continent or from active seismic acquisition experiments in the oil industry, on a large cluster of NVIDIA Tesla graphics cards using the CUDA programming environment and non-blocking message passing based on MPI.
311