DiamondTorre Algorithm for High-Performance Wave Modeling
Efficient algorithms for the numerical modeling of physical media are discussed. When such problems are implemented with traditional algorithms, the computation rate is limited by memory bandwidth.
The numerical solution of the wave equation is considered. A finite difference scheme with a cross stencil and a high order of approximation is used.
The DiamondTorre algorithm is constructed with regard for the specifics of the GPGPU (general-purpose graphics processing unit) memory hierarchy and parallelism.
The advantages of the algorithm are a high level of data localization and the property of asynchrony, which make it possible to utilize all levels of GPGPU parallelism effectively.
The computational intensity of the algorithm is greater than that of the best traditional algorithms with stepwise synchronization. As a consequence, it becomes possible to overcome the memory bandwidth limitation mentioned above.
The algorithm is implemented with CUDA.
For the scheme with the second order of approximation, a calculation performance of 50 billion cells per second is achieved, which exceeds the result of the best traditional algorithm by a factor of 5.
GPGPU, wave modelling, LRnLA algorithms
Mathematical modelling in actual problems of science and technics
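As an illustration of the kind of scheme the abstract refers to, the following is a minimal sketch of one time step of a cross-stencil finite difference update for the wave equation. It is not the DiamondTorre algorithm and not the authors' CUDA implementation. It assumes a 2D grid, second order of approximation in space and time, a uniform Courant factor, and zero boundary values; the names `wave_step` and `courant2` are hypothetical and chosen only for this example.

```python
def wave_step(u_prev, u_curr, courant2):
    """Advance the 2D wave equation by one time step (illustrative sketch).

    u_prev, u_curr: lists of lists (ny x nx) with the field at steps n-1 and n.
    courant2: (c * dt / h) ** 2, assumed uniform over the grid.
    Boundary cells are assumed to stay at zero.
    """
    ny, nx = len(u_curr), len(u_curr[0])
    u_next = [[0.0] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # Cross stencil: the centre cell and its four axis neighbours.
            lap = (u_curr[j][i - 1] + u_curr[j][i + 1]
                   + u_curr[j - 1][i] + u_curr[j + 1][i]
                   - 4.0 * u_curr[j][i])
            # Second-order update in time from the two previous layers.
            u_next[j][i] = 2.0 * u_curr[j][i] - u_prev[j][i] + courant2 * lap
    return u_next


# Usage: a unit pulse in the centre of a 5 x 5 grid, with equal time layers.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
result = wave_step(grid, grid, 0.25)
```

Each call sweeps the whole grid once per time step, so every cell is read from and written to memory at every step. This is the stepwise-synchronized pattern whose memory bandwidth limit the abstract describes.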