Publication: Techniques for Autotuning Algorithms on Heterogenous Platforms
Publication date
2016-02
Abstract
Current GPUs (Graphics Processing Units) can deliver high computational performance in scientific applications.
Nevertheless, programmers must use parallel algorithms suited to these architectures and must apply
optimization techniques in the implementation to achieve that performance. This thesis focuses on
designing and implementing parallel prefix algorithms on GPU architectures with little effort. To that end, we have
developed a highly optimized library called BPLG (Tuning Butterfly Processing Library for GPUs), based on a set
of building blocks that make it easy to design well-known algorithms such as the FFT, tridiagonal system solvers, the scan
operator, sorting, or signal processing. The library is designed under a tuning methodology with two stages,
identified as GPU resource analysis and operator string manipulation. Specifically, this strategy targets a
set of parallel prefix algorithms that can be represented by a set of common permutations of the digits
of each of their element indices [4], denoted Index-Digit (ID) algorithms. So far, the proposed methodology has
obtained very good results compared with state-of-the-art libraries such as CUFFT, CUSPARSE, CUDPP, or ModernGPU.
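
As an illustration of what a permutation of the digits of the element indices means, the following is a minimal sequential sketch in Python. It is not taken from BPLG, and the function names (`to_digits`, `from_digits`, `id_permute`) and the `perm` convention are assumptions made only for this example. Each index is written in a given radix, its digits are reordered, and the element moves to the index formed by the reordered digits. Reversing the digits in radix 2 gives the bit-reversal ordering used by FFTs, and rotating them gives the perfect shuffle.

```python
def to_digits(index, radix, ndigits):
    """Return the digits of `index` in `radix`, least significant digit first."""
    digits = []
    for _ in range(ndigits):
        digits.append(index % radix)
        index //= radix
    return digits


def from_digits(digits, radix):
    """Rebuild an index from digits given least significant digit first."""
    value = 0
    for d in reversed(digits):
        value = value * radix + d
    return value


def id_permute(data, radix, perm):
    """Move each element to the index obtained by permuting its index digits.

    Hypothetical convention for this sketch: perm[k] is the source digit
    position that supplies digit k of the destination index.
    """
    ndigits = len(perm)
    if len(data) != radix ** ndigits:
        raise ValueError("len(data) must equal radix ** len(perm)")
    out = [None] * len(data)
    for src, value in enumerate(data):
        d = to_digits(src, radix, ndigits)
        dst = from_digits([d[p] for p in perm], radix)
        out[dst] = value
    return out


# Digit reversal in radix 2 over 8 elements: the bit-reversal permutation.
bit_reversed = id_permute(list(range(8)), radix=2, perm=[2, 1, 0])

# Digit rotation in radix 2 over 8 elements: the perfect shuffle.
shuffled = id_permute(list(range(8)), radix=2, perm=[2, 0, 1])
```

On a GPU, each thread would typically compute the destination index of its own elements rather than looping over all of them; the loop here only makes the index arithmetic explicit.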
Description
Proceedings of the First PhD Symposium on Sustainable Ultrascale
Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016.
Keywords
CUDA, Parallel prefix algorithms, GPU, ID-algorithms, Tuning
Bibliographic citation
Carretero Pérez, Jesús; et al. (eds.). (2016). Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016). Timisoara, Romania. Universidad Carlos III de Madrid, ARCOS. Pp. 25-28.