Accelerated Computing research group
Benchmarking Optimization Algorithms for Auto-Tuning GPU Kernels

IEEE Transactions on Evolutionary Computation (TEVC) has published our paper on Benchmarking Optimization Algorithms for Auto-Tuning GPU Kernels. This paper uses Kernel Tuner to benchmark 16 different optimization algorithms for auto-tuning. We also introduce the concept of a Fitness Flow Graph (FFG), which captures some of the structure of the auto-tuning search space. The FFG allows us to calculate which paths lead to each point in the space. We use this to define a novel metric that measures how difficult it is for an optimization algorithm to find a near-optimal configuration in the search space.


Recent years have witnessed phenomenal growth in the application and capabilities of graphics processing units (GPUs) due to their high parallel computation power at relatively low cost. However, writing a computationally efficient GPU program (kernel) is challenging and, generally, only certain specific kernel configurations lead to significant increases in performance. Auto-tuning is the process of automatically optimizing software for highly efficient execution on a target hardware platform. Auto-tuning is particularly useful for GPU programming, as a single kernel requires retuning after code changes, for different input data, and for different architectures. However, the discrete and nonconvex nature of the search space creates a challenging optimization problem. In this work, we investigate which algorithm produces the fastest kernels when the time budget for the tuning task is varied. We conduct a survey by performing experiments on 26 different kernel spaces, from nine different GPUs, for 16 different evolutionary black-box optimization algorithms. We then analyze these results and introduce a novel metric based on the PageRank centrality concept as a tool for gaining insight into the difficulty of the optimization problem. We demonstrate that our metric correlates strongly with the observed tuning performance.
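To make the idea of a graph over the search space more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the parameter names, the synthetic `toy_runtime` function, the choice of neighbors (configurations differing in exactly one parameter), and the final ratio are all hypothetical simplifications. Each configuration gets a directed edge to every neighbor with a strictly lower runtime, PageRank is computed by power iteration, and we then look at how much of the centrality held by local minima belongs to near-optimal ones.

```python
import itertools

# Hypothetical tunable parameters of a toy kernel (illustrative values only).
tune_params = {
    "block_size_x": [32, 64, 128, 256],
    "tile_size": [1, 2, 4],
    "unroll": [0, 1],
}

def toy_runtime(config):
    """Synthetic stand-in for a measured kernel time; not real benchmark data."""
    block_size_x, tile_size, unroll = config
    runtime = 1.0 + 0.01 * abs(block_size_x - 128) + 0.3 * abs(tile_size - 2)
    if unroll == 0:
        runtime += 0.2
    if block_size_x == 32 and tile_size == 4:
        runtime -= 1.1  # artificial second basin, so a suboptimal local minimum exists
    return runtime

configs = list(itertools.product(*tune_params.values()))
runtime = {c: toy_runtime(c) for c in configs}

def neighbors(config):
    """Assumed neighborhood: configurations that differ in exactly one parameter."""
    for i, values in enumerate(tune_params.values()):
        for v in values:
            if v != config[i]:
                yield config[:i] + (v,) + config[i + 1:]

# Directed edges point from a configuration to each strictly faster neighbor.
out_edges = {c: [n for n in neighbors(c) if runtime[n] < runtime[c]] for c in configs}

def pagerank(nodes, out_edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank; sinks spread their rank uniformly."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out_edges[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if out_edges[v]:
                share = damping * rank[v] / len(out_edges[v])
                for w in out_edges[v]:
                    new_rank[w] += share
        rank = new_rank
    return rank

rank = pagerank(configs, out_edges)

# Local minima are the sinks of the graph: no neighbor is strictly faster.
minima = [c for c in configs if not out_edges[c]]
best = min(configs, key=runtime.get)

# Assumed tolerance: a minimum counts as "good" if it is within 10% of the optimum.
tolerance = 0.10
good = [c for c in minima if runtime[c] <= (1.0 + tolerance) * runtime[best]]
proportion = sum(rank[c] for c in good) / sum(rank[c] for c in minima)

print("local minima:", minima)
print("share of minima centrality held by near-optimal minima:", round(proportion, 3))
```

A random walk along these edges behaves like a randomized local search, so a low share suggests that such a search often ends in a poor local minimum. The exact graph construction and metric definition used in the paper may differ from this simplified version.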


Benchmarking optimization algorithms for auto-tuning GPU kernels
R. Schoonhoven, B. van Werkhoven, K. J. Batenburg
IEEE Transactions on Evolutionary Computation

Written by

Richard Schoonhoven

Richard is currently a PhD candidate with the Centrum Wiskunde & Informatica and the Leiden Institute of Advanced Computer Science. His work focuses on optimizing high-throughput imaging pipelines, both on the software side with gradient-based and blind optimization algorithms, and on the hardware side by tuning GPU kernels. He also works with researchers at the European Synchrotron Radiation Facility, Grenoble, France, and on the Low-Frequency Array (LOFAR) pipeline.