Accelerated Computing research group
Best Paper Award at ICPP'25: Efficient Construction of Large Search Spaces for Auto-Tuning

September 8, 2025 — I am very excited to share that our paper “Efficient Construction of Large Search Spaces for Auto-Tuning” received the Best Paper Award at the 54th International Conference on Parallel Processing (ICPP) in San Diego!

ICPP is one of the oldest continuously running conferences in computer science, so it was an honor not only to present there, but also to have my presentation directly after the opening keynote by Turing Award laureate Jack Dongarra.

Floris-Jan Willemsen presenting at ICPP 2025

In this paper, we address a critical bottleneck in automatic performance tuning: the time it takes to build the search space of possible code versions to test. Existing methods are often slow, limited and lack formal underpinnings. We propose a principled reformulation of search space construction as a Constraint Satisfaction Problem (CSP), enabling us to leverage mature solver technologies. Our experimental evaluation demonstrates dramatic performance improvements and outperforms both unoptimized CSP solvers and leading chain-of-trees-based frameworks by substantial margins. By eliminating this scalability barrier, our work enables efficient exploration of previously intractable tuning problems and provides a drop-in solution for a broad class of applications.

I am grateful to the program committee, in particular Ilkay Altintas, Sunita Chandrasekaran and Sameer Shende, for recognizing our contribution with this award.

Abstract

Automatic performance tuning, or auto-tuning, accelerates high-performance codes by exploring vast spaces of code variants. However, due to the large number of possible combinations and complex constraints, constructing these search spaces can be a major bottleneck. Real-world applications have been encountered where the search space construction takes minutes to hours or even days. Current state-of-the-art techniques for search space construction, such as chain-of-trees, lack a formal foundation and only perform adequately on a specific subset of search spaces.

We show that search space construction for constraint-based auto-tuning can be reformulated as a Constraint Satisfaction Problem (CSP). Building on this insight with a CSP solver, we develop a runtime parser that translates user-defined constraint functions into solver-optimal expressions, optimize the solver to exploit common structures in auto-tuning constraints, and integrate these and other advances in open-source tools. These contributions substantially improve performance and accessibility while preserving flexibility.

We evaluate our approach using a diverse set of benchmarks, demonstrating that our optimized solver reduces construction time by four orders of magnitude versus brute-force enumeration, three orders of magnitude versus an unoptimized CSP solver, and one to two orders of magnitude versus leading auto-tuning frameworks built on chain-of-trees. We thus eliminate a critical scalability barrier for auto-tuning and provide a drop-in solution that enables the exploration of previously unattainable problem scales in auto-tuning and related domains.

Citation

F.J. Willemsen, R.V. van Nieuwpoort, B. van Werkhoven “Efficient Construction of Large Search Spaces for Auto-Tuning” 54th International Conference on Parallel Processing (ICPP ‘25 Best Paper Award), September 08–11, 2025, San Diego, CA, USA preprint

Written by

Floris-Jan Willemsen

Floris-Jan is a PhD Candidate at Leiden University and the Netherlands eScience Center. His research focusses on intelligent, automated optimization of GPU software.