REPTILE: Performant Tiling of Recurrences
Abstract
We introduce REPTILE, a compiler that performs tiling optimizations for programs expressed as mathematical recurrence equations. REPTILE recursively decomposes a recurrence program into a set of unique tiles and then simplifies each into a different set of recurrences. Given declarative user specifications of recurrence equations, optimizations, and optional mappings of recurrence subexpressions to external libraries calls, REPTILE generates C code that composes compiler-generated loops with calls to external hand-optimized libraries. We show that for direct linear solvers expressible as recurrence equations, the generated C code matches and often exceed the performance of standard hand-optimized libraries. We evaluate REPTILE's generated C code against hand-optimized implementations of linear solvers in Intel MKL, as well as two non-solver recurrences from bioinformatics: Needleman-Wunsch and Smith-Waterman. When the user provides good tiling specifications, REPTILE achieves parity with MKL, achieving between 0.79-1.27x speedup for the LU decomposition, 0.97-1.21x speedup for the Cholesky decomposition, 1.61x-2.72x for lower triangular matrix inversion, 1.01-1.14x speedup for triangular solve with multiple right-hand sides, and 1.14-1.73x speedup over handwritten implementations of the bioinformatics recurrences.
BibTeX
@article{usman2025,
title={REPTILE: Performant Tiling of Recurrences},
author={Muhammad Usman Tariq and Shiv Sundram and Fredrik Kjolstad},
journal={Proceedings of the ACM on Programming Languages},
volume={9},
issue={OOPSLA},
year={2025},
month={October}
}