A Probabilistic Perspective on Tiling Sparse Tensor Algebra
Abstract
Sparse tensor algebra computations are often memory-bound due to irregular access patterns and low arithmetic intensity. We present D2T2 (Data-Driven Tensor Tiling), a framework that optimizes static coordinate-space tiling schemes to minimize memory traffic by identifying and leveraging relevant high-level statistics from input operands. For a given tensor algebra computation, D2T2 collects statistics from input tensors, builds a probability distribution-based model of the tensor computation, and uses it to predict traffic for various tiling configurations. It searches over tile shape and size configurations to minimize total traffic. We evaluate D2T2 against Tailors and DRT, two state of the art tiling schemes for sparse tensor algebra. We find that D2T2 achieves, on average, a 2.54× speedup over Tailors and a 1.13× lower memory bandwidth compared to DRT for sparse-sparse matrix multiplication (SpMSpM). We also achieve 1.22–48.94× lower bandwidth for SpMSpM and up to 34.31× lower bandwidth for tensor operations (TTM and MTTKRP) than conservative static tiling schemes. Unlike prior tiling techniques, D2T2 is deployable without specialized hardware support. On Opal, a 16nm sparse tensor algebra accelerator, D2T2 generated tiling configurations that achieve 1.23–3.34 × speedups compared to their original hand-tuned configurations.
BibTeX
@article{sharma2025,
title={A Probabilistic Perspective on Tiling Sparse Tensor Algebra},
author={Ritvik Sharma and Zi Xue and Nathan Zhang and Rubens Lacouture and Fredrik Kjolstad and Sara Achour and Mark Horowitz},
journal={Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture},
year={2025},
month={October}
}