This paper shows how to generate code that efficiently converts sparse tensors between disparate storage formats (data layouts) like CSR, DIA, ELL, and many others. We decompose sparse tensor conversion into three logical phases: coordinate remapping, analysis, and assembly. We then develop a language that precisely describes how different formats group together and order a tensor's nonzeros in memory. This enables a compiler to emit code that performs complex reorderings (remappings) of nonzeros when converting between formats. We additionally develop a query language that can extract complex statistics about sparse tensors, and we show how to emit efficient analysis code that computes such queries. Finally, we define an abstract interface that captures how data structures for storing a tensor can be efficiently assembled given specific statistics about the tensor. Disparate formats can implement this common interface, thus letting a compiler emit optimized sparse tensor conversion code for arbitrary combinations of a wide range of formats without hard-coding for any specific one.

Our evaluation shows that our technique generates sparse tensor conversion routines with performance between 0.99 and 2.2x that of hand-optimized implementations in two widely used sparse linear algebra libraries, SPARSKIT and Intel MKL. By emitting code that avoids materializing temporaries, our technique also outperforms both libraries by between 1.4 and 3.4x for CSC/COO to DIA/ELL conversion.





  title={Automatic Generation of Efficient Sparse Tensor Format Conversion Routines},
  author={Stephen Chou and Fredrik Kjolstad and Saman Amarasinghe},
  journal={ACM SIGPLAN Conference on Programming Language Design and Implementation},