A Sparse Iteration Space Transformation Framework for Sparse Tensor Algebra
Speaker: Ryan Senanayake, Senior Engineer, Reservoir Labs, Inc.
Date: March 4, 2021
We address the problem of optimizing sparse tensor algebra in a compiler and show how to define the standard loop transformations split, collapse, and reorder on sparse iteration spaces. We further demonstrate that derived iteration spaces can tile both the universe of coordinates and the subset of nonzero coordinates. We implement these concepts by extending the sparse iteration theory implementation in the TACO system. The associated scheduling API can be used by performance engineers, or it can be the target of an automatic scheduling system. We outline one heuristic autoscheduling system, but other systems are possible. Using the scheduling API, we show how to optimize mixed sparse-dense tensor algebra expressions on CPUs and GPUs. Our results show that the sparse transformations are sufficient to generate code with performance competitive with hand-optimized implementations from the literature, while generalizing to all of tensor algebra.
DOI: https://doi.org/10.1145/3428226
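The distinction between tiling the universe of coordinates and tiling the subset of nonzero coordinates can be illustrated with a small sketch. The code below is not TACO output; it is a hand-written Python analogue of the two ways to split the loops of an SpMV kernel (y = A*x, with A stored in CSR): a coordinate-space split tiles the row dimension, while a position-space split tiles the array of stored nonzeros. The function names, block sizes, and CSR variable names (`pos`, `crd`, `vals`) are illustrative choices, not part of the talk.

```python
import bisect

def spmv_coord_split(pos, crd, vals, x, nrows, block=2):
    """Coordinate-space split: tile the *universe* of rows.

    The row loop i is split into (i0, i1); each tile covers `block`
    rows regardless of how many nonzeros those rows hold.
    """
    y = [0.0] * nrows
    for i0 in range(0, nrows, block):
        for i1 in range(i0, min(i0 + block, nrows)):
            for p in range(pos[i1], pos[i1 + 1]):  # nonzeros of row i1
                y[i1] += vals[p] * x[crd[p]]
    return y

def spmv_pos_split(pos, crd, vals, x, nrows, block=3):
    """Position-space split: tile the *subset* of stored nonzeros.

    The position loop p is split into (p0, p1); each tile covers
    `block` nonzeros, so work per tile stays balanced even when the
    sparsity pattern is skewed. Each position must recover the row
    it belongs to by searching the CSR row-pointer array.
    """
    y = [0.0] * nrows
    nnz = len(vals)
    for p0 in range(0, nnz, block):
        for p in range(p0, min(p0 + block, nnz)):
            i = bisect.bisect_right(pos, p) - 1  # row owning position p
            y[i] += vals[p] * x[crd[p]]
    return y
```

Both functions compute the same result; the difference is which iteration space the split operates over, which is precisely the choice the talk's derived iteration spaces expose to the scheduler.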
Ryan Senanayake is a Senior Engineer at Reservoir Labs working on the R-Stream polyhedral compiler team. His current projects include improving R-Stream's support for deep learning models, sparse computations, GPU code generation, and exascale programming models. Ryan received his B.S. and M.E. degrees in Computer Science and Engineering from MIT in 2019 and 2020. While at MIT, he performed research in the compilers group under Professor Saman Amarasinghe, in collaboration with Fred Kjolstad. As part of his thesis project, he designed and implemented a comprehensive optimization framework and GPU backend for the Sparse Tensor Algebra Compiler (TACO), enabling the generation of sparse tensor algebra kernels competitive with hand-written kernels on both CPUs and GPUs. His thesis was awarded first place in the 2020 Charles and Jennifer Johnson Computer Science Thesis Award and was published as an OOPSLA'20 paper.