Tuning Sparse Matrix Kernel Performance via Lightweight Signatures

Author(s)
Jain, Anirudh
Editor(s)
Associated Organization(s)
Organizational Unit
School of Computer Science
School established in 2007
Supplementary to:
Abstract
This dissertation introduces lightweight sparse-matrix pattern and occupancy signatures for automatic input-dependent tiling of sparse kernels such as sparse-dense matrix multiplication (SpMM) and sampled dense-dense matrix multiplication (SDDMM). Sparse matrix kernels are typically bandwidth-limited and can benefit from optimizations such as tiling to improve cache effectiveness. However, tiling sparse kernels is challenging. Irregular sparse matrices often exhibit intra-matrix variations in the distribution and structure of non-zeros, and a one-size-fits-all approach to tiling these kernels can result in sub-optimal performance. I present lightweight signatures, Residues, that use down-sampling techniques and bit-vectors to capture this irregularity. Residues capture the non-zero occupancy and structure of rectangular regions of the sparse-matrix plane in such a manner that combinations of them allow the evaluation of arbitrarily larger regions of the sparse matrix. I demonstrate how Residues can be used to make intelligent tiling decisions that are both data-reuse- and data-movement-aware, specifically for single-sparse-matrix kernels like SpMM and SDDMM. These tiling techniques, ResGeMM and RASSM, greedily combine residue entries to analyze different tile shapes and generate tiles with a high cache volume footprint and data-reuse potential to improve performance. The maximum cache-resident volume (temporal volume) of sparse kernels varies during execution, and statically determining it is not straightforward. I observe that the temporal volume problem for single-sparse-matrix kernels is input-dependent and can be mapped to the maximum overlapping interval analysis problem. I augment RASSM with the ability to leverage this analysis for improved tiling. This results in higher performance than static-spatial techniques and other state-of-the-art sparse tiling methods.
Finally, this dissertation demonstrates the use of signatures in tiling sparse-sparse matrix multiplication (SpGeMM) when hardware accelerators are used for the partial product reduction phase of the algorithm.
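The maximum overlapping interval problem mentioned in the abstract can be illustrated with a generic sweep over interval endpoints. The sketch below is not the dissertation's RASSM implementation. It assumes, purely for illustration, that each dense-operand row touched by a tile is represented as a half-open interval [first use, last use), so the peak number of simultaneously live intervals stands in for the temporal volume. The function name `max_overlap` and the sample intervals are hypothetical.

```python
def max_overlap(intervals):
    """Return the largest number of half-open intervals [start, end) live at once."""
    events = []
    for start, end in intervals:
        if start >= end:
            continue  # skip empty or malformed intervals
        events.append((start, 1))   # interval becomes live
        events.append((end, -1))    # interval is retired
    # Tuples sort by position first; at equal positions -1 sorts before +1,
    # so an interval ending at t does not overlap one starting at t.
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


# Hypothetical live ranges (first use, last use) of four dense rows.
ranges = [(0, 4), (1, 3), (2, 6), (5, 8)]
print(max_overlap(ranges))
```

The sweep runs in O(n log n) time for n intervals, dominated by the sort. Under the assumption above, a tiling heuristic could compare this peak against the cache capacity when deciding whether a candidate tile is too large.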
Sponsor
Date
2025-01-21
Extent
Resource Type
Text
Resource Subtype
Dissertation
Rights Statement
Rights URI