Accelerated Auto-Tuning of GPU Kernels for Tensor Computations
Published in the Proceedings of the 38th ACM International Conference on Supercomputing (ICS 2024), Kyoto, Japan, June 4–7, 2024.
Citation:
Li, Chendi, Yufan Xu, Sina Mahdipour Saravani, and Ponnuswamy Sadayappan. "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations." In Proceedings of the 38th ACM International Conference on Supercomputing, pp. 549-561. 2024.
Download here: https://dl.acm.org/doi/pdf/10.1145/3650200.3656626