Computer Science Faculty Research & Creative Works

TSM2: Optimizing Tall-And-Skinny Matrix-Matrix Multiplication on GPUs

Jieyang Chen
Nan Xiong
Xin Liang, Missouri University of Science and Technology
Dingwen Tao
Sihuan Li
Kaiming Ouyang
For the full list of authors, see the publisher's website.

Abstract

Linear algebra operations are widely used in big data analytics and scientific computing. Much work has been done on optimizing linear algebra operations on GPUs for regular-shaped input. However, little work focuses on fully utilizing GPU resources when the input is not regular-shaped. Current optimizations do not fully utilize the memory bandwidth and computing power, so they achieve only sub-optimal performance. In this paper, we propose TSM2, a performant tall-and-skinny matrix-matrix multiplication algorithm for GPUs. It focuses on optimizing linear algebra operations with non-regular-shaped input. We implement the proposed algorithm and test it on three Nvidia GPU micro-architectures: Kepler, Maxwell, and Pascal. Experiments show that TSM2 speeds up the computation by 1.1x to 3x, improves memory bandwidth utilization by 8% to 47.6%, and improves computing power utilization by 7% to 37.3% compared to current state-of-the-art works. Replacing the original matrix operations in K-means and Algorithm-Based Fault Tolerance (ABFT) with TSM2 yields speedups of up to 1.89x and 1.90x, respectively.

Recommended Citation

J. Chen et al., "TSM2: Optimizing Tall-And-Skinny Matrix-Matrix Multiplication on GPUs," Proceedings of the ACM International Conference on Supercomputing (2019, Phoenix, AZ), pp. 106 - 116, Association for Computing Machinery (ACM), Jun 2019.

The definitive version is available at https://doi.org/10.1145/3330345.3330355

Meeting Name

ACM International Conference on Supercomputing (2019: Jun. 26-28, Phoenix, AZ)

Department(s)

Computer Science

Comments

This work was supported by National Science Foundation CCF 1513201 and National Key Research and Development Programs No. 2017YFB0202100.

Keywords and Phrases

GEMM; GPU; Matrix-Matrix Multiplication; Optimization; Tall-And-Skinny

International Standard Book Number (ISBN)

978-145036079-1

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

26 Jun 2019

