Computer Science Faculty Research & Creative Works

Real-time Core-Periphery Guided ViT With Smart Data Layout Selection On Mobile Devices

Zhihao Shu
Xiaowei Yu, Missouri University of Science and Technology
Zihao Wu
Wenqi Jia
Yinchen Shi
Miao Yin
Tianming Liu
Dajiang Zhu
Wei Niu

Abstract

Mobile devices have become essential enablers for AI applications, particularly in scenarios that require real-time performance. The Vision Transformer (ViT) has become a cornerstone in this regard due to its high accuracy. Recent efforts have been dedicated to developing various transformer architectures that offer improved accuracy while reducing computational requirements. However, existing research primarily focuses on reducing theoretical computational complexity through methods such as local attention and model pruning, rather than considering realistic performance on mobile hardware. Although these optimizations reduce computational demands, they introduce either additional overheads related to data transformation (e.g., Reshape and Transpose) or irregular computation and data-access patterns. Both result in significant overhead on mobile devices due to their limited bandwidth, and even make latency worse than that of vanilla ViT on mobile. In this paper, we present ECP-ViT, a real-time framework that employs the core-periphery principle, inspired by brain functional networks, to guide self-attention in ViTs and enable the deployment of ViT models on smartphones. We identify the main bottleneck in transformer structures caused by data transformation and propose a hardware-friendly core-periphery guided self-attention to decrease computational demands. Additionally, we design system optimizations for intensive data transformation in pruned models. ECP-ViT, with the proposed algorithm-system co-optimizations, achieves a speedup of 4.6x to 26.9x on mobile GPUs across four datasets: STL-10, CIFAR100, TinyImageNet, and ImageNet.

Recommended Citation

Z. Shu et al., "Real-time Core-Periphery Guided ViT With Smart Data Layout Selection On Mobile Devices," Advances in Neural Information Processing Systems, vol. 37, Neural Information Processing Systems Foundation, Jan 2024.

Department(s)

Computer Science

Comments

National Science Foundation, Grant CCF-2428108

International Standard Serial Number (ISSN)

1049-5258

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

01 Jan 2024

This document is currently not available here.
