Abstract
Designing more efficient, reliable, and explainable neural network architectures is critical to studies based on artificial intelligence (AI) techniques. Numerous efforts have been devoted to exploring the best structures, or structural signatures, of well-performing artificial neural networks (ANNs). Previous studies have found, through post-hoc analysis, that the best-performing ANNs surprisingly resemble biological neural networks (BNNs), suggesting that ANNs and BNNs may share common principles for achieving optimal performance in machine learning and cognitive/behavioral tasks, respectively. Inspired by this phenomenon, rather than relying on post-hoc schemes, we proactively instill organizational principles of BNNs to guide the redesign of ANNs. We leverage the Core-Periphery (CP) organization, which is widely found in human brain networks, to guide the information communication mechanism in the self-attention of the vision transformer (ViT), and we name this novel framework CP-ViT. In CP-ViT, the attention operation between nodes (image patches) is defined by a sparse graph with a Core-Periphery structure (CP graph), in which the core nodes are redesigned and reorganized to play an integrative role and serve as a hub through which the periphery nodes exchange information. In addition, a novel patch redistribution strategy enables the core nodes to screen out task-irrelevant patches and focus on the patches most relevant to the task. We evaluated the proposed CP-ViT on multiple public datasets, including a medical image dataset (INbreast) and natural image datasets (CIFAR-10, CIFAR-100, and TinyImageNet). Interestingly, by incorporating the BNN-derived principle (the CP structure) into the redesign of ViT, our CP-ViT outperforms other state-of-the-art ANNs. In general, our work advances the state of the art in three aspects: 1) it provides novel insights for brain-inspired AI, showing that principles found in BNNs can guide and improve ANN architecture design; 2) it shows that there exist sweet spots of CP graphs that lead to CP-ViTs with significantly improved performance; and 3) the core nodes in CP-ViT correspond to task-related, meaningful, and important image patches, which significantly enhances the interpretability of the trained deep model. (Code is ready for release.)
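The masked-attention mechanism the abstract describes can be made concrete with a minimal sketch. Note that this is an illustrative reading, not the paper's exact construction: the function names, the `num_core` parameter, and the specific connectivity pattern (core nodes connected to all nodes, periphery nodes connected only to core nodes and themselves) are assumptions chosen to show how a Core-Periphery graph could gate self-attention between image patches.

```python
import torch

def core_periphery_mask(num_nodes: int, num_core: int) -> torch.Tensor:
    """Build a binary Core-Periphery adjacency mask (assumed pattern).

    Core nodes attend to every node, and every node attends to the core
    nodes, while periphery-periphery attention is masked out, so the core
    acts as the hub through which periphery patches exchange information.
    """
    mask = torch.zeros(num_nodes, num_nodes, dtype=torch.bool)
    mask[:num_core, :] = True                        # core rows attend everywhere
    mask[:, :num_core] = True                        # all nodes attend to the core
    mask |= torch.eye(num_nodes, dtype=torch.bool)   # keep self-attention
    return mask

def cp_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                 num_core: int) -> torch.Tensor:
    """Self-attention restricted to a Core-Periphery graph.

    q, k, v: (batch, num_patches, dim) patch embeddings.
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5      # scaled dot-product scores
    mask = core_periphery_mask(q.size(1), num_core).to(scores.device)
    scores = scores.masked_fill(~mask, float("-inf"))  # drop non-CP-graph edges
    return torch.softmax(scores, dim=-1) @ v
```

Under this reading, the choice of which patches serve as core nodes (and the patch redistribution strategy that reassigns them toward task-relevant regions) is where the method's design lies; the sketch above only fixes the first `num_core` positions as the core for illustration.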
Recommended Citation
X. Yu et al., "Core-Periphery Principle Guided Redesign of Self-Attention in Transformers," arXiv, Mar. 2023.
Department(s)
Computer Science
Keywords and Phrases
Self-Attention, Core-Periphery, Transformers
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2025 arXiv, all rights reserved
Creative Commons Licensing
This work is licensed under a Creative Commons Attribution 4.0 License.
Publication Date
March 27, 2023
