Electrical and Computer Engineering Faculty Research & Creative Works

A Comprehensive Survey on Model Compression and Acceleration

Alternative Title

A comprehensive survey on model-based compression and acceleration

Tejalal Choudhary
Vipul Mishra
Anurag Goswami
Jagannathan Sarangapani, Missouri University of Science and Technology

Abstract

In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvement in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. The trained DL models for these complex tasks are large, which makes them difficult to deploy on resource-constrained devices. For instance, the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB in size. Resource-constrained devices such as mobile phones and Internet of Things devices have limited memory and computation power. Real-time applications require the trained models to be deployed on such devices. Popular convolutional neural network models have millions of parameters, which increases the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, with the least possible compromise in model accuracy. Retaining the same accuracy after compressing the model is a challenging task. To address this challenge, many researchers have suggested different techniques for model compression and acceleration in the last couple of years. In this paper, we present a survey of various techniques suggested for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.

Recommended Citation

T. Choudhary et al., "A Comprehensive Survey on Model Compression and Acceleration," Artificial Intelligence Review, vol. 53, pp. 5113-5155, Springer, Oct 2020.

The definitive version is available at https://doi.org/10.1007/s10462-020-09816-7

Department(s)

Electrical and Computer Engineering

Research Center/Lab(s)

Intelligent Systems Center

Keywords and Phrases

CNN; Deep Learning; Efficient Neural Networks; Machine Learning; Model Compression and Acceleration; Resource-Constrained Devices; RNN

International Standard Serial Number (ISSN)

0269-2821; 1573-7462

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

01 Oct 2020
