Abstract

The performance of a deep neural network (deep NN) is dependent upon a significant number of weight parameters that need to be trained which is a computational bottleneck. The growing trend of deeper architectures poses a restriction on the training and inference scheme on resource-constrained devices. Pruning is an important method for removing the deep NN's unimportant parameters and making their deployment easier on resource-constrained devices for practical applications. In this paper, we proposed a heuristics-based novel filter pruning method to automatically identify and prune the unimportant filters and make the inference process faster on devices with limited resource availability. The selection of the unimportant filters is made by a novel pruning estimator (γ). The proposed method is tested on various convolutional architectures AlexNet, VGG16, ResNet34, and datasets CIFAR10, CIFAR100, and ImageNet. The experimental results on a large-scale ImageNet dataset show that the FLOPs of the VGG16 can be reduced up to 77.47%, achieving ≈5x inference speedup. The FLOPs of a more popular ResNet34 model are reduced by 41.94% while retaining competitive performance compared to other state-of-the-art methods.

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Convolutional neural network; Deep neural network; Efficient inference; Filter pruning; Model compression and acceleration

International Standard Serial Number (ISSN)

1433-3058; 0941-0643

Document Type

Article - Journal

Document Version

Final Version

File Type

text

Language(s)

English

Rights

© 2023 Springer, All rights reserved.

Publication Date

01 Mar 2022

Share

 
COinS