Liyang Liu, Shilong Zhang, Zhanghui Kuang, Aojun Zhou, Jing-Hao Xue, Xinjiang Wang, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang

Abstract
Network compression has been widely studied because it reduces memory and computation cost during inference. However, previous methods seldom deal with complicated structures such as residual connections, group/depth-wise convolutions and feature pyramid networks, where the channels of multiple layers are coupled and need to be pruned simultaneously. In this paper, we present a general channel pruning approach that can be applied to various complicated structures. In particular, we propose a layer grouping algorithm to find coupled channels automatically. We then derive a unified metric based on Fisher information to evaluate the importance of a single channel and of coupled channels. Moreover, we find that inference speedup on GPUs correlates more with the reduction of memory than of FLOPs, and we therefore use the memory reduction of each channel to normalize its importance. Our method can be used to prune any structure, including those with coupled channels. We conduct extensive experiments on various backbones, including the classic ResNet and ResNeXt, the mobile-friendly MobileNetV2, and the NAS-based RegNet, on both image classification and object detection, the latter of which is under-explored in pruning. Experimental results validate that our method can effectively prune sophisticated networks, boosting inference speed without sacrificing accuracy.
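As a rough illustration of the Fisher-information importance described above, the sketch below wraps a single convolution with a per-channel mask and accumulates the squared gradient of the loss with respect to that mask over data. Names such as `MaskedConv` and `fisher_channel_importance` are illustrative placeholders, not the authors' released implementation, and the per-batch accumulation is only an approximation of the per-sample empirical Fisher average.

```python
import torch
import torch.nn as nn

class MaskedConv(nn.Module):
    """Wraps a convolution with an all-ones per-channel mask used only for scoring.

    The mask must sit in the model's forward path so that gradients of the
    loss with respect to it reflect each output channel's contribution.
    """

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        self.mask = nn.Parameter(torch.ones(conv.out_channels))

    def forward(self, x):
        out = self.conv(x)
        # Broadcast the mask over (N, C, H, W); d(loss)/d(mask[c]) measures
        # the sensitivity of the loss to scaling (and hence removing) channel c.
        return out * self.mask.view(1, -1, 1, 1)


def fisher_channel_importance(masked_conv, model, loader, loss_fn, device="cpu"):
    """Accumulate an empirical-Fisher score per channel of `masked_conv`.

    Uses batch-level gradients as a cheap approximation of the per-sample
    average (1/2N) * sum_n g_{n,c}^2.
    """
    fisher = torch.zeros_like(masked_conv.mask)
    n_samples = 0
    model.eval()
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        model.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        fisher += masked_conv.mask.grad.detach() ** 2
        masked_conv.mask.grad = None
        n_samples += inputs.size(0)
    return fisher / (2.0 * n_samples)


# Usage sketch: replace a conv inside the model with its masked wrapper,
#   model.layer1[0].conv1 = MaskedConv(model.layer1[0].conv1)
# then score its channels with fisher_channel_importance(...).
```

In the full method, channels identified as coupled by the layer-grouping step would share a single score, and each score would further be divided by the memory freed when the channel is removed, following the normalization described in the abstract.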
Benchmarks
| Benchmark | Methodology | Accuracy (%) | GFLOPs | Params (M) |
|---|---|---|---|---|
| network-pruning-on-imagenet | RegX-1.6G | 77.97 | 1.588 | 9.3 |
| network-pruning-on-imagenet | MobileNetV2 | 73.42 | 0.29 | 3.31 |