1xN Pattern for Pruning Convolutional Neural Networks (paper)

PyTorch implementation of our paper accepted by TPAMI 2022, "1xN Pattern for Pruning Convolutional Neural Networks".

1) 1×N Block Pruning

<div align=center><img src="https://raw.githubusercontent.com/lmbxmu/1xN/master/images/comparison.jpg" height = "60%" width = "70%"/></div>

Requirements

Code Running

To reproduce our experiments, please use the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 \
--job_dir ./experiment/ \
--data_path [DATA_PATH] \
--pretrained_model [PRETRAIN_MODEL_PATH] \
--pr_target 0.5 \
--N 4 \
--conv_type BlockL1Conv \
--train_batch_size 256 \
--eval_batch_size 256 \
--rearrange

Here --arch can also be mobilenet_v2, mobilenet_v3_large or mobilenet_v3_small, and --N can also be 2, 8, 16 or 32.

The pre-trained models can be downloaded at MobileNet-V1, MobileNet-V2, MobileNet-V3-Large, MobileNet-V3-Small and ResNet-50.

Accuracy Performance

Table 1: Performance comparison of our 1×N block sparsity against weight pruning and filter pruning (p = 50%).

| MobileNet-V1 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| Weight Pruning | 70.764 | 89.592 | Pruned Model |
| Filter Pruning | 65.348 | 86.264 | Pruned Model |
| 1 x 2 Block | 70.281 | 89.370 | Pruned Model |
| 1 x 4 Block | 70.052 | 89.056 | Pruned Model |
| 1 x 8 Block | 69.908 | 89.027 | Pruned Model |
| 1 x 16 Block | 69.559 | 88.933 | Pruned Model |
| 1 x 32 Block | 69.541 | 88.801 | Pruned Model |

| MobileNet-V2 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| Weight Pruning | 71.146 | 89.872 | Pruned Model |
| Filter Pruning | 66.730 | 87.190 | Pruned Model |
| 1 x 2 Block | 70.233 | 89.417 | Pruned Model |
| 1 x 4 Block | 60.706 | 89.165 | Pruned Model |
| 1 x 8 Block | 69.372 | 88.862 | Pruned Model |
| 1 x 16 Block | 69.352 | 88.708 | Pruned Model |
| 1 x 32 Block | 68.762 | 88.425 | Pruned Model |

| MobileNet-V3-small | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| Weight Pruning | 66.376 | 86.868 | Pruned Model |
| Filter Pruning | 59.054 | 81.713 | Pruned Model |
| 1 x 2 Block | 65.380 | 86.060 | Pruned Model |
| 1 x 4 Block | 64.465 | 85.495 | Pruned Model |
| 1 x 8 Block | 64.101 | 85.274 | Pruned Model |
| 1 x 16 Block | 63.126 | 84.203 | Pruned Model |
| 1 x 32 Block | 62.881 | 83.982 | Pruned Model |

| MobileNet-V3-large | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| Weight Pruning | 72.897 | 91.093 | Pruned Model |
| Filter Pruning | 69.137 | 89.097 | Pruned Model |
| 1 x 2 Block | 72.120 | 90.677 | Pruned Model |
| 1 x 4 Block | 71.935 | 90.458 | Pruned Model |
| 1 x 8 Block | 71.478 | 90.163 | Pruned Model |
| 1 x 16 Block | 71.112 | 90.129 | Pruned Model |
| 1 x 32 Block | 70.769 | 89.696 | Pruned Model |

<div align=center><img src="https://github.com/lmbxmu/1xN/blob/master/images/rates.jpg" height = "60%" width = "70%"/></div>

In addition, we provide the raw data for plotting the figures above in ./raw_data_fig4. For example, run python ./raw_data_fig4/resnet50_top1.py to plot the top-1 accuracy of ResNet-50 pruned by different methods.

More links for pruned models under different pruning rates and their training logs can be found in MobileNet-V2 and ResNet-50.

Table 2: Performance comparison of our 1×N pruning against kernel-wise pruning.

| ResNet-50 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| 1x4 Block | 76.506 | 93.239 | Pruned Model |
| kernel (random) | 74.834 | 92.178 | Pruned Model |
| kernel ($\ell_1$) | 75.370 | 92.582 | Pruned Model |

Evaluate our models

To verify the performance of our pruned models, download them from the links provided above and run the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 \
--data_path [DATA_PATH] \
--conv_type DenseConv \
--evaluate [PRUNED_MODEL_PATH] \
--eval_batch_size 256

Here --arch can also be mobilenet_v2, mobilenet_v3_large or mobilenet_v3_small.
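The Top-1 and Top-5 columns in the tables are standard top-k accuracies. As a reminder of what the evaluation measures, here is a minimal sketch in plain Python. The function name `topk_accuracy` and the toy scores are illustrative only and are not taken from imagenet.py.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy example: 3 samples, 4 classes (made-up scores, not real model output).
scores = [
    [0.1, 0.7, 0.1, 0.1],  # true class 1 is ranked first
    [0.4, 0.3, 0.2, 0.1],  # true class 1 is ranked second
    [0.1, 0.2, 0.3, 0.4],  # true class 0 is ranked last
]
labels = [1, 1, 0]
top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
```

On ImageNet the same computation is done with k=1 and k=5 over the 1000-class outputs of the validation set.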

Arguments

optional arguments:
  -h, --help            show this help message and exit
  --gpus                Select gpu_id to use. default:[0]
  --data_path           The directory where the data is stored.
  --job_dir             The directory where the summaries will be stored.
  --resume              Load the model from the specified checkpoint.
  --pretrain_model      Path of the pre-trained model.
  --pruned_model        Path of the pruned model to evaluate.
  --arch                Architecture of model. For ImageNet: mobilenet_v1, mobilenet_v2, mobilenet_v3_small, mobilenet_v3_large
  --num_epochs          The num of epochs to train. default:180
  --train_batch_size    Batch size for training. default:256
  --eval_batch_size     Batch size for validation. default:100
  --momentum            Momentum for Momentum Optimizer. default:0.9
  --lr LR               Learning rate. default:1e-2
  --lr_decay_step       The interval of learning rate decay for cifar. default:100 150
  --lr_decay_freq       The frequency of learning rate decay for Imagenet. default:30
  --weight_decay        The weight decay of loss. default:4e-5
  --lr_type             lr scheduler. default: cos. optional:exp/cos/step/fixed
  --use_dali            If this parameter exists, use dali module to load ImageNet data (benefit in training acceleration).
  --conv_type           Importance criterion of filters. Default: BlockL1Conv. optional: BlockRandomConv, DenseConv
  --pr_target           Pruning rate. default:0.5
  --full                If this parameter exists, prune fully-connected layer.
  --N                   Consecutive N kernels for removal (see paper for details).
  --rearrange           If this parameter exists, filters will be rearranged (see paper for details).
  --export_onnx         If this parameter exists, export onnx model.
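To make the roles of --N, --pr_target and --conv_type BlockL1Conv more concrete, the following is a minimal NumPy sketch of 1×N block pruning. It is not the repository's implementation. It assumes that a block is N consecutive output kernels sharing one input channel, that blocks are scored by their L1 norm, and that the lowest-scoring fraction is removed across the whole layer. The function name `block_l1_mask` is hypothetical.

```python
import numpy as np


def block_l1_mask(weight, n, pr_target):
    """Binary mask that zeroes whole 1xN blocks with the smallest L1 norms.

    weight: array of shape (c_out, c_in, k_h, k_w), with c_out divisible by n.
    A block is assumed to be n consecutive output kernels at one input channel.
    """
    c_out, c_in, k_h, k_w = weight.shape
    assert c_out % n == 0, "c_out must be divisible by n in this sketch"
    # Group output channels into c_out // n groups of n consecutive kernels.
    blocks = np.abs(weight).reshape(c_out // n, n, c_in, k_h * k_w)
    # L1 norm of each block: one score per (group, input channel) pair.
    scores = blocks.sum(axis=(1, 3))
    # Assumption: rank all blocks of the layer together, drop the smallest.
    flat = scores.ravel()
    num_pruned = int(flat.size * pr_target)
    keep = np.ones(flat.size)
    keep[np.argsort(flat)[:num_pruned]] = 0.0
    keep = keep.reshape(scores.shape)
    # Expand the block-level decision back to the full weight shape.
    mask = np.repeat(keep, n, axis=0)[:, :, None, None]
    return np.broadcast_to(mask, weight.shape).copy()


# Toy layer: 8 output channels, 6 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 6, 3, 3))
mask = block_l1_mask(w, n=4, pr_target=0.5)
pruned = w * mask
```

With N = 4 and a pruning rate of 0.5, half of the 1×4 blocks in the toy layer are zeroed, and every block is either fully kept or fully removed.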

2) Filter Rearrangement

<div align=center><img src="https://github.com/lmbxmu/1xN/blob/master/images/rearrangement.jpg" height = "60%" width = "70%"/></div>

Table 3: Performance studies of our 1×N block sparsity with and without filter rearrangement (p=50%).

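| N = 2 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| w/o Rearrange | 69.900 | 89.296 | Pruned Model |
| Rearrange | 70.233 | 89.417 | Pruned Model |

| N = 4 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| w/o Rearrange | 69.521 | 88.920 | Pruned Model |
| Rearrange | 69.579 | 88.944 | Pruned Model |

| N = 8 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| w/o Rearrange | 69.206 | 88.608 | Pruned Model |
| Rearrange | 69.372 | 88.862 | Pruned Model |

| N = 16 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| w/o Rearrange | 68.971 | 88.399 | Pruned Model |
| Rearrange | 69.352 | 88.708 | Pruned Model |

| N = 32 | Top-1 Acc. (%) | Top-5 Acc. (%) | Model Link |
| --- | --- | --- | --- |
| w/o Rearrange | 68.431 | 88.315 | Pruned Model |
| Rearrange | 68.762 | 88.425 | Pruned Model |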
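The idea behind --rearrange can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: it assumes that the output filters of a layer are sorted by L1 norm in descending order, and that the input channels of the following layer are permuted in the same way so that the network computes the same function. The helper names `rearrange_filters` and `forward` are hypothetical.

```python
import numpy as np


def rearrange_filters(w_cur, w_next):
    """Sort the filters of w_cur by L1 norm and permute w_next's inputs to match.

    w_cur:  (c_out, c_in, k_h, k_w) weights of the current layer.
    w_next: (c_next, c_out, k_h, k_w) weights of the following layer.
    """
    # L1 norm of every output filter in the current layer.
    norms = np.abs(w_cur).sum(axis=(1, 2, 3))
    # Descending order is an assumption of this sketch.
    perm = np.argsort(-norms)
    # Reorder output filters here and the matching input channels downstream.
    return w_cur[perm], w_next[:, perm], perm


def forward(a, b, x):
    # With 1x1 kernels, a convolution at one pixel is a matrix product.
    hidden = np.maximum(a[:, :, 0, 0] @ x, 0.0)  # ReLU between the layers
    return b[:, :, 0, 0] @ hidden


rng = np.random.default_rng(0)
w1 = rng.standard_normal((4, 3, 1, 1))
w2 = rng.standard_normal((2, 4, 1, 1))
x = rng.standard_normal(3)

r1, r2, perm = rearrange_filters(w1, w2)
same_output = np.allclose(forward(w1, w2, x), forward(r1, r2, x))
```

Because the same permutation is applied on both sides, the toy two-layer network produces identical outputs before and after rearrangement. In a real network, per-channel parameters between the two layers (biases, batch normalization statistics) would need the same permutation.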

3) Encoding and Decoding Efficiency

<div align=center><img src="https://raw.githubusercontent.com/lmbxmu/1xN/master/images/sparse.jpg" height = "60%" width = "70%"/></div>

Performance and latency comparison

<div align=center><img src="https://github.com/lmbxmu/1xN/blob/master/images/acceleration.jpg" height = "60%" width = "70%"/></div>

Our sparse convolution implementation has been released to the TVM community.

To verify the performance of our pruned models, convert the model to ONNX format and run the following command:

python model_tune.py \
--onnx_path [ONNX_MODEL_PATH] \
--bsr 4 \
--bsc 1 \
--sparsity 0.5

For the detailed tuning settings, please refer to TVM.
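To illustrate why the 1×N pattern encodes compactly, here is a small plain-Python sketch of block sparse row (BSR) encoding and decoding. It assumes that --bsr and --bsc denote the number of rows and columns of one block, so that --bsr 4 --bsc 1 corresponds to four consecutive output channels at a single input channel. The functions `bsr_encode` and `bsr_decode` are illustrative and are not part of model_tune.py or TVM.

```python
def bsr_encode(dense, bsr, bsc):
    """Encode a 2-D list into block sparse row (BSR) form with bsr x bsc blocks."""
    rows, cols = len(dense), len(dense[0])
    assert rows % bsr == 0 and cols % bsc == 0
    data, indices, indptr = [], [], [0]
    for br in range(rows // bsr):
        for bc in range(cols // bsc):
            block = [dense[br * bsr + i][bc * bsc:(bc + 1) * bsc]
                     for i in range(bsr)]
            # Store only blocks that contain at least one non-zero value.
            if any(v != 0 for row in block for v in row):
                data.append(block)
                indices.append(bc)
        # indptr[br + 1] marks where the next block row starts in `indices`.
        indptr.append(len(indices))
    return data, indices, indptr


def bsr_decode(data, indices, indptr, shape, bsr, bsc):
    """Rebuild the dense 2-D list from its BSR encoding."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for br in range(len(indptr) - 1):
        for k in range(indptr[br], indptr[br + 1]):
            bc = indices[k]
            for i in range(bsr):
                dense[br * bsr + i][bc * bsc:(bc + 1) * bsc] = data[k][i]
    return dense


# Toy 4x4 weight matrix (rows: output channels, columns: input channels)
# in which the 1x4 blocks at input channels 1 and 3 have been pruned.
w = [
    [1, 0, 2, 0],
    [3, 0, 4, 0],
    [5, 0, 6, 0],
    [7, 0, 8, 0],
]
data, indices, indptr = bsr_encode(w, bsr=4, bsc=1)
restored = bsr_decode(data, indices, indptr, (4, 4), bsr=4, bsc=1)
```

Each surviving block needs only one column index, so the index overhead shrinks as N grows compared with storing one index per non-zero weight.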

4) Contact

For any problem regarding this code re-implementation, please contact the first author: lmbxmu@stu.xmu.edu.cn or the second author: yuxinzhang@stu.xmu.edu.cn.

For any problem regarding the sparse convolution implementation, please contact the third author: xiamenlyc@gmail.com.