# MetaQuant
Code for the NeurIPS 2019 paper "MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization".
The camera-ready paper is released here!
The poster is released here!
The slides are released here!
## About MetaQuant
Check `meta-quantize-tutorial.ipynb` for a description.
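At a high level, MetaQuant replaces the hand-crafted straight-through estimator (STE) with a small shared meta network that learns how gradients should pass through the non-differentiable quantization function. The sketch below only illustrates that idea; `GradientMetaNet` and its input layout are illustrative assumptions, not this repo's actual classes (the notebook and `meta_utils` contain the real implementation).

```python
import torch
import torch.nn as nn

class GradientMetaNet(nn.Module):
    """Toy stand-in for an FC-based meta network (cf. MultiFC / FC-Grad)."""
    def __init__(self, hidden=100):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, grad_q, w):
        # Per-weight inputs: the gradient w.r.t. the quantized weight and
        # the latent full-precision weight it came from.
        x = torch.stack([grad_q.flatten(), w.flatten()], dim=1)
        return self.fc(x).view_as(w)

# Instead of copying dL/dW_q onto W (as STE does), let the meta network decide.
meta = GradientMetaNet()
w = torch.randn(3, 3)       # latent full-precision weights
grad_q = torch.randn(3, 3)  # pretend gradient of the loss w.r.t. sign(w)
grad_w = meta(grad_q, w)    # learned replacement for the STE gradient copy
print(grad_w.shape)         # torch.Size([3, 3])
```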
## How to use it

### Prepare a pre-trained model
Please check here.
The following command trains a ResNet20 on CIFAR10:

```bash
python train_base_model.py -m ResNet20 -d CIFAR10
```
Alternatively, you can use the default pretrained models we provide, uploaded as `Results/model-dataset/model-dataset-pretrain.pth`.
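As a quick sanity check, you can inspect a provided checkpoint before training. The concrete path below is an assumption derived from the `Results/model-dataset/` naming pattern above; adjust it to the files actually present in your checkout.

```python
import torch

# Hypothetical path, following the Results/model-dataset/ pattern above.
ckpt_path = 'Results/ResNet20-CIFAR10/ResNet20-CIFAR10-pretrain.pth'

# Assuming the file stores a plain state_dict: list layer names and shapes.
state_dict = torch.load(ckpt_path, map_location='cpu')
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```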
### Run MetaQuant
The following command runs MetaQuant on ResNet20 with the CIFAR10 dataset, using dorefa as the forward quantization method and SGD as the optimizer. The resulting model is quantized to 1 bit ({+1, -1}) in all layers (conv and fc). The initial learning rate is set to 1e-3 and decays by a factor of 0.1 every 30 epochs: 1e-3 -> 1e-4 -> 1e-5:
```bash
CUDA_VISIBLE_DEVICES='0' python meta-quantize.py -m ResNet20 -d CIFAR10 -q dorefa -bw 1 -o SGD -meta MultiFC -hidden 100 -ad 30
```
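The decay schedule described above can be reproduced with a standard step scheduler. A minimal sketch, assuming `-ad 30` corresponds to a step decay every 30 epochs (the training loop body is omitted):

```python
import torch

# One dummy parameter so the optimizer has something to manage.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):
    # ... one epoch of meta-quantized training goes here ...
    optimizer.step()  # no-op in this sketch: gradients are None
    scheduler.step()  # lr: 1e-3 (epochs 0-29) -> 1e-4 (30-59) -> 1e-5 (60-89)
```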
## Experiments
| Model | Dataset | Forward | Backward | Optimizer | Best Acc (%) | Last 10 Acc (%) | FP Acc (%) |
|---|---|---|---|---|---|---|---|
| ResNet20 | CIFAR10 | dorefa | STE | SGD | 82.704 (0.044) | 80.745 (2.113) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | MultiFC | SGD | 89.666 (0.06) | 88.942 (0.466) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | FC-Grad | SGD | 89.618 (0.007) | 88.840 (0.291) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | LSTMFC | SGD | 89.278 (0.09) | 88.305 (0.81) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | STE | Adam | 90.222 (0.014) | 89.782 (0.172) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | MultiFC | Adam | 90.308 (0.012) | 89.941 (0.068) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | FC-Grad | Adam | 90.407 (0.02) | 89.979 (0.103) | 91.5 |
| ResNet20 | CIFAR10 | dorefa | LSTMFC | Adam | 90.310 (0.011) | 89.962 (0.068) | 91.5 |
| ResNet20 | CIFAR10 | BWN | STE | SGD | 78.550 (0.035) | 75.913 (3.495) | 91.5 |
| ResNet20 | CIFAR10 | BWN | FC-Grad | SGD | 89.500 (0.008) | 88.949 (0.231) | 91.5 |
| ResNet20 | CIFAR10 | BWN | LSTMFC | SGD | 89.890 (0.018) | 89.289 (0.212) | 91.5 |
| ResNet20 | CIFAR10 | BWN | STE | Adam | 90.470 (0.004) | 89.896 (0.182) | 91.5 |
| ResNet20 | CIFAR10 | BWN | FC-Grad | Adam | 90.426 (0.042) | 90.036 (0.109) | 91.5 |
| ResNet20 | CIFAR10 | BWN | LSTMFC | Adam | 90.494 (0.030) | 90.042 (0.098) | 91.5 |
Best Acc refers to the best test accuracy recorded during training; the mean and variance (in parentheses) over 5 runs are reported. Last 10 Acc refers to the test accuracy over the last 10 epochs, also reported over 5 runs. FP Acc is the accuracy of the full-precision model. Raw experimental data can be found in `Results/model-dataset/runs-Quant-Multiple`.
Training-curve figures:

- Comparison using dorefa as forward and SGD as optimizer
- Comparison using dorefa as forward and Adam as optimizer
- Comparison using BWN as forward and SGD as optimizer
- Comparison using BWN as forward and Adam as optimizer

The shaded band around each line shows the max/min values across the repeated trials.
## Requirements

PyTorch > 0.4
## Customization
To use MetaQuant in your own model, the following steps are required (a sketch follows this list):

- Replace all quantized layers with `meta_utils.meta_quantized_module.MetaConv` and `meta_utils.meta_quantized_module.Linear`.
- Construct a `layer_name_list` as in `models/quantized_meta_resnet.py`; it is used to access the quantized layers in MetaQuant.
- Write your training code as in `meta-quantize.py`.
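A minimal sketch of those steps follows. The constructor arguments of `MetaConv` and `Linear` are assumptions here; check `meta_utils/meta_quantized_module.py` and `models/quantized_meta_resnet.py` for the real signatures.

```python
import torch.nn as nn
from meta_utils.meta_quantized_module import MetaConv, Linear  # this repo

class MyQuantNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Step 1: quantized layers come from meta_utils, not torch.nn.
        # NOTE: the arguments below are assumptions, not the verified API.
        self.conv1 = MetaConv(3, 16, kernel_size=3, padding=1)
        self.fc = Linear(16, num_classes)
        # Step 2: the attribute names MetaQuant uses to reach each
        # quantized layer, mirroring models/quantized_meta_resnet.py.
        self.layer_name_list = ['conv1', 'fc']

    def forward(self, x):
        x = self.conv1(x).relu()
        x = x.mean(dim=[2, 3])  # global average pooling
        return self.fc(x)
```

Step 3 is then to drive this model with a training loop structured like `meta-quantize.py`.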
## Support

Please open an issue if you find any bug, and email me with any concerns about the paper.
## Citation

Please cite the paper if it helps your work:

```
@article{chen2019metaquant,
  title={MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization},
  author={Chen, Shangyu and Wang, Wenya and Pan, Sinno Jialin},
  journal={Conference on Neural Information Processing Systems},
  year={2019}
}
```