involution

Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVPR'21)

By Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, and Qifeng Chen

<p align="center"><img src="fig/involution.png" width="500" /></p>

TL;DR. involution is a general-purpose neural primitive that works across a spectrum of deep learning models on different vision tasks. In design, involution bridges convolution and self-attention: it is more efficient and effective than convolution, and simpler in form than self-attention.

<p align="center"><img src="fig/complexity.png" width="400" /><img src="fig/parameter.png" width="400" /></p>

If you find our work useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
    author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
    title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}

Getting Started

This repository is fully built upon the OpenMMLab toolkits. For each individual task, the config and model files follow the same directory organization as mmcls, mmdet, and mmseg respectively, so just copy-and-paste them to the corresponding locations to get started.

For example, to evaluate detectors:

git clone https://github.com/open-mmlab/mmdetection # and install

# copy model files
cp det/mmdet/models/backbones/* mmdetection/mmdet/models/backbones
cp det/mmdet/models/necks/* mmdetection/mmdet/models/necks
cp det/mmdet/models/dense_heads/* mmdetection/mmdet/models/dense_heads
cp det/mmdet/models/roi_heads/* mmdetection/mmdet/models/roi_heads
cp det/mmdet/models/roi_heads/mask_heads/* mmdetection/mmdet/models/roi_heads/mask_heads
cp det/mmdet/models/utils/* mmdetection/mmdet/models/utils
cp det/mmdet/datasets/* mmdetection/mmdet/datasets

# copy config files
cp det/configs/_base_/models/* mmdetection/configs/_base_/models
cp det/configs/_base_/schedules/* mmdetection/configs/_base_/schedules
cp -r det/configs/involution mmdetection/configs

# evaluate checkpoints
cd mmdetection
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

For more detailed guidance, please refer to the original mmcls, mmdet, and mmseg tutorials.
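
The copy steps above can also be scripted, which may be convenient when repeating them after a fresh clone. The following is a minimal sketch and not part of this repository: `copy_into_toolkit`, `FLAT_DIRS`, and `TREE_DIRS` are hypothetical names, and the directory lists simply mirror the `cp` commands shown for mmdetection. It assumes Python 3.8 or newer (for `dirs_exist_ok`).

```python
import shutil
from pathlib import Path

# Directories whose files are copied one level deep, as `cp src/* dst` does.
FLAT_DIRS = [
    "mmdet/models/backbones",
    "mmdet/models/necks",
    "mmdet/models/dense_heads",
    "mmdet/models/roi_heads",
    "mmdet/models/roi_heads/mask_heads",
    "mmdet/models/utils",
    "mmdet/datasets",
    "configs/_base_/models",
    "configs/_base_/schedules",
]
# Directories copied recursively, as `cp -r` does.
TREE_DIRS = ["configs/involution"]


def copy_into_toolkit(src_root, dst_root, flat_dirs=FLAT_DIRS, tree_dirs=TREE_DIRS):
    """Copy files from src_root into the same relative locations under dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for rel in flat_dirs:
        target = dst_root / rel
        target.mkdir(parents=True, exist_ok=True)
        for item in sorted((src_root / rel).iterdir()):
            if item.is_file():  # sub-directories are skipped, like plain `cp`
                copied.append(Path(shutil.copy2(item, target / item.name)))
    for rel in tree_dirs:
        # Merge the whole tree into the destination, overwriting same-named files.
        shutil.copytree(src_root / rel, dst_root / rel, dirs_exist_ok=True)
    return copied
```

Run from the root of this repository, `copy_into_toolkit("det", "mmdetection")` would perform the same copies as the shell commands; the returned list contains the files copied from the flat directories.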

Currently, we provide a memory-efficient implementation of the involution operator based on CuPy. Please install this library in advance. A customized CUDA kernel would bring further acceleration on the hardware. Any contribution from the community in this regard is welcome!
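
Since the operator depends on CuPy, a quick check before launching a job can save a failed run. The snippet below is only a sketch: `missing_packages` is a hypothetical helper, and it checks whether the package can be found by the current interpreter, not whether the installed build matches your CUDA version.

```python
import importlib.util


def missing_packages(names=("cupy",)):
    """Return the subset of `names` that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Please install before running:", ", ".join(missing))
```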

Model Zoo

The reductions in parameters/FLOPs (↓) and the gains in performance (↑) relative to the convolution baselines are shown in parentheses. Some of these checkpoints were obtained in our reimplementation runs, so their performance may differ slightly from the numbers reported in our paper. Models are trained with 64 GPUs on ImageNet, 8 GPUs on COCO, and 4 GPUs on Cityscapes.
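
To make the parenthesized numbers concrete, the sketch below recovers the baseline a table entry implies. It assumes that a percentage such as 39.5%↓ is the reduction relative to the convolution baseline and that a value such as 1.8↑ is an absolute gain in points. The function names are illustrative, and the results are approximate because the table values are rounded.

```python
def implied_baseline(value: float, pct_reduction: float) -> float:
    """Baseline cost implied by a value reported with a relative reduction (in percent)."""
    return value / (1.0 - pct_reduction / 100.0)


def baseline_score(score: float, gain: float) -> float:
    """Baseline metric implied by a score reported with an absolute gain (in points)."""
    return score - gain


# RedNet-50 on ImageNet: 15.54 M params (39.5% lower), 2.71 GFLOPs (34.1% lower).
params_base = implied_baseline(15.54, 39.5)
flops_base = implied_baseline(2.71, 34.1)
print(f"implied baseline: {params_base:.2f} M params, {flops_base:.2f} GFLOPs")

# Faster R-CNN with a convolution neck and head: 39.5 box AP (1.8 higher).
print(f"implied baseline box AP: {baseline_score(39.5, 1.8):.1f}")
```

Under these assumptions, the RedNet-50 row corresponds to a baseline of roughly 25.7 M parameters and 4.1 GFLOPs.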

Image Classification on ImageNet

| Model | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Config | Download |
|---|---|---|---|---|---|---|
| RedNet-26 | 9.23<sub>(32.8%↓)</sub> | 1.73<sub>(29.2%↓)</sub> | 75.96 | 93.19 | config | model \| log |
| RedNet-38 | 12.39<sub>(36.7%↓)</sub> | 2.22<sub>(31.3%↓)</sub> | 77.48 | 93.57 | config | model \| log |
| RedNet-50 | 15.54<sub>(39.5%↓)</sub> | 2.71<sub>(34.1%↓)</sub> | 78.35 | 94.13 | config | model \| log |
| RedNet-101 | 25.65<sub>(42.6%↓)</sub> | 4.74<sub>(40.5%↓)</sub> | 78.92 | 94.35 | config | model \| log |
| RedNet-152 | 33.99<sub>(43.5%↓)</sub> | 6.79<sub>(41.4%↓)</sub> | 79.12 | 94.38 | config | model \| log |

Before fine-tuning on the following downstream tasks, download the ImageNet pre-trained RedNet-50 weights and set the pretrained argument in det/configs/_base_/models/*.py or seg/configs/_base_/models/*.py to the local path of the downloaded checkpoint.
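
As a rough illustration of that edit, an mmdet-style base model config is a Python file that defines a `model` dict. The fragment below is a trimmed, hypothetical example and not a copy of any file in this repository: the `type` value and the checkpoint path are placeholders, and only the `pretrained` key is taken from the text above.

```python
# Hypothetical, trimmed fragment of a base model config
# (for example, one of det/configs/_base_/models/*.py); all other keys are omitted.
model = dict(
    type='FasterRCNN',  # placeholder detector type
    # Placeholder: replace with the local path of the downloaded
    # ImageNet pre-trained RedNet-50 checkpoint.
    pretrained='/path/to/rednet50.pth',
)
```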

Object Detection and Instance Segmentation on COCO

Faster R-CNN

| Backbone | Neck | Head | Style | Lr schd | Params (M) | FLOPs (G) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|---|---|
| RedNet-50-FPN | convolution | convolution | pytorch | 1x | 31.6<sub>(23.9%↓)</sub> | 177.9<sub>(14.1%↓)</sub> | 39.5<sub>(1.8↑)</sub> | config | model \| log |
| RedNet-50-FPN | involution | convolution | pytorch | 1x | 29.5<sub>(28.9%↓)</sub> | 135.0<sub>(34.8%↓)</sub> | 40.2<sub>(2.5↑)</sub> | config | model \| log |
| RedNet-50-FPN | involution | involution | pytorch | 1x | 29.0<sub>(30.1%↓)</sub> | 91.5<sub>(55.8%↓)</sub> | 39.2<sub>(1.5↑)</sub> | config | model \| log |

Mask R-CNN

| Backbone | Neck | Head | Style | Lr schd | Params (M) | FLOPs (G) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| RedNet-50-FPN | convolution | convolution | pytorch | 1x | 34.2<sub>(22.6%↓)</sub> | 224.2<sub>(11.5%↓)</sub> | 39.9<sub>(1.5↑)</sub> | 35.7<sub>(0.6↑)</sub> | config | model \| log |
| RedNet-50-FPN | involution | convolution | pytorch | 1x | 32.2<sub>(27.1%↓)</sub> | 181.3<sub>(28.5%↓)</sub> | 40.8<sub>(2.4↑)</sub> | 36.4<sub>(1.3↑)</sub> | config | model \| log |
| RedNet-50-FPN | involution | involution | pytorch | 1x | 29.5<sub>(33.3%↓)</sub> | 104.6<sub>(58.7%↓)</sub> | 39.6<sub>(1.2↑)</sub> | 35.1<sub>(0.0↑)</sub> | config | model \| log |

RetinaNet

| Backbone | Neck | Style | Lr schd | Params (M) | FLOPs (G) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| RedNet-50-FPN | convolution | pytorch | 1x | 27.8<sub>(26.3%↓)</sub> | 210.1<sub>(12.2%↓)</sub> | 38.2<sub>(1.6↑)</sub> | config | model \| log |
| RedNet-50-FPN | involution | pytorch | 1x | 26.3<sub>(30.2%↓)</sub> | 199.9<sub>(16.5%↓)</sub> | 38.2<sub>(1.6↑)</sub> | config | model \| log |

Semantic Segmentation on Cityscapes

| Method | Backbone | Neck | Crop Size | Lr schd | Params (M) | FLOPs (G) | mIoU | Config | Download |
|---|---|---|---|---|---|---|---|---|---|
| FPN | RedNet-50 | convolution | 512x1024 | 80000 | 18.5<sub>(35.1%↓)</sub> | 293.9<sub>(19.0%↓)</sub> | 78.0<sub>(3.6↑)</sub> | config | model \| log |
| FPN | RedNet-50 | involution | 512x1024 | 80000 | 16.4<sub>(42.5%↓)</sub> | 205.2<sub>(43.4%↓)</sub> | 79.1<sub>(4.7↑)</sub> | config | model \| log |
| UPerNet | RedNet-50 | convolution | 512x1024 | 80000 | 56.4<sub>(15.1%↓)</sub> | 1825.6<sub>(3.6%↓)</sub> | 80.6<sub>(2.4↑)</sub> | config | model \| log |