UniGrad

By Chenxin Tao, Honghui Wang, Xizhou Zhu, Jiahua Dong, Shiji Song, Gao Huang, Jifeng Dai

This is the official implementation of the CVPR 2022 paper Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework.

Introduction

We propose a unified framework for different self-supervised learning methods from the perspective of gradient analysis. It turns out that previous methods share a similar gradient structure (a positive gradient plus a negative gradient) and achieve similar performance. Based on this analysis, we propose UniGrad, a concise and effective method for self-supervised learning.

(Figure: overview of the unified gradient framework and UniGrad.)

| Method | Loss formula |
| --- | --- |
| Contrastive learning | $L=\mathbb{E}_{u_1, u_2}\bigg[-\log\frac{\exp(\cos(u_1^o, u_2^t)/\tau)}{\sum_{v^t}\exp(\cos(u_1^o, v^t)/\tau)}\bigg]$ |
| Asymmetric networks | $L = \mathbb{E}_{u_1,u_2}\bigg[-\cos(h(u_1^o), u_2^t)\bigg]$ |
| Feature decorrelation | Barlow Twins: $L = \sum_{i}(W_{ii}-1)^2+\lambda\sum_{i}\sum_{j\ne i}W_{ij}^2$ <br> VICReg: $L = \frac{1}{N}\sum_{v_1^o, v_2^s}\Vert v_1^o-v_2^s\Vert_2^2 + \frac{\lambda_1}{c}\sum_{i,j\ne i}W_{ij}'^2 + \frac{\lambda_2}{c}\sum_{i}\max(0, \gamma - \mathrm{std}(v_1^o)_i)$ |
| UniGrad | $L=\mathbb{E}_{u_1, u_2}\bigg[-\cos(u_1^o, u_2^m) + \frac{\lambda}{2} \sum_{v^o} \rho_v\cos^2(u_1^o, v^o)\bigg]$ |

| Method | Gradient formula |
| --- | --- |
| Contrastive learning | $\frac{\partial L}{\partial u_1^o} = -u_2^m + \lambda \sum_v s_v v$, <br> where $s_v$ is the softmax result over similarities |
| Asymmetric networks | $\frac{\partial L}{\partial u_1^o} = -u_2^m + \lambda \sum_v \rho_v (u_1^T v) v$ |
| Feature decorrelation | $\frac{\partial L}{\partial u_1^o} = -u_2^m + \lambda \sum_v \frac{u_2^T v_2}{N} v_1$ |
| UniGrad | $\frac{\partial L}{\partial u_1^o} = -u_2^m + \lambda F u_1^o$, <br> where $F$ is the moving average of the correlation matrix |
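
To make the last row concrete, here is a minimal PyTorch sketch of one view-pair term of the UniGrad loss, written so that its gradient with respect to $u_1^o$ matches the formula above. The function name `unigrad_loss`, the default `lam`/`momentum` values, and the exact point at which the running correlation matrix is updated are our assumptions, not necessarily what this repo implements.

```python
import torch
import torch.nn.functional as F

def unigrad_loss(u1, u2_m, corr, lam=0.5, momentum=0.99):
    """One view-pair term of the UniGrad loss (sketch).

    u1:    online-branch features, shape (N, C)
    u2_m:  momentum-branch features of the other view, shape (N, C)
    corr:  running correlation matrix, shape (C, C), updated in place
    """
    # Cosine similarity reduces to a dot product after L2 normalization.
    u1 = F.normalize(u1, dim=1)
    u2_m = F.normalize(u2_m, dim=1).detach()

    # Maintain the moving-average correlation matrix (no gradient through it).
    with torch.no_grad():
        batch_corr = u1.t() @ u1 / u1.shape[0]
        corr.mul_(momentum).add_(batch_corr, alpha=1.0 - momentum)

    # Positive term: pull the two augmented views together.
    pos = -(u1 * u2_m).sum(dim=1)
    # Negative term: (lambda / 2) * u1^T corr u1; treating corr as a constant,
    # its gradient w.r.t. u1 is lambda * corr @ u1, matching the table above.
    neg = 0.5 * lam * ((u1 @ corr) * u1).sum(dim=1)
    return (pos + neg).mean()
```

Here `corr` would be a persistent `(C, C)` buffer (e.g. initialized to zeros), and the full loss would average this term over both orderings of the two views.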

Main Results and Pretrained models

| Loss | Model | Epoch | Ori. Top1 acc. | Grad. imple.<br>w/o mm enc. | Grad. imple.<br>w/ mm enc. | config | Pretrained model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| **Contrastive Learning** | | | | | | | |
| MoCo | R50 | 100 | 67.4 | 67.6 | 70.0 | - | - |
| SimCLR | R50 | 100 | 62.7 | 67.6 | 70.0 | config | - |
| **Asymmetric Networks** | | | | | | | |
| SimSiam | R50 | 100 | 68.1 | 67.9 | 70.2 | - | - |
| BYOL | R50 | 100 | 66.5 | 67.9 | 70.2 | config | model |
| **Feature Decorrelation** | | | | | | | |
| VICReg | R50 | 100 | 68.6 | 67.6 | 69.8 | - | - |
| Barlow Twins | R50 | 100 | 68.7 | 67.6 | 70.0 | config | model |
| **UniGrad** | | | | | | | |
| UniGrad | R50 | 100 | - | - | 70.3 | config | model |
| UniGrad+cutmix | R50 | 800 | - | - | 74.9 | - | - |

Note:

(1) The numbers for the original versions are taken from the original papers; all gradient versions are implemented by us;

(2) For MoCo, SimCLR and BYOL, our results improve upon the original performances by a large margin. The improvement mainly comes from a stronger projector (3 layers with a 2048 hidden dimension). Similarly, the original Barlow Twins and VICReg achieve better results because they use a 3-layer MLP with an 8192 hidden dimension as the projector.
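
For reference, such a projector would look roughly like the following in PyTorch. This is a sketch of a SimSiam-style 3-layer MLP with a 2048 hidden dimension; the exact output dimension, bias settings and BatchNorm placement in this repo may differ.

```python
import torch.nn as nn

def build_projector(in_dim=2048, hidden_dim=2048, out_dim=2048):
    """3-layer projection MLP (sketch): two Linear-BN-ReLU blocks, then Linear-BN."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim, bias=False),
        nn.BatchNorm1d(hidden_dim),
        nn.ReLU(inplace=True),
        nn.Linear(hidden_dim, hidden_dim, bias=False),
        nn.BatchNorm1d(hidden_dim),
        nn.ReLU(inplace=True),
        nn.Linear(hidden_dim, out_dim, bias=False),
        nn.BatchNorm1d(out_dim),
    )
```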

Usage

Preparation

Pretraining only requires installing PyTorch and downloading the ImageNet dataset; detailed instructions can be found here.

To run the linear evaluation, apex should also be installed, since the LARS optimizer is used.

Pretraining

To start pretraining, fill in the dataset path in the config and run the following command:

./run_pretrain.sh {CONFIG_NAME} {NUM_GPUS}

For example, the command for pretraining with UniGrad on 8 GPUs is as follows:

./run_pretrain.sh ./configs/unigrad.py 8

With 8 GPUs, a 100-epoch experiment usually takes about 38 hours to finish, as also reported in the paper.

Linear evaluation

To conduct linear evaluation, run the following command:

./run_linear.sh {PRETRAINED_CKPT} {DATA_PATH} {NUM_GPUS}

Note that for the 100-epoch pretrained model, the base learning rate is set to 0.1, following the evaluation setting of SimSiam. With our provided pretrained model, this command should give ~70% top-1 accuracy.
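
Conceptually, the linear evaluation freezes the pretrained backbone and trains only a linear classifier on top of its features. The sketch below illustrates that setup; it uses plain SGD as a stand-in for the LARS optimizer (provided via apex in this repo), and the names `build_linear_probe`/`backbone` and the 2048-d feature size (a ResNet-50 assumption) are ours.

```python
import torch
import torch.nn as nn

def build_linear_probe(backbone, feat_dim=2048, num_classes=1000):
    """Freeze the pretrained backbone and attach a trainable linear classifier (sketch)."""
    for p in backbone.parameters():
        p.requires_grad = False
    backbone.eval()  # keep BatchNorm statistics fixed during the probe

    classifier = nn.Linear(feat_dim, num_classes)
    classifier.weight.data.normal_(mean=0.0, std=0.01)
    classifier.bias.data.zero_()

    # Base learning rate 0.1 as noted above; the repo uses LARS instead of SGD.
    optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=0.0)
    return classifier, optimizer
```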

License

This project is released under the Apache 2.0 license.

Citing UniGrad

If you find UniGrad useful in your research, please consider citing:

@inproceedings{tao2022exploring,
  title={Exploring the equivalence of siamese self-supervised learning via a unified gradient framework},
  author={Tao, Chenxin and Wang, Honghui and Zhu, Xizhou and Dong, Jiahua and Song, Shiji and Huang, Gao and Dai, Jifeng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14431--14440},
  year={2022}
}