# SogCLR
This is the official implementation of the paper "Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance". The proposed algorithm trains self-supervised models effectively with small batch sizes. The code can be run on TPUs or GPUs.
## Requirements
```
tensorflow==2.7.0
tensorflow-datasets
```
## Datasets
ImageNet-S is a subset of ImageNet-1K consisting of 100 classes randomly selected from the original 1,000 classes. You can follow the instructions here to convert the dataset to TFRecord format. To run SogCLR, we need image IDs to track the running statistics, so each TFRecord should contain the following features: image, label, id, e.g.:
```python
features = tfds.features.FeaturesDict({
    'image/encoded': tfds.features.Image(encoding_format='jpeg'),
    'image/class/label': tfds.features.ClassLabel(names_file=names_file),
    'image/ID': tf.int64,  # e.g., 0, 1, 2, 3, 4, ..., N
})
```
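For ImageNet the provided `imagenet.py` below covers this, but if you are building TFRecords for your own data, a minimal sketch of attaching sequential per-example IDs during serialization might look like the following. This is illustrative only, not part of the repo; `image_paths` and `labels` are hypothetical placeholders.

```python
import tensorflow as tf

# Illustrative sketch (not from the repo): write TFRecords with a
# sequential integer ID per image so SogCLR can index its per-sample
# running statistics. `image_paths` and `labels` are hypothetical inputs.
def serialize_example(image_bytes, label, example_id):
    feature = {
        'image/encoded': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        'image/class/label': tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
        'image/ID': tf.train.Feature(
            int64_list=tf.train.Int64List(value=[example_id])),
    }
    return tf.train.Example(
        features=tf.train.Features(feature=feature)).SerializeToString()

with tf.io.TFRecordWriter('train-00000-of-00001.tfrecord') as writer:
    for example_id, (path, label) in enumerate(zip(image_paths, labels)):
        writer.write(serialize_example(
            tf.io.read_file(path).numpy(), label, example_id))
```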
Copy the provided `/code/imagenet.py` to your local directory under `tensorflow_datasets`:

```
cp imagenet.py /usr/local/lib/python3.7/dist-packages/tensorflow_datasets/image_classification/imagenet.py
```
Specify `num_classes` and `data_dir`:

- ImageNet-S: `--num_classes=100 --data_dir=gs://<path-to-tensorflow-dataset>`
- ImageNet-1K: `--num_classes=1000 --data_dir=gs://<path-to-tensorflow-dataset>`
## Pretraining
To pretrain a ResNet-50 on ImageNet-1K using SogCLR with the Dynamic Contrastive Loss on TPUs, set `DCL_mode=True` and `gamma=0.9`, then run the following command:
```
python run.py --train_mode=pretrain \
  --train_batch_size=512 --train_epochs=800 --temperature=0.1 \
  --learning_rate=0.075 --learning_rate_scaling=sqrt --weight_decay=1e-6 \
  --dataset=imagenet2012 --image_size=224 --eval_split=validation --num_classes=1000 \
  --num_proj_layers=2 \
  --DCL_mode=True --gamma=0.9 \
  --data_dir=gs://<path-to-tensorflow-dataset> \
  --model_dir=gs://<path-to-store-checkpoints> \
  --use_tpu=True
```
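To give intuition for what `DCL_mode` and `gamma` control, below is a minimal sketch of the dynamic contrastive loss idea, assuming L2-normalized embeddings of two views and a `tf.Variable` `u` holding one running statistic per training image (indexed by the image IDs described above). All names here are illustrative, not the repo's actual API; see the paper and the loss implementation in this repo for the exact form.

```python
import tensorflow as tf

# A minimal sketch of the dynamic contrastive loss (DCL) idea, not the
# repo's actual implementation. A per-image moving average u_i of the mean
# exponentiated negative similarity is kept across iterations (indexed by
# the image IDs), and negatives are re-weighted by exp(sim/t)/u_i with the
# gradient stopped through the weights.
def dcl_loss(z1, z2, ids, u, temperature=0.1, gamma=0.9):
    """z1, z2: L2-normalized embeddings of two views, shape [B, d].
    ids: per-image int IDs, shape [B]. u: tf.Variable of shape [N, 1]."""
    batch_size = tf.shape(z1)[0]
    logits = tf.matmul(z1, z2, transpose_b=True) / temperature  # [B, B]
    pos_mask = tf.eye(batch_size)          # positives on the diagonal
    neg_mask = 1.0 - pos_mask
    exp_neg = tf.exp(logits) * neg_mask    # exponentiated negative similarities
    # Batch estimate of the per-anchor mean over negatives, then the
    # moving-average update of the running statistic u_i.
    g_hat = tf.reduce_sum(exp_neg, axis=1, keepdims=True) / tf.cast(
        batch_size - 1, tf.float32)
    u_new = (1.0 - gamma) * tf.gather(u, ids) + gamma * g_hat
    u.scatter_nd_update(ids[:, None], tf.stop_gradient(u_new))
    # Re-weight negatives; no gradient flows through the weights themselves.
    weights = tf.stop_gradient(exp_neg / u_new)
    pos_logits = tf.reduce_sum(logits * pos_mask, axis=1)
    return tf.reduce_mean(
        -pos_logits + tf.reduce_sum(weights * logits * neg_mask, axis=1))
```

Intuitively, the running statistic `u_i` accumulates information about negatives across iterations, which is what allows small batches to approximate the global contrastive loss.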
For the baseline, you can set `DCL_mode=False` to use SimCLR. To use a 3-layer MLP projection head, set `num_proj_layers=3`. To use larger ResNet models, set `width_multiplier=1,2,4` and/or `resnet_depth=50,101,152`. To run on GPUs, set `use_tpu=False`.
## Linear Evaluation
By default, `lineareval_while_pretraining=True`, which trains the linear classifier with a `stop_gradient` operator during pretraining; this is similar to performing linear evaluation after pretraining [Ref]. An example command for linear evaluation is as follows:
```
python run.py --mode=train_then_eval --train_mode=finetune \
  --fine_tune_after_block=4 --zero_init_logits_layer=True \
  --num_proj_layers=0 --ft_proj_selector=-1 \
  --global_bn=False --optimizer=momentum --learning_rate=0.1 \
  --learning_rate_scaling=linear --weight_decay=0 \
  --train_epochs=90 --train_batch_size=4096 --warmup_epochs=0 \
  --dataset=imagenet2012 --image_size=224 --eval_split=validation \
  --data_dir=gs://<path-to-tensorflow-dataset> \
  --model_dir=gs://<path-to-store-checkpoints> \
  --checkpoint=gs://<path-to-store-checkpoint>/ckpt-xxxx \
  --use_tpu=True
```
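For intuition, the `stop_gradient` trick behind `lineareval_while_pretraining` can be sketched as below. This is an illustrative example with hypothetical names, not the repo's model code: the linear classifier trains on frozen features, so its supervised gradient never reaches the encoder being pretrained.

```python
import tensorflow as tf

# Illustrative sketch of the stop_gradient linear-eval trick. The linear
# head is optimized with cross-entropy on the encoder's features, but
# tf.stop_gradient blocks that gradient from updating the encoder, so the
# result approximates linear evaluation run after pretraining.
class LinearEvalHead(tf.keras.layers.Layer):
    def __init__(self, num_classes):
        super().__init__()
        self.fc = tf.keras.layers.Dense(num_classes)

    def call(self, features):
        # Freeze the features as seen by the classifier; only self.fc
        # receives gradients from the supervised loss.
        return self.fc(tf.stop_gradient(features))
```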
## Citation
If you find this repo helpful, please cite the following paper:
```
@inproceedings{yuan2022provable,
  title={Provable stochastic optimization for global contrastive learning: Small batch does not harm performance},
  author={Yuan, Zhuoning and Wu, Yuexin and Qiu, Zi-Hao and Du, Xianzhi and Zhang, Lijun and Zhou, Denny and Yang, Tianbao},
  booktitle={International Conference on Machine Learning},
  pages={25760--25782},
  year={2022},
  organization={PMLR}
}
```