<p align="center"> <img src="logo.png" width="250"/> </p> <div align="center">

tests Documentation Status codecov

</div>

solo-learn

A library of self-supervised methods for unsupervised visual representation learning powered by PyTorch Lightning. We aim to provide state-of-the-art self-supervised methods in a comparable environment while also implementing training tricks. The library is self-contained, but the models can also be used outside of solo-learn. More details are available in our paper.


News


Roadmap and help needed


Methods available


Extra flavor

Backbones

Data

Evaluation

Training tricks

Logging


Requirements

Optional:


Installation

First, clone the repository.
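For example (the URL below assumes the official GitHub repository; adjust it if you work from a fork):

git clone https://github.com/vturrisi/solo-learn.git
cd solo-learn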

Then, to install solo-learn with DALI, UMAP, and/or HDF5 support, use:

pip3 install .[dali,umap,h5] --extra-index-url https://developer.download.nvidia.com/compute/redist

If no DALI/UMAP/H5 support is needed, the repository can be installed as:

pip3 install .

For local development:

pip3 install -e .[umap,h5]
# Make sure you have pre-commit hooks installed
pre-commit install

NOTE: if you are having trouble with DALI, install it by following their installation guide.

NOTE 2: consider installing Pillow-SIMD for better image-loading times when not using DALI.
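A typical way to switch to Pillow-SIMD is sketched below; see the Pillow-SIMD documentation for compiler flags and AVX2 builds:

# replace the stock Pillow with the SIMD-accelerated fork
pip3 uninstall -y pillow
pip3 install pillow-simd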

NOTE 3: solo-learn will soon be available on pip.


Training

To pretrain the backbone, follow one of the many bash files in scripts/pretrain/. We now use Hydra to handle the config files, so the common syntax looks like this:

# --config-path: path to the folder that contains the training configs
# --config-name: name of the yaml config file
# new arguments (e.g. those not defined in the yaml files) can be added with ++new_argument=VALUE
# PyTorch Lightning's arguments can be added in the same way
python3 main_pretrain.py \
    --config-path scripts/pretrain/imagenet-100/ \
    --config-name barlow.yaml
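For instance, command-line overrides look like the sketch below; the keys (max_epochs, optimizer.lr) are only illustrative, so check the yaml file for the real names:

# illustrative overrides; verify the key names in the chosen yaml config
python3 main_pretrain.py \
    --config-path scripts/pretrain/imagenet-100/ \
    --config-name barlow.yaml \
    ++max_epochs=200 \
    ++optimizer.lr=0.3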

After that, for offline linear evaluation, follow the examples in scripts/linear; to finetune the whole backbone instead, see scripts/finetune.
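As a sketch, assuming the linear configs follow the same Hydra layout as the pretraining ones (check scripts/linear/ for the exact script and config names):

python3 main_linear.py \
    --config-path scripts/linear/imagenet-100/ \
    --config-name barlow.yaml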

For k-NN evaluation and UMAP visualization, check the scripts in scripts/{knn,umap}.

NOTE: the config files aim to stay up to date and follow the recommended parameters of each paper as closely as possible, but double-check them before running.


Tutorials

Please check out our documentation and tutorials.

If you want to contribute to solo-learn, make sure you take a look at how to contribute and follow the code of conduct.


Model Zoo

All available pretrained models can be downloaded directly via the tables below or programmatically by running one of the following scripts: zoo/cifar10.sh, zoo/cifar100.sh, zoo/imagenet100.sh, and zoo/imagenet.sh.
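For example, to fetch every CIFAR-10 checkpoint listed below:

bash zoo/cifar10.sh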


Results

Note: the hyperparameters may not be optimal; we will eventually re-run the methods that currently show lower performance.

CIFAR-10

| Method | Backbone | Epochs | Dali | Acc@1 | Acc@5 | Checkpoint |
|--------|----------|--------|------|-------|-------|------------|
| All4One | ResNet18 | 1000 | :x: | 93.24 | 99.88 | :link: |
| Barlow Twins | ResNet18 | 1000 | :x: | 92.10 | 99.73 | :link: |
| BYOL | ResNet18 | 1000 | :x: | 92.58 | 99.79 | :link: |
| DeepCluster V2 | ResNet18 | 1000 | :x: | 88.85 | 99.58 | :link: |
| DINO | ResNet18 | 1000 | :x: | 89.52 | 99.71 | :link: |
| MoCo V2+ | ResNet18 | 1000 | :x: | 92.94 | 99.79 | :link: |
| MoCo V3 | ResNet18 | 1000 | :x: | 93.10 | 99.80 | :link: |
| NNCLR | ResNet18 | 1000 | :x: | 91.88 | 99.78 | :link: |
| ReSSL | ResNet18 | 1000 | :x: | 90.63 | 99.62 | :link: |
| SimCLR | ResNet18 | 1000 | :x: | 90.74 | 99.75 | :link: |
| Simsiam | ResNet18 | 1000 | :x: | 90.51 | 99.72 | :link: |
| SupCon | ResNet18 | 1000 | :x: | 93.82 | 99.65 | :link: |
| SwAV | ResNet18 | 1000 | :x: | 89.17 | 99.68 | :link: |
| VIbCReg | ResNet18 | 1000 | :x: | 91.18 | 99.74 | :link: |
| VICReg | ResNet18 | 1000 | :x: | 92.07 | 99.74 | :link: |
| W-MSE | ResNet18 | 1000 | :x: | 88.67 | 99.68 | :link: |

CIFAR-100

| Method | Backbone | Epochs | Dali | Acc@1 | Acc@5 | Checkpoint |
|--------|----------|--------|------|-------|-------|------------|
| All4One | ResNet18 | 1000 | :x: | 72.17 | 93.35 | :link: |
| Barlow Twins | ResNet18 | 1000 | :x: | 70.90 | 91.91 | :link: |
| BYOL | ResNet18 | 1000 | :x: | 70.46 | 91.96 | :link: |
| DeepCluster V2 | ResNet18 | 1000 | :x: | 63.61 | 88.09 | :link: |
| DINO | ResNet18 | 1000 | :x: | 66.76 | 90.34 | :link: |
| MoCo V2+ | ResNet18 | 1000 | :x: | 69.89 | 91.65 | :link: |
| MoCo V3 | ResNet18 | 1000 | :x: | 68.83 | 90.57 | :link: |
| NNCLR | ResNet18 | 1000 | :x: | 69.62 | 91.52 | :link: |
| ReSSL | ResNet18 | 1000 | :x: | 65.92 | 89.73 | :link: |
| SimCLR | ResNet18 | 1000 | :x: | 65.78 | 89.04 | :link: |
| Simsiam | ResNet18 | 1000 | :x: | 66.04 | 89.62 | :link: |
| SupCon | ResNet18 | 1000 | :x: | 70.38 | 89.57 | :link: |
| SwAV | ResNet18 | 1000 | :x: | 64.88 | 88.78 | :link: |
| VIbCReg | ResNet18 | 1000 | :x: | 67.37 | 90.07 | :link: |
| VICReg | ResNet18 | 1000 | :x: | 68.54 | 90.83 | :link: |
| W-MSE | ResNet18 | 1000 | :x: | 61.33 | 87.26 | :link: |

ImageNet-100

| Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
|--------|----------|--------|------|----------------|-----------------|----------------|-----------------|------------|
| All4One | ResNet18 | 400 | :heavy_check_mark: | 81.93 | - | 96.23 | - | :link: |
| Barlow Twins :rocket: | ResNet18 | 400 | :heavy_check_mark: | 80.38 | 80.16 | 95.28 | 95.14 | :link: |
| BYOL :rocket: | ResNet18 | 400 | :heavy_check_mark: | 80.16 | 80.32 | 95.02 | 94.94 | :link: |
| DeepCluster V2 | ResNet18 | 400 | :x: | 75.36 | 75.4 | 93.22 | 93.10 | :link: |
| DINO | ResNet18 | 400 | :heavy_check_mark: | 74.84 | 74.92 | 92.92 | 92.78 | :link: |
| DINO :sleepy: | ViT Tiny | 400 | :x: | 63.04 | TODO | 87.72 | TODO | :link: |
| MoCo V2+ :rocket: | ResNet18 | 400 | :heavy_check_mark: | 78.20 | 79.28 | 95.50 | 95.18 | :link: |
| MoCo V3 :rocket: | ResNet18 | 400 | :heavy_check_mark: | 80.36 | 80.36 | 95.18 | 94.96 | :link: |
| MoCo V3 :rocket: | ResNet50 | 400 | :heavy_check_mark: | 85.48 | 84.58 | 96.82 | 96.70 | :link: |
| NNCLR :rocket: | ResNet18 | 400 | :heavy_check_mark: | 79.80 | 80.16 | 95.28 | 95.30 | :link: |
| ReSSL | ResNet18 | 400 | :heavy_check_mark: | 76.92 | 78.48 | 94.20 | 94.24 | :link: |
| SimCLR :rocket: | ResNet18 | 400 | :heavy_check_mark: | 77.64 | TODO | 94.06 | TODO | :link: |
| Simsiam | ResNet18 | 400 | :heavy_check_mark: | 74.54 | 78.72 | 93.16 | 94.78 | :link: |
| SupCon | ResNet18 | 400 | :heavy_check_mark: | 84.40 | TODO | 95.72 | TODO | :link: |
| SwAV | ResNet18 | 400 | :heavy_check_mark: | 74.04 | 74.28 | 92.70 | 92.84 | :link: |
| VIbCReg | ResNet18 | 400 | :heavy_check_mark: | 79.86 | 79.38 | 94.98 | 94.60 | :link: |
| VICReg :rocket: | ResNet18 | 400 | :heavy_check_mark: | 79.22 | 79.40 | 95.06 | 95.02 | :link: |
| W-MSE | ResNet18 | 400 | :heavy_check_mark: | 67.60 | 69.06 | 90.94 | 91.22 | :link: |

:rocket: marks methods whose hyperparameters were heavily tuned.

:sleepy: ViT is very compute-intensive and unstable, so we are slowly scaling up to larger architectures and larger batch sizes. At the moment, the total batch size is 128 and we needed to use float32 precision. If you want to contribute by running it, let us know!

ImageNet

| Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint | Finetuned Checkpoint |
|--------|----------|--------|------|----------------|-----------------|----------------|-----------------|------------|----------------------|
| Barlow Twins | ResNet50 | 100 | :heavy_check_mark: | 67.18 | 67.23 | 87.69 | 87.98 | :link: | - |
| BYOL | ResNet50 | 100 | :heavy_check_mark: | 68.63 | 68.37 | 88.80 | 88.66 | :link: | - |
| MoCo V2+ | ResNet50 | 100 | :heavy_check_mark: | 62.61 | 66.84 | 85.40 | 87.60 | :link: | - |
| MAE | ViT-B/16 | 100 | :x: | - | ~81.60 (finetuned) | - | ~95.50 (finetuned) | :link: | :link: |

Training efficiency for DALI

We report the training efficiency of some methods using a ResNet18 with and without DALI (4 workers per GPU) on a server with an Intel i9-9820X CPU and two RTX 2080 Ti GPUs.

| Method | Dali | Total time for 20 epochs | Time for 1 epoch | GPU memory (per GPU) |
|--------|------|--------------------------|------------------|----------------------|
| Barlow Twins | :x: | 1h 38m 27s | 4m 55s | 5097 MB |
| | :heavy_check_mark: | 43m 2s | 2m 10s (56% faster) | 9292 MB |
| BYOL | :x: | 1h 38m 46s | 4m 56s | 5409 MB |
| | :heavy_check_mark: | 50m 33s | 2m 31s (49% faster) | 9521 MB |
| NNCLR | :x: | 1h 38m 30s | 4m 55s | 5060 MB |
| | :heavy_check_mark: | 42m 3s | 2m 6s (64% faster) | 9244 MB |

Note: the GPU memory increase does not scale with the model; rather, it scales with the number of workers.


Citation

If you use solo-learn, please cite our paper:

@article{JMLR:v23:21-1155,
  author  = {Victor Guilherme Turrisi da Costa and Enrico Fini and Moin Nabi and Nicu Sebe and Elisa Ricci},
  title   = {solo-learn: A Library of Self-supervised Methods for Visual Representation Learning},
  journal = {Journal of Machine Learning Research},
  year    = {2022},
  volume  = {23},
  number  = {56},
  pages   = {1-6},
  url     = {http://jmlr.org/papers/v23/21-1155.html}
}