Asymmetric Patch Sampling for Contrastive Learning

PyTorch implementation and pre-trained models for the paper APS: Asymmetric Patch Sampling for Contrastive Learning.

<p align="center"><img src="./images/motivation.png" width="50%" /> </p>

APS is a novel asymmetric patch sampling strategy for contrastive learning, designed to further boost appearance asymmetry for better representations. APS significantly outperforms existing self-supervised methods on both ImageNet-1K and the CIFAR datasets, e.g., a 2.5% improvement in finetuning accuracy on CIFAR100. Additionally, compared to other self-supervised methods, APS is more efficient in both memory and computation during training.
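To make the idea of giving the two views different patches concrete, here is a minimal sketch. It is not the authors' implementation. It assumes both views share a single patch grid and that the second view only draws from patches the first view did not take. The function name `sample_asymmetric_patches`, the `ratio` parameter, and the 16x16 grid (a 32x32 CIFAR image split into 2x2 patches, assuming "/2" denotes the patch size) are illustrative assumptions; the actual sampling procedure in the paper is not reproduced here.

```python
import random


def sample_asymmetric_patches(grid_size, ratio, seed=None):
    """Return two disjoint, sorted lists of patch indices.

    Hypothetical helper for illustration only: indices refer to a
    grid_size x grid_size patch grid flattened in row-major order.
    """
    num_patches = grid_size * grid_size
    if num_patches < 2:
        raise ValueError("need at least two patches to build two views")
    if not 0 < ratio <= 0.5:
        raise ValueError("ratio must be in (0, 0.5] so the views cannot overlap")

    rng = random.Random(seed)
    keep = max(1, int(num_patches * ratio))  # patches kept per view

    # Shuffle once, then slice two non-overlapping chunks.
    order = list(range(num_patches))
    rng.shuffle(order)
    view_a = sorted(order[:keep])
    view_b = sorted(order[keep:2 * keep])
    return view_a, view_b


if __name__ == "__main__":
    a, b = sample_asymmetric_patches(grid_size=16, ratio=0.25, seed=0)
    print(len(a), len(b), set(a).isdisjoint(b))
```

Because each view keeps only a fraction of the patches, the encoder sees fewer tokens per image, which is one plausible reading of the memory and computation savings mentioned above.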


Requirements


conda create -n asp python=3.9
conda activate asp
pip install -r requirements.txt

Datasets


Torchvision provides the CIFAR10 and CIFAR100 datasets. Their root paths are set to ./dataset/cifar10 and ./dataset/cifar100, respectively. The ImageNet-1K dataset is placed at ./dataset/ILSVRC.
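If the CIFAR files are not yet on disk, something like the following sketch could fetch them into the paths listed above. The dictionary keys and the helper name `download_cifar` are hypothetical, and whether the training scripts download the data on their own is not stated here, so treat this as an optional preparation step under those assumptions.

```python
from pathlib import Path

# Root paths as listed in this README; the dictionary keys are illustrative.
DATA_ROOTS = {
    "cifar10": Path("./dataset/cifar10"),
    "cifar100": Path("./dataset/cifar100"),
    "imagenet": Path("./dataset/ILSVRC"),
}

# torchvision dataset class names for the two CIFAR variants.
_CIFAR_CLASSES = {"cifar10": "CIFAR10", "cifar100": "CIFAR100"}


def download_cifar(name):
    """Download the CIFAR train split into its README root path."""
    if name not in _CIFAR_CLASSES:
        raise ValueError(f"unsupported dataset: {name!r}")

    # Imported lazily so the path table can be used without torchvision.
    from torchvision import datasets

    root = DATA_ROOTS[name]
    root.mkdir(parents=True, exist_ok=True)
    dataset_cls = getattr(datasets, _CIFAR_CLASSES[name])
    return dataset_cls(root=str(root), train=True, download=True)


if __name__ == "__main__":
    train_set = download_cifar("cifar100")
    print(len(train_set))
```

ImageNet-1K cannot be downloaded this way and has to be placed at ./dataset/ILSVRC manually.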

Pre-training


To start the APS pre-training, simply run the following commands.

• Arguments

Run APS with the ViT-Small/2 network on a single node on CIFAR100 for 1600 epochs with the following command.

python main_pretrain.py --arch='vit-small' --dataset='cifar100' --data-root='./dataset/cifar100' --nepoch=1600

Finetuning


To finetune ViT-Small/2 on CIFAR100, run the following command.

python main_finetune.py --arch='vit-small' --dataset='cifar100' --data-root='./dataset/cifar100'  \
                   --pretrained-weights='./weight/pretrain/cifar100/small_1600ep_5e-4_100.pth'
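For scripted runs, the pre-training and finetuning commands can be assembled programmatically. The sketch below is only a convenience wrapper: the helper names `pretrain_cmd`, `finetune_cmd`, and `run` are hypothetical, and it uses only the flags and values that appear in the commands above. It does not assume where pre-training writes its checkpoint, so the weights path is passed explicitly.

```python
import shlex
import subprocess
import sys


def pretrain_cmd(arch="vit-small", dataset="cifar100",
                 data_root="./dataset/cifar100", nepoch=1600):
    """Build the argv list for main_pretrain.py."""
    return [sys.executable, "main_pretrain.py",
            f"--arch={arch}", f"--dataset={dataset}",
            f"--data-root={data_root}", f"--nepoch={nepoch}"]


def finetune_cmd(pretrained_weights, arch="vit-small", dataset="cifar100",
                 data_root="./dataset/cifar100"):
    """Build the argv list for main_finetune.py."""
    return [sys.executable, "main_finetune.py",
            f"--arch={arch}", f"--dataset={dataset}",
            f"--data-root={data_root}",
            f"--pretrained-weights={pretrained_weights}"]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    line = shlex.join(cmd)
    print(line)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises if the script fails
    return line


if __name__ == "__main__":
    run(pretrain_cmd())
    run(finetune_cmd("./weight/pretrain/cifar100/small_1600ep_5e-4_100.pth"))
```

With `dry_run=True` (the default) the commands are only printed, which makes it easy to check them before launching a long run.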

Trained Model Weights & Finetune Accuracy


| Dataset | Training (#Epochs) | ViT-Tiny/2 | ViT-Small/2 | ViT-Base/2 |
| --- | --- | --- | --- | --- |
| CIFAR10 | Pretrain (1600) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 97.2% | 98.1% | 98.2% |
| | Pretrain (3200) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 97.5% | 98.2% | 98.3% |
| CIFAR100 | Pretrain (1600) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 83.4% | 84.9% | 85.9% |
| | Pretrain (3200) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 83.4% | 85.3% | 86.0% |

| Backbone | Pretrain (300 epochs) | Finetune (100 epochs) |
| --- | --- | --- |
| ViT-S/16 | download | 82.1% (download) |
| ViT-B/16 | download | 84.2% (download) |

LICENSE


This project is under the CC-BY-NC 4.0 license. See LICENSE for details.