# Asymmetric Patch Sampling for Contrastive Learning
PyTorch implementation and pre-trained models for the paper "APS: Asymmetric Patch Sampling for Contrastive Learning".
<p align="center"><img src="./images/motivation.png" width="50%" /> </p>

APS is a novel asymmetric patch sampling strategy for contrastive learning that boosts the appearance asymmetry between views for better representations. APS significantly outperforms existing self-supervised methods on both ImageNet-1K and the CIFAR datasets, e.g., a 2.5% finetune accuracy improvement on CIFAR100. Additionally, compared to other self-supervised methods, APS is more memory- and computation-efficient during training.
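The core idea is that the two contrastive views see different, largely non-overlapping patch subsets of different sizes. Below is a minimal sketch of that sampling step, not the authors' exact sampler: the function name `sample_asymmetric_patches` and the sparse/dense set sizes are illustrative assumptions.

```python
import torch

def sample_asymmetric_patches(num_patches: int, n_sparse: int, n_dense: int):
    """Illustrative sketch (not the official implementation): draw two
    disjoint patch-index sets of different sizes for the two views, so the
    views share no patches and differ strongly in appearance."""
    perm = torch.randperm(num_patches)
    sparse_idx = perm[:n_sparse]                    # small set for the sparse view
    dense_idx = perm[n_sparse:n_sparse + n_dense]   # larger set, disjoint from the first
    return sparse_idx, dense_idx

# Example: a 32x32 CIFAR image with patch size 2 yields 16x16 = 256 patches.
sparse_idx, dense_idx = sample_asymmetric_patches(256, n_sparse=16, n_dense=64)
```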
## Requirements
```bash
conda create -n aps python=3.9
conda activate aps
pip install -r requirements.txt
```
## Datasets

Torchvision provides the CIFAR10 and CIFAR100 datasets. Their data roots are set to `./dataset/cifar10` and `./dataset/cifar100`, respectively. The ImageNet-1K dataset is placed at `./dataset/ILSVRC`.
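For the CIFAR datasets, torchvision can download the data into the expected roots directly; a minimal snippet:

```python
import torchvision

# Download CIFAR10/CIFAR100 into the roots expected by the training scripts.
cifar10 = torchvision.datasets.CIFAR10(root='./dataset/cifar10', train=True, download=True)
cifar100 = torchvision.datasets.CIFAR100(root='./dataset/cifar100', train=True, download=True)
```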
## Pre-training

To start APS pre-training, run the following command.
**Arguments**

- `arch`: architecture of the pre-trained model; choose from `vit-tiny`, `vit-small`, and `vit-base`.
- `dataset`: dataset used for pre-training.
- `data-root`: path to the dataset.
- `nepoch`: number of pre-training epochs.
To pre-train a ViT-Small/2 network with APS on CIFAR100 for 1600 epochs on a single node, run:

```bash
python main_pretrain.py --arch='vit-small' --dataset='cifar100' --data-root='./dataset/cifar100' --nepoch=1600
```
## Finetuning

Finetune ViT-Small/2 on CIFAR100 with the following command:

```bash
python main_finetune.py --arch='vit-small' --dataset='cifar100' --data-root='./dataset/cifar100' \
    --pretrained-weights='./weight/pretrain/cifar100/small_1600ep_5e-4_100.pth'
```
## Trained Model Weights & Finetune Accuracy

- CIFAR10 and CIFAR100

| Dataset | Training (#Epochs) | ViT-Tiny/2 | ViT-Small/2 | ViT-Base/2 |
|---|---|---|---|---|
| CIFAR10 | Pretrain (1600) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 97.2% | 98.1% | 98.2% |
| | Pretrain (3200) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 97.5% | 98.2% | 98.3% |
| CIFAR100 | Pretrain (1600) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 83.4% | 84.9% | 85.9% |
| | Pretrain (3200) | download | download | download |
| | Finetune (100) | download | download | download |
| | Accuracy | 83.4% | 85.3% | 86.0% |
- ImageNet-1K

| Backbone | Pretrain (300 epochs) | Finetune (100 epochs) |
|---|---|---|
| ViT-S/16 | download | 82.1% (download) |
| ViT-B/16 | download | 84.2% (download) |
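A downloaded checkpoint can be loaded with plain PyTorch; a minimal sketch, assuming the `.pth` file stores either a bare `state_dict` or a dict with a `'model'` key (the actual key layout may differ):

```python
import torch

# Load a pre-trained checkpoint on CPU; path taken from the finetune example above.
ckpt = torch.load('./weight/pretrain/cifar100/small_1600ep_5e-4_100.pth', map_location='cpu')
# Assumption: weights live under 'model' if the checkpoint is a wrapper dict.
state_dict = ckpt.get('model', ckpt) if isinstance(ckpt, dict) else ckpt
# model.load_state_dict(state_dict, strict=False)  # `model` is your ViT instance
```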
## LICENSE
This project is under the CC-BY-NC 4.0 license. See LICENSE for details.