SA2VP: Spatially Aligned-and-Adapted Visual Prompt

Paper: https://arxiv.org/abs/2312.10376


This repository contains the official PyTorch implementation for SA2VP.

Environment settings

Our code builds on the BEiT framework from https://github.com/microsoft/unilm/tree/master/beit

We use the following datasets for evaluation:

https://github.com/KMnP/vpt (FGVC)

https://github.com/dongzelian/SSF (VTAB-1k)

https://github.com/shikiw/DAM-VP (HTA)

This code is tested with Python 3.7.13, PyTorch 1.12.1, and CUDA 11.4.

The required dependencies are listed in requirement.txt, which we provide for reference.
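
To compare a local setup against the tested versions above, a small check along these lines may help. This is a sketch and not part of the repository: the names `TESTED`, `installed_versions`, and `report` are hypothetical. Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which can differ from the system CUDA toolkit, so a "differs" result is informational only.

```python
import sys

# Versions this README reports testing with.
TESTED = {"python": "3.7.13", "torch": "1.12.1", "cuda": "11.4"}


def installed_versions():
    """Collect installed Python, PyTorch and CUDA versions (None if unavailable)."""
    found = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except ImportError:
        return found  # PyTorch is not installed in this environment
    # Drop a local build tag such as "+cu113" from the version string.
    found["torch"] = torch.__version__.split("+")[0]
    # CUDA version PyTorch was built with; None for CPU-only builds.
    found["cuda"] = torch.version.cuda
    return found


def report():
    """Return one human-readable line per component."""
    found = installed_versions()
    lines = []
    for name, tested in TESTED.items():
        status = "ok" if found[name] == tested else "differs"
        lines.append(
            "{}: installed={} tested={} ({})".format(name, found[name], tested, status)
        )
    return lines
```

Calling `print("\n".join(report()))` prints the comparison. Other versions may work as well; the README only states which ones were tested.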

Structure of this repo

│SA2VP/
├──data/
│   ├──fgvc/
│   │   ├──CUB_200_2011/
│   │   ├──OxfordFlower/
│   │   ├──Stanford-cars/
│   │   ├──Stanford-dogs/
│   │   ├──nabirds/
│   ├──vtab-1k/
│   │   ├──caltech101/
│   │   ├──cifar/
│   │   ├──.......
├──backbone_ckpt/
│   ├──imagenet21k_ViT-B_16.npz
│   ├──swin_base_patch4_window7_224_22k.pth
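
Before launching an experiment, it can be useful to confirm that the datasets and backbone checkpoints sit where the tree above expects them. The following is a minimal sketch, not part of the repository. The helper name `missing_entries` is hypothetical, and only the entries the tree names explicitly are checked (the elided VTAB-1k datasets are left out).

```python
from pathlib import Path

# Directories and files named explicitly in the tree above.
EXPECTED_DIRS = [
    "data/fgvc/CUB_200_2011",
    "data/fgvc/OxfordFlower",
    "data/fgvc/Stanford-cars",
    "data/fgvc/Stanford-dogs",
    "data/fgvc/nabirds",
    "data/vtab-1k/caltech101",
    "data/vtab-1k/cifar",
]
EXPECTED_FILES = [
    "backbone_ckpt/imagenet21k_ViT-B_16.npz",
    "backbone_ckpt/swin_base_patch4_window7_224_22k.pth",
]


def missing_entries(root):
    """Return the expected relative paths that are absent under `root`."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

For example, `missing_entries("SA2VP")` returns an empty list when everything is in place, and otherwise the list of paths that still need to be downloaded or moved.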

Experiment steps

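| | CUB | Nabirds | Flower | DOG | CAR |
|---|---|---|---|---|---|
| inter-dim | 16 | 32 | 8 | 32 | 64 |
| inter-weight | 0.1 | 0.1 | 0.1 | 0.1 | 1.5 |
| batch size | 64/128 | 64/128 | 64/128 | 64/128 | 64/128 |

| | vtab-Natural | vtab-Special | vtab-Structure | HTA |
|---|---|---|---|---|
| inter-dim | 8 | 16 | 32 | 64 |
| inter-weight | 0.1 | 1.5 | 1.5 | 0.1 |
| batch size | 40/64 | 40 | 40 | 64/128 |

| | vtab-Natural | vtab-Special | vtab-Structure |
|---|---|---|---|
| inter-dim | 8 | 8 | 8 |
| inter-weight | 0.1/0.5 | 0.5/1.5 | 1.5 |
| batch size | 40/64 | 40 | 40 |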
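
One way to keep these settings at hand when scripting runs is a small lookup table. The sketch below transcribes the first two tables above (the five FGVC datasets, then the VTAB groups and HTA). The second set of VTAB values is left out because this section does not say what distinguishes it. The key names `inter_dim`, `inter_weight`, and `batch_size` and the helper `get_settings` are hypothetical and are not the repository's actual argument names.

```python
# Per-dataset settings transcribed from the first two tables above.
# Slash-separated table entries are kept as tuples of alternatives.
SETTINGS = {
    "CUB":            {"inter_dim": 16, "inter_weight": 0.1, "batch_size": (64, 128)},
    "Nabirds":        {"inter_dim": 32, "inter_weight": 0.1, "batch_size": (64, 128)},
    "Flower":         {"inter_dim": 8,  "inter_weight": 0.1, "batch_size": (64, 128)},
    "DOG":            {"inter_dim": 32, "inter_weight": 0.1, "batch_size": (64, 128)},
    "CAR":            {"inter_dim": 64, "inter_weight": 1.5, "batch_size": (64, 128)},
    "vtab-Natural":   {"inter_dim": 8,  "inter_weight": 0.1, "batch_size": (40, 64)},
    "vtab-Special":   {"inter_dim": 16, "inter_weight": 1.5, "batch_size": (40,)},
    "vtab-Structure": {"inter_dim": 32, "inter_weight": 1.5, "batch_size": (40,)},
    "HTA":            {"inter_dim": 64, "inter_weight": 0.1, "batch_size": (64, 128)},
}


def get_settings(dataset):
    """Look up the settings for one table column; raise on unknown names."""
    try:
        return SETTINGS[dataset]
    except KeyError:
        known = ", ".join(sorted(SETTINGS))
        raise ValueError("unknown dataset {!r}; expected one of: {}".format(dataset, known))
```

For example, `get_settings("CAR")["inter_weight"]` gives `1.5`, the only FGVC dataset whose inter-weight differs from 0.1.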

Citation

If you find our work helpful in your research, please cite it as:

@inproceedings{pei2024sa2vp,
  title={SA^2VP: Spatially Aligned-and-Adapted Visual Prompt},
  author={Pei, Wenjie and Xia, Tongqi and Chen, Fanglin and Li, Jinsong and Tian, Jiandong and Lu, Guangming},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024}
}

License

The code is released under the MIT License (see the LICENSE file for details).