BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning

We provide the official PyTorch implementation of 'BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning' (CVPR 2023). <br/>

Changdae Oh, Hyeji Hwang, Hee-young Lee, YongTaek Lim, Geunyoung Jung, Jiyoung Jung, Hosik Choi, and Kyungwoo Song

<br/>

Abstract

<p align="center"> <img src="docs/fig1_illustration.png" alt= "" width="" height="250"> </p>

With the surge of large-scale pre-trained models (PTMs), fine-tuning these models for numerous downstream tasks has become a crucial problem. Consequently, parameter-efficient transfer learning (PETL) of large models has attracted considerable attention. While recent PETL methods showcase impressive performance, they rely on optimistic assumptions: 1) the entire parameter set of a PTM is available, and 2) sufficient memory capacity for fine-tuning is available. However, in most real-world applications, PTMs are served as a black-box API or proprietary software without explicit parameter accessibility. Besides, it is hard to meet the large memory requirements of modern PTMs. In this work, we propose black-box visual prompting (BlackVIP), which efficiently adapts PTMs without knowledge of model architectures and parameters. BlackVIP has two components: 1) the Coordinator and 2) simultaneous perturbation stochastic approximation with gradient correction (SPSA-GC). The Coordinator designs input-dependent image-shaped visual prompts, which improve few-shot adaptation and robustness to distribution/location shift. SPSA-GC efficiently estimates the gradient of a target model to update the Coordinator. Extensive experiments on 16 datasets demonstrate that BlackVIP enables robust adaptation to diverse domains without accessing PTMs' parameters, with minimal memory requirements.
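
As a rough illustration of the SPSA idea mentioned in the abstract, the sketch below shows the textbook two-point SPSA gradient estimator on a toy quadratic loss. It is not the repository's SPSA-GC: the gradient-correction step and the Coordinator are omitted, and the names `spsa_gradient` and `quadratic`, the step size `0.05`, and the perturbation scale `c=0.01` are assumptions chosen purely for illustration.

```python
import random


def spsa_gradient(loss, params, c, rng):
    """Two-point SPSA estimate of the gradient of `loss` at `params`."""
    # Rademacher perturbation: each coordinate is +1 or -1 with equal probability.
    delta = [rng.choice((-1.0, 1.0)) for _ in params]
    plus = [p + c * d for p, d in zip(params, delta)]
    minus = [p - c * d for p, d in zip(params, delta)]
    # Only two loss queries are needed, regardless of the number of parameters.
    diff = loss(plus) - loss(minus)
    # Dividing by delta_i is the same as multiplying by it, since delta_i is +/-1.
    return [diff / (2.0 * c) * d for d in delta]


def quadratic(x):
    # Toy stand-in for a black-box loss (hypothetical); its minimum is at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


rng = random.Random(0)
theta = [0.0, 0.0]
for step in range(500):
    grad = spsa_gradient(quadratic, theta, c=0.01, rng=rng)
    # Plain gradient descent with a fixed step size (an illustrative choice).
    theta = [t - 0.05 * g for t, g in zip(theta, grad)]

print(theta)
```

The estimator only ever calls `loss`, never its derivative, which is why this family of methods suits a model exposed as a black-box API.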

<br/>

Research Highlights

<p align="center"> <img src="docs/blackvip_framework.png" alt= "" width="90%" height="90%"> </p> <br/> <hr/>

Coverage of this repository

Methods

Experiments

<br/>

Setup

# Clone this repo
git clone https://github.com/changdaeoh/BlackVIP.git
cd BlackVIP

# Create a conda environment
conda create -y -n blackvip python=3.8

# Activate the environment
conda activate blackvip

# Install torch and torchvision
# Please refer to https://pytorch.org/ if you need a different CUDA version
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.6 -c pytorch -c conda-forge

# Install dependencies
cd my_dassl
pip install -r requirements.txt

# Install additional requirements
cd ..
pip install -r requirements.txt
<br/>

Data preparation

<br/>

Run

transfer learning benchmarks

# for BlackVIP, specify {1:dataset, 2:epoch, 3:moms, 4:spsa_gamma, 5:spsa_c, 6:p_eps}
sh tl_bench.sh svhn 5000 0.9 0.2 0.005 1.0

# for BAR, specify {1:dataset, 2:epoch, 3:init_lr, 4:min_lr}
sh tl_bench.sh svhn 5000 5.0 0.1

# for VP w/ SPSA-GC, specify {1:dataset, 2:epoch, 3:moms, 4:spsa_a, 5:spsa_c}
sh tl_bench.sh svhn 5000 0.9 10.0 0.01

# for VP (white-box), specify {1:dataset, 2:epoch, 3:lr}
sh tl_bench.sh svhn 1000 40.0

# for Zero-shot CLIP inference, move to 'BlackVIP/scripts/coop' and run:
sh zeroshot_all.sh

synthetic datasets

# for BlackVIP on Loc-MNIST, specify {1:fake-digit-size, 2:moms, 3:spsa_alpha, 4:spsa_a, 5:spsa_c}
sh synthetic_lm.sh 1 0.9 0.5 0.01 0.005  # 1:1 setting
sh synthetic_lm.sh 4 0.95 0.5 0.02 0.01  # 1:4 setting

# for BlackVIP on Biased MNIST, specify {1:moms, 2:spsa_alpha, 3:spsa_a, 4:spsa_c}
sh synthetic_bm_easy.sh 0.9 0.4 0.01 0.01  # spurious correlation = 0.8
sh synthetic_bm_hard.sh 0.9 0.4 0.01 0.01  # spurious correlation = 0.9

# other methods can be run in the same way as above.

ablation study

# for BlackVIP, specify {1:target_backbone, 2:spsa_alpha, 3:moms, 4:spsa_gamma, 5:spsa_c, 6:p_eps}
sh ablation_arch_rn.sh rn50 0.5 0.9 0.2 0.01 0.3
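
The `spsa_a`, `spsa_c`, `spsa_alpha`, and `spsa_gamma` arguments in the commands above resemble the gain-sequence parameters of standard SPSA. The sketch below assumes they follow the textbook decay a_k = a / (k + A)^alpha and c_k = c / k^gamma; the repository's actual schedule may differ, so check the training code before relying on it. The function name `spsa_gains` and the `stability` constant (the A term) are hypothetical, and `a=0.01` is an illustrative value, while `c`, `alpha`, and `gamma` are taken from the ablation command.

```python
def spsa_gains(step, a, c, alpha, gamma, stability=0.0):
    """Return (a_k, c_k) for a 1-indexed step under the textbook SPSA decay."""
    a_k = a / (step + stability) ** alpha  # step size shrinks with alpha
    c_k = c / step ** gamma                # perturbation scale shrinks with gamma
    return a_k, c_k


# Print how both gains shrink as training progresses.
for step in (1, 10, 100, 1000):
    a_k, c_k = spsa_gains(step, a=0.01, c=0.01, alpha=0.5, gamma=0.2)
    print(f"step {step:>4}: a_k={a_k:.6f}  c_k={c_k:.6f}")
```

Under this assumption, a larger `spsa_alpha` makes the step size decay faster, and a larger `spsa_gamma` shrinks the perturbation magnitude sooner.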
<br/> <hr />

Contact

For any questions, discussions, or proposals, please contact changdae.oh@uos.ac.kr or kyungwoo.song@gmail.com.

<br/>

Citation

If you use our code in your research, please kindly consider citing:

@InProceedings{Oh_2023_CVPR,
    author    = {Oh, Changdae and Hwang, Hyeji and Lee, Hee-young and Lim, YongTaek and Jung, Geunyoung and Jung, Jiyoung and Choi, Hosik and Song, Kyungwoo},
    title     = {BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {24224-24235}
}
<br/>

Acknowledgements

Our overall experimental pipeline is based on the CoOp and CoCoOp repositories. For baseline construction, we borrowed from and referred to the code in the VP, BAR, and AR repositories. We thank the authors (Zhou et al., Bahng et al., Tsai et al.) and Savan for sharing their code.