Task Residual: Tuning Vision-Language Models in One Line of Code

This is the official implementation of Task Residual for Tuning Vision-Language Models (accepted to CVPR 2023).

Task Residual Tuning (TaskRes) is a new paradigm for tuning vision-language models (VLMs). It tunes the text-based classifier weights directly, so it needs neither a heavy text encoder for prompt updates nor carefully designed adapters.
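As a rough illustration of what "directly tuning the text-based classifier weights" can look like, the sketch below keeps a set of base class weights frozen and adds a scaled residual to them. It is a dependency-free toy in plain Python, not the code from this repository: the function names, the scaling factor `alpha`, the `logit_scale` value, and the L2 normalization step are all assumptions made for the example. In practice the weights would be tensors produced by the VLM's text encoder, and the residual would be learned by gradient descent.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def task_residual_weights(base_weights, residual, alpha=0.5):
    """Return base + alpha * residual, row by row.

    base_weights: frozen class weights (hypothetically, text features).
    residual:     tunable parameters of the same shape.
    alpha:        hypothetical scaling factor for the residual.
    """
    return [
        [b + alpha * r for b, r in zip(base_row, res_row)]
        for base_row, res_row in zip(base_weights, residual)
    ]


def classify_logits(image_feature, class_weights, logit_scale=100.0):
    """Scaled cosine similarity between one image feature and each class weight."""
    img = l2_normalize(image_feature)
    return [
        logit_scale * sum(i * w for i, w in zip(img, l2_normalize(row)))
        for row in class_weights
    ]


# Two classes in a 2-D feature space (toy values).
base = [[1.0, 0.0], [0.0, 1.0]]
zero_residual = [[0.0, 0.0], [0.0, 0.0]]
image = [1.0, 0.0]

# With a zero residual the classifier is exactly the frozen base classifier.
untuned = task_residual_weights(base, zero_residual)
untuned_logits = classify_logits(image, untuned)

# A non-zero residual shifts the second class weight toward the image feature.
shifted = task_residual_weights(base, [[0.0, 0.0], [2.0, 0.0]])
shifted_logits = classify_logits(image, shifted)
```

Because the residual starts at zero in this sketch, tuning begins from the unmodified base classifier and only moves away from it as the residual is updated.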

Comparison

[Figure: comparison]

Installation

This repository requires installing the environment and preparing the datasets.

PS: You can also follow CoOp to perform the installation.

Usage

We present the basic usage here.

(a) Train regular TaskRes:

(b) Train enhanced TaskRes:

(c) Test domain generalization:

PS: Refer to CoOp for more usage.

Acknowledgment

This repository is mainly based on Kaiyang Zhou's CoOp code base. We sincerely thank Kaiyang for his awesome work.

Citation

If you find this work useful for your research, please cite us:

@inproceedings{yu2023task,
  title={Task Residual for Tuning Vision-Language Models},
  author={Yu, Tao and Lu, Zhihe and Jin, Xin and Chen, Zhibo and Wang, Xinchao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={10899--10909},
  year={2023}
}