
TagAlign - Official PyTorch Implementation

TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification
Qinying Liu, Kecheng Zheng, Wei Wu, Zhan Tong, Yu Liu, Wei Chen, Zilei Wang, Yujun Shen

[arXiv](https://arxiv.org/abs/2312.14149)


<div align="center"> <img src="figs/pipeline.png" width="100%"> </div>

šŸ“œ News

[2023/12/25] The paper and project page are released!

šŸ’” Highlights

šŸ‘Øā€šŸ’» Todo

šŸ› ļø Usage

Installation

Data Preparation

For training, we use the CC12M dataset. You can download CC12M either directly from its source or with the img2dataset tool (a sketch follows the file listing below). The dataset should follow this file structure:

```
CC12M
ā”œā”€ā”€ 000002a0c848e78c7b9d53584e2d36ab0ac14785.jpg
ā”œā”€ā”€ 000002ca5e5eab763d95fa8ac0df7a11f24519e5.jpg
ā”œā”€ā”€ 00000440ca9fe337152041e26c37f619ec4c55b2.jpg
...
```
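If you go the img2dataset route, the following is a minimal sketch (not the official download script) using its Python API; the TSV path, column names, and size settings are assumptions to adapt. Note that img2dataset writes images into numbered shard subfolders, so you may need to flatten its output to match the layout above.

```python
# Sketch: download CC12M images with img2dataset's Python API.
# Assumes a local cc12m.tsv with "url" and "caption" columns;
# all option values below are illustrative, not the paper's settings.
from img2dataset import download

download(
    url_list="cc12m.tsv",       # assumed path to the CC12M url/caption TSV
    input_format="tsv",
    url_col="url",
    caption_col="caption",
    output_format="files",      # writes raw .jpg files into numbered shard folders
    output_folder="CC12M",
    image_size=256,             # illustrative resolution
    processes_count=16,
    thread_count=64,
)
```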

In addition, we provide the captions of the images in meta_file (TODO).

For evaluation, refer to GroupViT to prepare the datasets. Make sure to update the image directories in `segmentation/configs/_base_/datasets/*.py` as necessary (an illustrative excerpt follows).
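The dataset configs are mmsegmentation-style Python files, so the edit is usually a one-line change of the data root. The excerpt below is illustrative (the file name and default path are assumptions), showing the kind of field to update for Pascal VOC:

```python
# Illustrative excerpt of segmentation/configs/_base_/datasets/pascal_voc12.py.
# Field names follow mmsegmentation conventions; the path is a placeholder.
dataset_type = 'PascalVOCDataset'
data_root = '/path/to/VOCdevkit/VOC2012'  # point this at your local copy
```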

Train and Evaluate

  1. Modify `configs/tagalign.yml`. We provide the processed tag_file (TODO) and label_file (TODO).

  2. Train the TagAlign model by running:

     ```bash
     torchrun --rdzv_endpoint=localhost:6000 --nproc_per_node=auto main.py --cfg configs/tagalign.yml
     ```

  3. Evaluate the TagAlign model by running:

     ```bash
     torchrun --rdzv_endpoint=localhost:6000 --nproc_per_node=auto main.py --cfg configs/eval.yml --eval --resume $WEIGHT
     ```

     `$WEIGHT` is the path to the pre-trained checkpoint. We provide our pre-trained weights in weights (TODO).
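Before launching a full evaluation run, it can help to sanity-check that the downloaded checkpoint deserializes. This is a sketch assuming a standard torch-serialized file; the filename is a placeholder, not the released weight name.

```python
# Sketch: confirm the checkpoint loads before passing its path as $WEIGHT.
import torch

ckpt = torch.load("weights/tagalign.pth", map_location="cpu")  # hypothetical path
if isinstance(ckpt, dict):
    print(list(ckpt.keys())[:10])  # top-level keys, e.g. a state_dict or wrapper dict
```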

āœ’ļø Citation

If you find our work useful for your research, please consider citing:

```bibtex
@article{liu2023tagalign,
  title={TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification},
  author={Liu, Qinying and Zheng, Kecheng and Wu, Wei and Tong, Zhan and Liu, Yu and Chen, Wei and Wang, Zilei and Shen, Yujun},
  journal={arXiv preprint arXiv:2312.14149},
  year={2023}
}
```

ā¤ļø Acknowledgements