# Vision Transformer for Contrastive Clustering
This is the code for the paper "Vision Transformer for Contrastive Clustering".
<div align=center><img src="Figures/VTCC.png"></div>

## Requirements
The code was developed and tested on Ubuntu 18.04 with the following dependencies:
- python==3.7
- pytorch==1.7.0
- torchvision==0.8.0
- CUDA==11.0
- timm==0.5.4
- scikit-learn==1.0.1
- opencv-python==4.5.1
- pyyaml==6.0
- numpy==1.21.2
## Getting Started
- [Optional but recommended] Create a new conda environment:

  ```bash
  conda create -n VTCC python=3.7
  ```

  and activate it:

  ```bash
  conda activate VTCC
  ```

- Clone this repository:

  ```bash
  git clone https://github.com/JackKoLing/VTCC.git
  ```

- Install the necessary packages (install other common packages as needed):

  ```bash
  pip install torch==1.7.0 torchvision==0.8.0 opencv-python==4.5.1 timm==0.5.4 scikit-learn==1.0.1 numpy pyyaml
  ```
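After installation, a quick sanity check (this script is not part of the repository) can confirm that the key packages import correctly and that PyTorch sees the GPU:

```python
# sanity_check.py -- quick environment check, not part of the VTCC repo
import torch
import torchvision
import timm
import sklearn

print("torch:", torch.__version__)              # expect 1.7.0
print("torchvision:", torchvision.__version__)  # expect 0.8.0
print("timm:", timm.__version__)                # expect 0.5.4
print("scikit-learn:", sklearn.__version__)     # expect 1.0.1
print("CUDA available:", torch.cuda.is_available())
```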
## Data Preparation
The eight datasets used in the paper can be downloaded from the URLs provided in their corresponding papers or on their official websites.
Dataset Structure:

Make sure to put the files in the following structure:

```
|-- datasets
|   |-- RSOD
|   |-- UC-Merced
|   |-- ...
```
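As a convenience, a small helper (hypothetical, not part of the repository; it assumes the `datasets` root above) can verify that the expected dataset folders are in place:

```python
# check_datasets.py -- hypothetical helper, not part of the VTCC repo
from pathlib import Path

DATASET_ROOT = Path("datasets")  # adjust if your datasets live elsewhere

if not DATASET_ROOT.is_dir():
    raise SystemExit(f"Missing dataset root: {DATASET_ROOT.resolve()}")

# List each dataset folder and count the files it contains.
for dataset_dir in sorted(p for p in DATASET_ROOT.iterdir() if p.is_dir()):
    n_files = sum(1 for f in dataset_dir.rglob("*") if f.is_file())
    print(f"{dataset_dir.name}: {n_files} files")
```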
## Configuration

There is a configuration file "config/config.yaml", where one can edit both the training and test options.
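Since pyyaml is among the dependencies, the options can also be inspected programmatically. A minimal sketch follows; apart from `model_path`, which the Test section below references, the key names depend on the actual config.yaml shipped with the repository:

```python
# show_config.py -- minimal sketch for inspecting the config, not part of the repo
import yaml

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

# "model_path" is referenced in the Test section; other keys are whatever
# the shipped config.yaml defines.
print("model_path:", config.get("model_path"))
for key, value in config.items():
    print(f"{key}: {value}")
```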
## Training

After setting the configuration, to start training, simply run

```bash
python train.py
```
## Test

Once training is complete, a trained model will be saved to the "model_path" specified in the configuration file. To test the trained model, run

```bash
python cluster.py
```
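For reference, clustering results are commonly evaluated with clustering accuracy (ACC), NMI, and ARI. The sketch below shows how these can be computed with scikit-learn and scipy (a dependency of scikit-learn); it is illustrative only, not the actual evaluation code in cluster.py:

```python
# eval_metrics.py -- illustrative sketch, not the evaluation code in cluster.py
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """Best-match accuracy: align cluster ids to labels via the Hungarian algorithm."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = max(y_true.max(), y_pred.max()) + 1
    # Contingency matrix: counts of (true label, predicted cluster) pairs.
    cost = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[t, p] += 1
    row_ind, col_ind = linear_sum_assignment(-cost)  # negate to maximize matches
    return cost[row_ind, col_ind].sum() / len(y_true)

# Example with dummy labels; replace with ground truth and predicted clusters.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])
print("ACC:", clustering_accuracy(y_true, y_pred))
print("NMI:", normalized_mutual_info_score(y_true, y_pred))
print("ARI:", adjusted_rand_score(y_true, y_pred))
```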
## Citation

If you find VTCC useful in your research, please consider citing:

```bibtex
@article{ling2022vision,
  title={Vision Transformer for Contrastive Clustering},
  author={Ling, Hua-Bao and Zhu, Bowen and Huang, Dong and Chen, Ding-Hua and Wang, Chang-Dong and Lai, Jian-Huang},
  journal={arXiv preprint arXiv:2206.12925},
  year={2022}
}
```
## Acknowledgement

The code is developed based on the architectures of CC and MoCoV3. We sincerely thank the authors for their excellent work!