

Generalized Category Discovery

This repo contains code for our paper: Generalized Category Discovery

Given a dataset of which only a subset is labelled, Generalized Category Discovery is the task of assigning a category to every unlabelled instance. Unlabelled instances may belong either to the labelled ('Old') classes or to 'New' classes that never appear in the labelled set.
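To make the setting concrete, here is a minimal sketch of how such a split could be built for a toy dataset. The helper `make_gcd_split`, the `labelled_fraction` parameter and the toy labels are hypothetical and only illustrate the task definition; they are not this repo's data-loading code.

```python
import random

def make_gcd_split(labels, old_classes, labelled_fraction=0.5, seed=0):
    """Split instance indices into a labelled set and an unlabelled set.

    Only instances of `old_classes` can be labelled. Every instance of a
    'New' class, plus the remaining 'Old' instances, ends up unlabelled.
    (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)
    labelled, unlabelled = [], []
    for idx, y in enumerate(labels):
        if y in old_classes and rng.random() < labelled_fraction:
            labelled.append(idx)
        else:
            unlabelled.append(idx)
    return labelled, unlabelled

# Toy dataset: 4 classes with 25 instances each; classes 0 and 1 are 'Old'.
labels = [c for c in range(4) for _ in range(25)]
old_classes = {0, 1}
labelled_idx, unlabelled_idx = make_gcd_split(labels, old_classes)

# The unlabelled set mixes 'Old' and 'New' classes; the labelled set
# contains 'Old' classes only.
unlabelled_classes = {labels[i] for i in unlabelled_idx}
labelled_classes = {labels[i] for i in labelled_idx}
```

The goal of the task is then to assign a category to every index in `unlabelled_idx`, without knowing in advance which of those instances belong to 'New' classes.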



Updates to paper since pre-print (updated PDF available here, ArXiv updating soon)

Running


pip install -r requirements.txt


Set paths to datasets, pre-trained models and desired log directories in config.py

Set SAVE_DIR (logfile destination) and PYTHON (path to python interpreter) in the scripts under bash_scripts.


We use fine-grained benchmarks in this paper, including:

We also use generic object recognition datasets, including:


Train representation:

bash bash_scripts/contrastive_train.sh

Extract features: Extract features to prepare for semi-supervised k-means. This requires changing warmup_model_dir to the path of the model used for feature extraction.

bash bash_scripts/extract_features.sh

Fit semi-supervised k-means:

bash bash_scripts/k_means.sh

Note on semi-supervised k-means

Under the old evaluation metric ('v1') we found that semi-supervised k-means consistently boosted performance over standard k-means, on 'Old' and 'New' data subsets. When we changed to 'v2' evaluation, we re-evaluated models in Tables {2,3,5} (including the ablation) and updated the figures.
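This section does not define 'v1' and 'v2'. As an illustration only, the sketch below assumes a 'v2'-style protocol in which one cluster-to-class mapping is computed over all instances and accuracy is then reported on the 'All', 'Old' and 'New' subsets under that single mapping. The function names are hypothetical, and the brute-force search over permutations stands in for a Hungarian solver so that the example needs no external libraries.

```python
from itertools import permutations

def best_mapping(y_true, y_pred, n_clusters):
    """Find the one-to-one cluster -> class mapping with the most matches.

    Brute force over permutations, so only suitable for a handful of
    clusters; a Hungarian solver would replace this at realistic sizes.
    """
    best, best_hits = None, -1
    for perm in permutations(range(n_clusters)):
        hits = sum(perm[p] == t for t, p in zip(y_true, y_pred))
        if hits > best_hits:
            best, best_hits = perm, hits
    return best

def split_accuracy(y_true, y_pred, old_classes, n_clusters):
    """Accuracy on all, 'Old' and 'New' instances under a single mapping."""
    mapping = best_mapping(y_true, y_pred, n_clusters)
    correct = [mapping[p] == t for t, p in zip(y_true, y_pred)]
    old = [c for c, t in zip(correct, y_true) if t in old_classes]
    new = [c for c, t in zip(correct, y_true) if t not in old_classes]
    return (sum(correct) / len(correct),
            sum(old) / len(old),
            sum(new) / len(new))

# Toy example: classes 0 and 1 are 'Old', class 2 is 'New'.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [2, 2, 2, 0, 0, 1, 1, 1, 1]
all_acc, old_acc, new_acc = split_accuracy(
    y_true, y_pred, old_classes={0, 1}, n_clusters=3
)
```

Because the mapping is shared, a cluster cannot be matched to one class when scoring 'Old' instances and to a different class when scoring 'New' instances.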

More recently, however, we have found that SS-k-means can be sensitive to bad initialisation under 'v2' and can sometimes lower performance on some datasets. Increasing the number of initialisations (inits) for SS-k-means can help. We are investigating this further; suggestions and PRs are welcome!
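The effect of the number of inits can be sketched with a small, dependency-free version of semi-supervised k-means. This is a simplified illustration, not the implementation in this repo: the names `ss_kmeans`, `ss_kmeans_once` and `n_init` are hypothetical, 'New' centroids are initialised at random unlabelled points (an assumption), and labelled classes are assumed to be numbered 0..n_old-1 with at least one labelled point each.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def ss_kmeans_once(lab_x, lab_y, unlab_x, k, rng, n_iter=20):
    """One run: labelled points stay in their class cluster throughout."""
    n_old = max(lab_y) + 1  # assumes labelled classes are 0..n_old-1
    # 'Old' centroids start at the labelled class means; the remaining
    # k - n_old centroids start at random unlabelled points (assumption).
    centroids = [mean([x for x, y in zip(lab_x, lab_y) if y == c])
                 for c in range(n_old)]
    centroids += rng.sample(unlab_x, k - n_old)

    def nearest(x):
        return min(range(k), key=lambda c: sq_dist(x, centroids[c]))

    for _ in range(n_iter):
        assign = [nearest(x) for x in unlab_x]
        for c in range(k):
            members = [x for x, a in zip(unlab_x, assign) if a == c]
            members += [x for x, y in zip(lab_x, lab_y) if y == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = mean(members)

    assign = [nearest(x) for x in unlab_x]
    inertia = sum(sq_dist(x, centroids[a]) for x, a in zip(unlab_x, assign))
    inertia += sum(sq_dist(x, centroids[y]) for x, y in zip(lab_x, lab_y))
    return assign, inertia

def ss_kmeans(lab_x, lab_y, unlab_x, k, n_init=10, seed=0):
    """Run several random inits and keep the run with the lowest inertia."""
    rng = random.Random(seed)
    runs = [ss_kmeans_once(lab_x, lab_y, unlab_x, k, rng)
            for _ in range(n_init)]
    return min(runs, key=lambda run: run[1])

# Toy 2-D data: two labelled ('Old') classes and one unseen ('New') cluster.
lab_x = [[0, 0], [0, 1], [5, 5], [5, 6]]
lab_y = [0, 0, 1, 1]
unlab_x = [[1, 0], [1, 1], [6, 5], [6, 6],
           [0, 10], [1, 10], [0, 11], [1, 11]]

best_assign, best_inertia = ss_kmeans(lab_x, lab_y, unlab_x, k=3, n_init=10)
_, single_inertia = ss_kmeans(lab_x, lab_y, unlab_x, k=3, n_init=1)
```

With the same seed, the single-init run is also the first of the ten runs, so `best_inertia` can never exceed `single_inertia`. Extra inits only give the algorithm more chances to escape a poor starting point.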

<a name="results"/> :1234: Results

Results from re-running models with this repo compared to reported numbers:

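| Dataset | All | Old | New |
|---|---|---|---|
| Stanford Cars (paper) | 39.0 | 57.6 | 29.9 |
| Stanford Cars (repo) | 39.9 | 58.5 | 30.9 |
| CIFAR100 (paper) | 70.8 | 77.6 | 57.0 |
| CIFAR100 (repo) | 71.3 | 77.4 | 59.1 |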

<a name="cite"/> :clipboard: Citation

If you use this code in your research, please consider citing our paper:

               title={Generalized Category Discovery},
               author={Sagar Vaze and Kai Han and Andrea Vedaldi and Andrew Zisserman},
               booktitle={IEEE Conference on Computer Vision and Pattern Recognition},