CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers

CVPR 2024

This project presents CuVLER, a Cut-Vote-and-Learn pipeline for unsupervised discovery of object segments. In this pipeline, a class-agnostic detector is trained on pseudo masks generated by VoteCut, a method that combines knowledge from several self-supervised models to discover objects and computes an object-likelihood score for each mask.
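To build intuition for the voting step, here is a toy sketch (illustrative only; the function names and the exact scoring rule are simplifications, not the repository's implementation). Masks proposed by different self-supervised models vote for each other via IoU, and each mask's object-likelihood score aggregates that agreement:

import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    # IoU between two boolean masks.
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / union if union > 0 else 0.0

def vote_scores(masks: list) -> np.ndarray:
    # Toy object-likelihood: each mask is scored by its mean IoU with the
    # masks proposed by the other models. VoteCut's actual scoring differs.
    scores = np.zeros(len(masks))
    for i, m in enumerate(masks):
        overlaps = [mask_iou(m, o) for j, o in enumerate(masks) if j != i]
        scores[i] = float(np.mean(overlaps)) if overlaps else 0.0
    return scores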

Installation

See INSTALL.md

Datasets Preparation

See datasets/README.md

VoteCut - create pseudo masks <a id="votecut"></a>

We use VoteCut to create pseudo masks for the ImageNet training set. Make sure the ImageNet dataset is set up as described in datasets/README.md. Creating masks for the entire ImageNet train set is computationally heavy, so we provide code that leverages SLURM to run VoteCut in parallel using submitit. For computational efficiency, VoteCut runs in two stages: 1) we first compute the NCut eigenvectors for each image and each model on GPU; 2) we run the rest of the VoteCut pipeline in multiple CPU processes (parallelized image-wise).
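For readers unfamiliar with submitit, the provided *_submitit.py scripts roughly follow the pattern below. This is a hedged sketch: the worker function and resource values are placeholders, not the repository's actual code.

import submitit

def process_shard(shard_id: int) -> str:
    # Placeholder worker: each SLURM job would run one shard of the pipeline.
    return f"shard {shard_id} done"

executor = submitit.AutoExecutor(folder="slurm_logs")
executor.update_parameters(
    slurm_partition="<partition>",  # the value you pass to --slurm-partition
    timeout_min=240,                # placeholder time limit
    cpus_per_task=4,                # placeholder resources
)
# One job per shard, analogous to --num-jobs; jobs can be awaited for results.
jobs = executor.map_array(process_shard, list(range(10)))
print([job.result() for job in jobs])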

Example of creating the eigenvectors (see the extract_eig_vec_submitit.py arguments for more details):

cd path/to/CuVLER
python extract_eig_vec_submitit.py --split train --num-jobs 10 --out-dir datasets/imagenet --slurm-partition <partition>

You can also run the script without submitit (however, it will be slower):

python extract_eig_vec.py --split train --out-dir datasets/imagenet
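Conceptually, each saved eigenvector comes from a normalized-cut problem over the patch affinities of a self-supervised ViT. The following is a simplified, hedged sketch of that idea; extract_eig_vec.py is the reference implementation, and the thresholding and normalization details here are illustrative:

import torch

def ncut_eigenvectors(feats: torch.Tensor, k: int = 3, tau: float = 0.2) -> torch.Tensor:
    # feats: (N, C) patch features from a self-supervised ViT (e.g. DINO).
    feats = torch.nn.functional.normalize(feats, dim=-1)
    w = feats @ feats.T                                    # cosine affinities
    w = torch.where(w > tau, w, torch.full_like(w, 1e-5))  # sparsify weak edges
    d = w.sum(dim=1)
    d_inv_sqrt = torch.diag(d.rsqrt())
    # The NCut generalized eigenproblem (D - W) x = lambda D x can be solved
    # via the symmetric normalized Laplacian.
    lap = torch.eye(len(d)) - d_inv_sqrt @ w @ d_inv_sqrt
    _, eigvecs = torch.linalg.eigh(lap)                    # ascending eigenvalues
    return eigvecs[:, 1 : k + 1]                           # drop the trivial first one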

Eigenvectors are saved in the {out-dir}/eig_vecs_{split} directory. After the eigenvectors are created, we can run:

python create_pseudo_masks_submitit.py \
    --out-file datasets/imagenet/annotations/imagenet_train_votecut_kmax_3_tuam_0.2.json \
    --split train \
    --num-jobs 100 \
    --slurm-partition <partition>

Note that the number of jobs should be adjusted to the number of available CPU cores; a higher number of jobs results in faster execution. You can also run the script without submitit (however, it will be much slower):

python create_pseudo_masks.py --split train \
    --out-file datasets/imagenet/annotations/imagenet_train_votecut_kmax_3_tuam_0.2.json \
    --out-dir datasets/imagenet

You can also download the precomputed pseudo masks, following the instructions in datasets/README.md.

Train CuVLER

This project trains a Cascade R-CNN model using Detectron2. Make sure to have the ImageNet dataset set up as described in datasets/README.md.

Zero-shot model training

Run the following command to train the model using 8 GPUs (you can adjust the number of GPUs):

cd path/to/CuVLER
python cad/train_net.py \
    --config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
    --num-gpus 8

Out-of-domain self-training

First, you need to download the pre-trained zero-shot model from the Models section or train it yourself. Then, run inference on the target dataset (the COCO 2017 train set):

cd path/to/CuVLER
python cad/train_net.py \
    --config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
    --num-gpus 8 \
    --eval-only \
    --test-dataset coco_2017_train \
    MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
    OUTPUT_DIR output/output_coco_train_2017

Then, create a COCO-style pseudo-annotations file from the predictions:

python utils/self_training_ann.py --detectron2-out-dir output/output_coco_train_2017 \
    --coco-ann-path datasets/coco/annotations/instances_train2017.json \
    --save-path-prefix datasets/coco/annotations/coco_cls_agnostic_instances_train2017 \
    --threshold 0.2
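Conceptually, the conversion looks like the sketch below. This is a simplified illustration: the inference filename is detectron2's default, and the single-category layout and kept fields are assumptions; utils/self_training_ann.py is the reference.

import json

# detectron2's COCO evaluator writes predictions to <OUTPUT_DIR>/inference/.
with open("output/output_coco_train_2017/inference/coco_instances_results.json") as f:
    preds = json.load(f)
with open("datasets/coco/annotations/instances_train2017.json") as f:
    coco = json.load(f)

annotations = []
for i, p in enumerate(q for q in preds if q["score"] >= 0.2):  # --threshold 0.2
    annotations.append({
        "id": i + 1,
        "image_id": p["image_id"],
        "category_id": 1,                        # single class-agnostic category
        "segmentation": p["segmentation"],
        "bbox": p["bbox"],
        "area": p["bbox"][2] * p["bbox"][3],
        "score": p["score"],                     # kept for soft self-training
        "iscrowd": 0,
    })

coco["annotations"] = annotations
coco["categories"] = [{"id": 1, "name": "object"}]
with open("datasets/coco/annotations/coco_cls_agnostic_instances_train2017.json", "w") as f:
    json.dump(coco, f)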

Now you can train the model using the pseudo-annotations:

python cad/train_net.py \
    --config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml \
    --num-gpus 8 \
    MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
    OUTPUT_DIR output/soft_self_train

CuVLER Models <a id="models"></a>

<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<tr>
<th valign="bottom">Method</th>
<th valign="bottom">Backbone</th>
<th valign="bottom">Model</th>
</tr>
<!-- TABLE BODY -->
<tr>
<td align="left">CuVLER Zero-shot</td>
<td align="left">Cascade R-CNN R50-FPN</td>
<td align="left"><a href="https://drive.google.com/uc?export=download&id=16PHrqWvqfgcZfO5IfcpmAxCG2QYaQsEM">download</a></td>
</tr>
<tr>
<td align="left">CuVLER Self-trained</td>
<td align="left">Cascade R-CNN R50-FPN</td>
<td align="left"><a href="https://drive.google.com/uc?export=download&id=1jkAnc5KX45gmwnzcwaHjxTSq5U3-JAYD">download</a></td>
</tr>
</tbody></table>

For easy download on Linux machines, you can use the following commands:

cd path/to/save/directory
python path/to/CuVLER/utils/gdrive_download.py --model {zero_shot, self_trained}
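Alternatively, if you prefer not to use the helper script, the gdown package can fetch the same files directly (this assumes gdown is installed; the file IDs are the ones from the table above, and the output filenames are arbitrary):

import gdown

# File IDs taken from the Models table above; output names are up to you.
gdown.download(id="16PHrqWvqfgcZfO5IfcpmAxCG2QYaQsEM", output="model_cuvler_zero_shot.pth")
gdown.download(id="1jkAnc5KX45gmwnzcwaHjxTSq5U3-JAYD", output="model_cuvler_self_trained.pth")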

Evaluate CuVLER

Before evaluation, make sure you have the datasets set up as described in datasets/README.md. The datasets are predefined in detectron2 fashion; the dataset names you can use are listed in the predefined splits dictionaries (sketched below).
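For reference, a detectron2-style predefined-splits dictionary typically looks like the hedged sketch below; the entry shown is illustrative, so consult the repository's dictionaries for the exact names and annotation paths:

from detectron2.data.datasets import register_coco_instances

# Illustrative mapping: dataset name -> (image root, COCO-style json path).
_PREDEFINED_SPLITS = {
    "cls_agnostic_coco_val_17": (
        "datasets/coco/val2017",
        "datasets/coco/annotations/cls_agnostic_instances_val2017.json",  # assumed path
    ),
}

for name, (image_root, json_file) in _PREDEFINED_SPLITS.items():
    register_coco_instances(name, {}, json_file, image_root)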

Zero-shot evaluation

First, you need to download the pre-trained zero-shot model from the Models section or train it yourself. For example, run the following command to evaluate the model on the COCO 2017 validation set:

cd path/to/CuVLER
python cad/train_net.py \
    --config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
    --num-gpus 8 \
    --eval-only \
    --test-dataset cls_agnostic_coco_val_17 \
    MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
    OUTPUT_DIR output/output_coco_val_17

Out-of-domain self-train evaluation

First, you need to download the self-trained model from the Models section or train it yourself. Then, run inference on the target dataset (COCO, COCO20k, or LVIS). Example for the COCO 2017 validation set:

cd path/to/CuVLER
python cad/train_net.py \
    --config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
    --num-gpus 8 \
    --eval-only \
    --test-dataset cls_agnostic_coco_val_17 \
    MODEL.WEIGHTS path/to/model_cuvler_self_trained.pth \
    OUTPUT_DIR output/output_coco_val_17_self_trained

VoteCut evaluation on ImageNet

To evaluate the performance of VoteCut on the ImageNet validation set, you first need to download the class-agnostic ground truth from here. You can also download the precomputed VoteCut pseudo masks from here. Then run:

cd path/to/CuVLER
python evalate.py --gt_ann_file path/to/imagenet_val_cls_agnostic_gt.json \
    --res_file path/to/votecut_annotations_imagenet_val.json \
    --pseudo_labels
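Under the hood this is standard COCO-style mask evaluation. With pycocotools alone it would look roughly like the sketch below, assuming the results can be loaded in COCO results format with a per-mask score (the results filename is a placeholder):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO("path/to/imagenet_val_cls_agnostic_gt.json")
# loadRes expects a list of result dicts, each carrying a "score" field.
res = gt.loadRes("path/to/votecut_results_imagenet_val.json")  # assumed results file

coco_eval = COCOeval(gt, res, iouType="segm")  # class-agnostic mask AP/AR
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()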

If you want to run VoteCut yourself, you can follow the instructions in VoteCut. Here is an example of running VoteCut on the validation set using submitit. First, create the eigenvectors:

python extract_eig_vec_submitit.py --split val --num-jobs 10 --out-dir datasets/imagenet --slurm-partition <partition>

Then, create final pseudo masks file:

python create_pseudo_masks_submitit.py --split val \
    --out-file path/to/votecut_annotations_imagenet_val.json \
    --num-jobs 100 \
    --slurm-partition <partition>

Acknowledgements

Part of this project is borrowed from CutLER; we thank the authors for their contribution.

License

The portions of this project belonging to CutLER, Detectron2, and DINO are released under the CC-BY-NC license. All other parts are released under the MIT license.

Citation

If you use CuVLER in your research, please cite the following paper:

@inproceedings{arica2024cuvler,
  title={CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers},
  author={Arica, Shahaf and Rubin, Or and Gershov, Sapir and Laufer, Shlomi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23105--23114},
  year={2024}
}