
Novel Scenes & Classes: Towards Adaptive Open-set Object Detection (ICCV-23 ORAL)

[Paper Link] [Poster Link]

By Wuyang Li

<div align=center> <img src="./assets/mot.png" width="400"> </div>

Domain Adaptive Object Detection (DAOD) strongly assumes a shared class space between the two domains.

This work breaks through this assumption and formulates Adaptive Open-set Object Detection (AOOD) by allowing the target domain to contain novel-class objects.

The object detector uses the base-class labels in the source domain for training, and aims to detect base-class objects and identify novel-class objects as unknown in the target domain.

If you have any ideas or questions you would like to discuss, feel free to reach me via e-mail.

2024/02/29:

I sincerely apologize for a mistake made while cleaning and publishing the code, and to readers who ran the earlier version and could not reproduce similar results.

Since the Cityscapes validation set only has 500 images, which is insufficient to evaluate open-set performance (e.g., AOSE), we follow the p2c setting and use all unlabeled target data for evaluation. Please check the corrected target-domain dataset settings. I am very sorry that this was dropped when the code was cleaned. Thank you for raising the issues https://github.com/CityU-AIM-Group/SOMA/issues/5#issue-2157929369 and https://github.com/CityU-AIM-Group/SOMA/issues/4#issue-2061101062, which brought this mistake to my attention.

https://github.com/CityU-AIM-Group/SOMA/blob/97af7f0f1493383b47a77b75a932523f20b4cf75/datasets/DAOD.py#L44
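As a rough illustration only (the authoritative change is the DAOD.py line linked above; the paths and split-file names below are assumptions based on the layout in Step 3, not the repository's actual code), the corrected setting evaluates on every unlabeled target-domain image rather than only the 500-image validation split:

```python
# Hypothetical sketch of the corrected evaluation split -- not the actual DAOD.py code.
import os

def load_split(dataset_root, split_file):
    """Read one image id per line from an AOOD_Main split file (names assumed from Step 3)."""
    path = os.path.join(dataset_root, "Cityscapes", "AOOD_Main", split_file)
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

# Old (mistaken) setting: only the 500-image val split.
# eval_ids = load_split("/path/to/DATASET_PATH", "val_target.txt")
# Corrected setting: all unlabeled target-domain images (p2c-style).
eval_ids = load_split("/path/to/DATASET_PATH", "train_target.txt")
print(f"Evaluating on {len(eval_ids)} target-domain images")
```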

šŸ’” Preparation

Step 1: Clone and Install the Project

(a) Clone the repository

git clone https://github.com/CityU-AIM-Group/SOMA.git

(b) Install the project following Deformable DETR

Note that the following is in line with our experimental environment, which differs slightly from the official one.

# Linux, CUDA>=9.2, GCC>=5.4
# (ours) CUDA=10.2, GCC=8.4, NVIDIA V100 
# Establish the conda environment

conda create -n aood python=3.7 pip
conda activate aood
conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

# Compile the project
cd ./models/ops
sh ./make.sh

# unit test (all checks should report True)
python test.py

# NOTE: If you meet the permission denied issue when starting the training
cd ../../ 
chmod -R 777 ./
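After compilation, a quick check like the sketch below can confirm that PyTorch, CUDA, and the deformable attention ops are all in place (the extension name follows Deformable DETR's build in ./models/ops; adjust if your build differs):

```python
# Minimal environment sanity check (a sketch; assumes the Deformable DETR-style build above).
import torch

print("torch:", torch.__version__)              # 1.5.1 in our setup
print("CUDA available:", torch.cuda.is_available())

try:
    # The CUDA extension compiled by make.sh in Deformable DETR-based repos.
    import MultiScaleDeformableAttention  # noqa: F401
    print("Deformable attention ops are compiled.")
except ImportError as err:
    print("Ops not found; re-run 'sh ./make.sh' inside ./models/ops:", err)
```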

Step 2: Download Necessary Resources

(a) Download pre-processed datasets (VOC format) from the following links

|  | (Foggy) Cityscapes | Pascal VOC | Clipart | BDD100K (Daytime) |
| :---: | :---: | :---: | :---: | :---: |
| Official Links | Imgs | Imgs+Labels | - | Imgs |
| Our Links | Labels | - | Imgs+Labels | Labels |

(b) Download DINO-pretrained ResNet-50 from this link

Step 3: Change the Path

(a) Change the data path as follows.

[DATASET_PATH]
└─ Cityscapes
   └─ AOOD_Annotations
   └─ AOOD_Main
      └─ train_source.txt
      └─ train_target.txt
      └─ val_source.txt
      └─ val_target.txt
   └─ leftImg8bit
      └─ train
      └─ val
   └─ leftImg8bit_foggy
      └─ train
      └─ val
└─ bdd_daytime
   └─ Annotations
   └─ ImageSets
   └─ JPEGImages
└─ clipart
   └─ Annotations
   └─ ImageSets
   └─ JPEGImages
└─ VOCdevkit
   └─ VOC2007
   └─ VOC2012

For BDD100K daytime, put all images into bdd_daytime/JPEGImages/ as *.jpg files.

The image settings for other benchmarks are consistent with SIGMA.
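Before moving on, a small check like the following can verify that the layout above is in place (a hypothetical helper, not part of the repository; set DATASET_PATH to your data root):

```python
# Hypothetical layout checker for the directory tree shown above.
import os

DATASET_PATH = "/path/to/DATASET_PATH"  # assumption: your local data root

expected = [
    "Cityscapes/AOOD_Annotations",
    "Cityscapes/AOOD_Main/train_source.txt",
    "Cityscapes/AOOD_Main/train_target.txt",
    "Cityscapes/AOOD_Main/val_source.txt",
    "Cityscapes/AOOD_Main/val_target.txt",
    "Cityscapes/leftImg8bit/train",
    "Cityscapes/leftImg8bit/val",
    "Cityscapes/leftImg8bit_foggy/train",
    "Cityscapes/leftImg8bit_foggy/val",
    "bdd_daytime/JPEGImages",
    "clipart/JPEGImages",
    "VOCdevkit/VOC2007",
    "VOCdevkit/VOC2012",
]

for rel in expected:
    status = "ok" if os.path.exists(os.path.join(DATASET_PATH, rel)) else "MISSING"
    print(f"{status:8s}{rel}")
```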

(b) Change the data root in the config files

Replace DATASET.COCO_PATH in all yaml files under configs with your data root $DATASET_PATH, e.g., https://github.com/CityU-AIM-Group/SOMA/blob/41c11cbcb3589376f956950209d5ae3fbc839792/configs/soma_aood_city_to_foggy_r50.yaml#L22
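If you prefer to update every config at once, a throwaway script along these lines works (a sketch; it assumes the key appears as a plain `COCO_PATH:` line in each yaml under configs/):

```python
# One-off helper (not part of the repository) to point DATASET.COCO_PATH at your data root.
import glob

DATASET_PATH = "/path/to/DATASET_PATH"  # assumption: your local data root

for cfg_file in glob.glob("configs/*.yaml"):
    with open(cfg_file) as f:
        lines = f.readlines()
    with open(cfg_file, "w") as f:
        for line in lines:
            if line.lstrip().startswith("COCO_PATH:"):
                indent = line[: len(line) - len(line.lstrip())]
                line = f"{indent}COCO_PATH: {DATASET_PATH}\n"
            f.write(line)
    print("updated", cfg_file)
```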

(c) Change the path of DINO-pretrained backbone

Replace the backbone loading path: https://github.com/CityU-AIM-Group/SOMA/blob/41c11cbcb3589376f956950209d5ae3fbc839792/models/backbone.py#L107
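For reference, loading the DINO-pretrained ResNet-50 amounts to something like the sketch below (an illustration only, not the actual backbone.py code; the checkpoint filename and path are assumptions):

```python
# Illustration of loading DINO-pretrained ResNet-50 weights (not the actual backbone.py code).
import torch
import torchvision

DINO_CKPT = "/path/to/dino_resnet50_pretrain.pth"  # assumption: your local checkpoint path

backbone = torchvision.models.resnet50()
state_dict = torch.load(DINO_CKPT, map_location="cpu")
# DINO checkpoints ship without a classification head, so the fc.* keys may be missing.
missing, unexpected = backbone.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)
```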

šŸ”„ Start Training

We use two GPUs for training, taking 2 source images and 2 target images per iteration as input. Please take a look at the generated eval_results.txt file in OUTPUT_DIR, which saves the per-epoch evaluation results in LaTeX table format.

GPUS_PER_NODE=2 
./tools/run_dist_launch.sh 2 python main_multi_eval.py --config_file {CONFIG_FILE} --opts DATASET.AOOD_SETTING 1

We provide the scripts used in our experiments in run.sh. The settings given after "--opts" overwrite the default config file, as in the maskrcnn-benchmark framework.
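Conceptually, the override works like a yacs-style merge (a sketch assuming a yacs CfgNode as in maskrcnn-benchmark, not the actual main_multi_eval.py code; the option names come from the command above and the Notification section below):

```python
# Sketch of how "--opts KEY VALUE ..." overrides the yaml defaults (yacs-style).
from yacs.config import CfgNode as CN

cfg = CN()
cfg.DATASET = CN()
cfg.DATASET.AOOD_SETTING = 0
cfg.AOOD = CN()
cfg.AOOD.OW_DETR_ON = False

# The yaml config is loaded first; then the key-value pairs after --opts overwrite it.
cfg.merge_from_list(["DATASET.AOOD_SETTING", "1", "AOOD.OW_DETR_ON", "True"])
print(cfg.DATASET.AOOD_SETTING, cfg.AOOD.OW_DETR_ON)  # -> 1 True
```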

šŸ“¦ Well-trained models

Will be provided later

<!-- | Source| Target| Task | mAP $_b$ | AR $_n$ | WI | AOSE | AP@75 | checkpoint | | :-----:| :-----:| :-----:| :-----:| :-----:| :-----:| :-----:| :-----:| :-----: | City |Foggy | het-sem | | City |Foggy | het-sem | | City |Foggy | het-sem | | City |Foggy | het-sem | -->

šŸ’¬ Notification

Additional options can be overridden in the same way after "--opts", e.g.,

--opts AOOD.OW_DETR_ON True

šŸ“ Citation

If you find this work helpful for your project, please give it a star and a citation. We sincerely appreciate your acknowledgment.

@InProceedings{Li_2023_ICCV,
    author    = {Li, Wuyang and Guo, Xiaoqing and Yuan, Yixuan},
    title     = {Novel Scenes \& Classes: Towards Adaptive Open-set Object Detection},
    booktitle = {ICCV},
    year      = {2023},
}

Relevant project:

Exploring a similar task for image classification. [link]

@InProceedings{Li_2023_CVPR,
    author    = {Li, Wuyang and Liu, Jie and Han, Bo and Yuan, Yixuan},
    title     = {Adjustment and Alignment for Unbiased Open Set Domain Adaptation},
    booktitle = {CVPR},
    year      = {2023},
}

šŸ¤ž Acknowledgements

We greatly appreciate the tremendous effort behind the following works.

šŸ“’ Abstract

Domain Adaptive Object Detection (DAOD) transfers an object detector to a novel domain free of labels. However, in the real world, besides encountering novel scenes, novel domains always contain novel-class objects de facto, which are ignored in existing research. Thus, we formulate and study a more practical setting, Adaptive Open-set Object Detection (AOOD), considering both novel scenes and classes. Directly combining off-the-shelf cross-domain and open-set approaches is sub-optimal since their low-order dependence, such as the confidence score, is insufficient for AOOD with two dimensions of novel information. To address this, we propose a novel Structured Motif Matching (SOMA) framework for AOOD, which models the high-order relation with motifs, i.e., statistically significant subgraphs, and formulates the AOOD solution as motif matching to learn with high-order patterns. In a nutshell, SOMA consists of Structure-aware Novel-class Learning (SNL) and Structure-aware Transfer Learning (STL). As for SNL, we establish an instance-oriented graph to capture the class-independent object features hidden in different base classes. Then, a high-order metric is proposed to match the most significant motif as high-order patterns, serving for motif-guided novel-class learning. In STL, we set up a semantic-oriented graph to model the class-dependent relation across domains, and match unlabelled objects with high-order motifs to align the cross-domain distribution with structural awareness. Extensive experiments demonstrate that the proposed SOMA achieves state-of-the-art performance.
