ImaginaryNet: Learning Object Detectors without Real Images and Annotations

License: MIT

This repository is for the ICLR 2023 paper: ImaginaryNet: Learning Object Detectors without Real Images and Annotations

If you use any source code or ideas from this repository in your work, please cite the following paper.

<pre>
@article{ni2022imaginarynet,
  title={ImaginaryNet: Learning Object Detectors without Real Images and Annotations},
  author={Ni, Minheng and Huang, Zitong and Feng, Kailai and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2210.06886},
  year={2022}
}
</pre>

If you have any questions, feel free to email me.

Abstract

Without the demand of training in reality, humans are capable of detecting a new category of object based simply on a language description of its visual characteristics. Empowering deep learning with this ability would enable neural networks to handle complex vision tasks, e.g., object detection, without collecting and annotating real images. To this end, this paper introduces a novel and challenging learning paradigm, Imaginary-Supervised Object Detection (ISOD), where neither real images nor manual annotations are allowed for training object detectors. To resolve this challenge, we propose ImaginaryNet, a framework that synthesizes images by combining a pretrained language model and a text-to-image synthesis model. Given a class label, the language model is used to generate a full description of a scene with a target object, and the text-to-image model is deployed to generate a photo-realistic image. With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish ISOD. By gradually introducing real images and manual annotations, ImaginaryNet can collaborate with other supervision settings to further boost detection performance. Experiments show that ImaginaryNet can (i) reach about 75% of the performance of the weakly supervised counterpart with the same backbone trained on real data in ISOD, and (ii) significantly improve the baseline while achieving state-of-the-art or comparable performance when ImaginaryNet is incorporated with other supervision settings.

Illustration of Framework

<img src="img/ImaginaryNet.png">

Preparation

You can run the following commands to start up the environment.

conda env create -f environment.yaml

conda activate imaginarynet

pip install --upgrade jax==0.3.25 jaxlib==0.3.25+cuda11.cudnn82 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

conda install -c conda-forge cudatoolkit-dev
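After running the commands above, it can be useful to confirm that JAX imports and can see a device before launching a long generation run. The snippet below is a minimal sketch, not part of this repository: the helper name `describe_jax_backend` is hypothetical, and it only relies on `jax.__version__`, `jax.devices()`, and the `platform` attribute of each device.

```python
def describe_jax_backend() -> str:
    """Return a one-line summary of the JAX install (hypothetical helper)."""
    try:
        import jax
    except Exception as exc:  # missing package or a jax/jaxlib mismatch
        return f"jax could not be imported: {exc}"
    try:
        devices = jax.devices()
    except Exception as exc:  # backend failed to initialize
        return f"jax {jax.__version__} imported, but no backend initialized: {exc}"
    # Collect the distinct platforms reported by the visible devices.
    platforms = ", ".join(sorted({d.platform for d in devices}))
    return f"jax {jax.__version__} sees {len(devices)} device(s) on: {platforms}"


if __name__ == "__main__":
    print(describe_jax_backend())
```

If the summary reports only `cpu`, the CUDA build of `jaxlib` was probably not picked up, and the `pip install` step is worth re-checking.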

Pipeline Usage

This pipeline provides the core function of ImaginaryNet: generating images from class labels.
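The data flow described in the abstract (class label, then a language-model description, then a synthesized image paired with the class label) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code in `imaginarynet.py`: every name here (`ImaginarySample`, `load_class_labels`, `imagine_dataset`, `template_describe`, `placeholder_synthesize`) is hypothetical, the class file is assumed to hold one label per line, labels are assumed to be sampled uniformly, and the two model calls are replaced by trivial stand-ins so the example runs without any weights.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ImaginarySample:
    """One synthesized example: an image plus its image-level class label."""
    class_label: str
    description: str
    image: object  # whatever the text-to-image backend returns


def load_class_labels(text: str) -> List[str]:
    """Parse class-file contents, assuming one label per line; skip blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def imagine_dataset(
    class_labels: List[str],
    num: int,
    describe: Callable[[str], str],
    synthesize: Callable[[str], object],
    seed: int = 0,
) -> List[ImaginarySample]:
    """Generate `num` samples: label -> scene description -> image."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num):
        label = rng.choice(class_labels)   # assumption: uniform class sampling
        description = describe(label)      # stands in for the language model
        image = synthesize(description)    # stands in for the text-to-image model
        samples.append(ImaginarySample(label, description, image))
    return samples


# Hypothetical stand-ins so the sketch runs without model weights.
def template_describe(label: str) -> str:
    return f"A photo of a {label}."


def placeholder_synthesize(description: str) -> dict:
    return {"prompt": description, "pixels": None}


if __name__ == "__main__":
    labels = load_class_labels("aeroplane\nbicycle\n\nbird\n")
    for s in imagine_dataset(labels, 3, template_describe, placeholder_synthesize):
        print(s.class_label, "->", s.description)
```

Passing the two models in as callables keeps the loop independent of any particular backend; a real run would swap the stand-ins for calls into a pretrained language model and a text-to-image model. Each sample carries only an image-level class label, which is the supervision a weakly supervised detector expects.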

Quick Start

python imaginarynet.py --num 10000 --classfile voc.txt --gpt --clip --backend dalle-mini

Parameters Explanation

Reproducibility

To support reproducibility, we provide the generated datasets, trained checkpoints, and training logs. Please note that regenerated images may not exactly match ours, because the backend may have been updated and the environment may differ. We did not modify the code of the detection backbones; to train them, please refer to their original repositories. To access the original data or experiments, please download our archives.

Generated Images

| Name | Download Link |
| --- | --- |
| 10,000 Imaginary Data | Download |

Saved Checkpoints and Logs

Imaginary-Supervised Object Detection (ISOD)

| Backbone | Imaginary Data | mAP | Checkpoint | Log |
| --- | --- | --- | --- | --- |
| OICR | 5K Imaginary | 35.43 | Download | Download |

Weakly-Supervised Object Detection (WSOD)

| Backbone | Imaginary Data | mAP | Checkpoint | Log |
| --- | --- | --- | --- | --- |
| WSDDN | 5K Imaginary | 39.90 | Download | Download |
| OICR | 5K Imaginary | 51.39 | Download | Download |
| W2N | 5K Imaginary | 65.05 | Download | Download |

Semi-Supervised Object Detection (SSOD)

| Backbone | Real Data | Imaginary Data | mAP | Checkpoint | Log |
| --- | --- | --- | --- | --- | --- |
| Unbiased-Teacher | 5K VOC2007 | 5K Imaginary | 80.36 | Download | Download |
| Unbiased-Teacher | 5K VOC2007 | 10K Imaginary | 80.60 | Download | Download |
| Unbiased-Teacher | 5K VOC2007 + 10K VOC2012 (un-labeled) | 10K Imaginary | 81.60 | Download | Download |

Acknowledgement

We greatly appreciate Yeli Shen for his contribution to the public code of ImaginaryNet.