Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

PDF Slides Poster Video


This is the homepage for the [CVPR 2023 highlight (top 2.5%)] paper:

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille.

In this paper, we generate the Super-CLEVR dataset to systematically study the domain robustness of visual reasoning models along four factors: visual complexity, question redundancy, concept distribution, and concept compositionality.


Dataset

Super-CLEVR contains 30k images of vehicles (from UDA-Part) randomly placed in scenes, with 10 question-answer pairs per image. The vehicles have part annotations, so objects in the images can have distinct part attributes.

Here [link] is the list of objects and parts in Super-CLEVR scenes.

<div align="center"> <img src="images/github.png" width="800px"> </div>

The first 20k images and their paired questions are used for training, the next 5k for validation, and the last 5k for testing.
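The split described above can be sketched as a small lookup. This is a minimal illustration, assuming images are ordered and indexed from zero; the function name `split_of` is hypothetical and not part of the repository.

```python
def split_of(image_index: int) -> str:
    """Return the split for a zero-based image index.

    Assumption: images are numbered 0..29999 in dataset order, so the
    first 20k are train, the next 5k val, and the last 5k test.
    """
    if not 0 <= image_index < 30_000:
        raise ValueError(f"image index out of range: {image_index}")
    if image_index < 20_000:
        return "train"
    if image_index < 25_000:
        return "val"
    return "test"
```

If the released files use one-based indices, the boundaries would need to shift by one.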

| Data | Download Link |
| --- | --- |
| images | images.zip |
| scenes | superCLEVR_scenes.json |
| questions | superCLEVR_questions_30k.json |
| questions (- redundancy) | superCLEVR_questions_30k_NoRedundant.json |
| questions (+ redundancy) | superCLEVR_questions_30k_AllRedundant.json |
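As a rough sketch of working with a questions file, the snippet below groups question-answer pairs by image. The record layout is an assumption borrowed from the CLEVR convention (each question is a dict with `image_filename`, `question`, and `answer` keys); check the downloaded JSON before relying on these key names. The helper `group_by_image` is hypothetical.

```python
from collections import defaultdict


def group_by_image(questions):
    """Map each image filename to its list of (question, answer) pairs.

    Assumption: every record has "image_filename", "question", and
    "answer" keys, as in CLEVR-style question files.
    """
    grouped = defaultdict(list)
    for record in questions:
        pair = (record["question"], record["answer"])
        grouped[record["image_filename"]].append(pair)
    return dict(grouped)


# Made-up records for illustration only; not taken from the dataset.
sample = [
    {"image_filename": "img_a.png", "question": "How many cars?", "answer": "2"},
    {"image_filename": "img_a.png", "question": "Is there a bus?", "answer": "no"},
    {"image_filename": "img_b.png", "question": "What color is the bike?", "answer": "red"},
]
grouped = group_by_image(sample)
```

With the real data, each image would be expected to map to 10 pairs, which makes a quick sanity check after loading.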

Dataset generation

To generate images:

  1. Install Blender 2.79b. This repo is built largely on the CLEVR data generation code; please refer to its README for additional details.
  2. Download the CGPart dataset.
  3. Preprocess the 3D models. You may need to modify the input and output paths in image_generation/preprocess_cgpart.py, then run sh scripts/preprocess_cgpart.py.
  4. Run sh scripts/render_images.sh to render images with GPUs.
  5. After the images and corresponding scene files are generated, use scripts/merge_scenes.py to merge the scene files into one JSON file (output/superCLEVR_scenes.json).
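The merge in the last step can be pictured with the sketch below. It is not the repository's scripts/merge_scenes.py, which remains the tool to use; it only illustrates the idea, assuming one JSON file per scene in a directory and a merged file holding them under a top-level `"scenes"` key (the CLEVR convention). The function name and output layout are assumptions.

```python
import json
from pathlib import Path


def merge_scene_files(scene_dir, output_path):
    """Merge per-image scene JSON files into a single JSON file.

    Assumption: each *.json file in scene_dir holds one scene dict, and
    the merged file stores them as {"scenes": [...]} in filename order.
    Returns the number of scenes merged.
    """
    # Collect the paths before writing, so the output file is never
    # picked up even if it lives in the same directory.
    scene_paths = sorted(Path(scene_dir).glob("*.json"))
    scenes = []
    for path in scene_paths:
        with path.open() as f:
            scenes.append(json.load(f))
    with open(output_path, "w") as f:
        json.dump({"scenes": scenes}, f)
    return len(scenes)
```

A call might look like `merge_scene_files("output/scenes", "output/superCLEVR_scenes.json")`, using the paths mentioned in this README.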

Ten example generated images and scenes are in output/images and output/scenes.

To generate questions


Acknowledgements

This repo is heavily inspired by CLEVR and render-3d-segmentation.


Citation

If you find this code useful in your research then please cite:

@inproceedings{li2023super,
  title={Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning},
  author={Li, Zhuowan and Wang, Xingrui and Stengel-Eskin, Elias and Kortylewski, Adam and Ma, Wufei and Van Durme, Benjamin and Yuille, Alan L},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14963--14973},
  year={2023}
}