On_the_Utility_of_Synthetic_Data
Code release for "A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation", accepted at CVPR 2023.
Project Page $\cdot$ PDF Download $\cdot$ Dataset Download
Our datasets are available via Google Drive (link above) or Baidu Netdisk (link below).
Requirements
- Python 3.8.8
- PyTorch 1.8.0
- torchvision 0.9.0
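A quick way to confirm that an environment matches the versions listed above is sketched below. It is not part of the released code; the helper name `check_versions` is hypothetical, and it assumes PyTorch is installed under its PyPI distribution name `torch`.

```python
import sys
from importlib import metadata

# Versions taken from the Requirements list above.
REQUIRED_PYTHON = (3, 8, 8)
REQUIRED_PACKAGES = {"torch": "1.8.0", "torchvision": "0.9.0"}


def check_versions(required):
    """Return {package: (wanted, installed_or_None, matches)} for each requirement."""
    report = {}
    for name, wanted in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Builds may carry a local suffix such as "1.8.0+cu111";
        # only the public part of the version is compared.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    python_ok = tuple(sys.version_info[:3]) == REQUIRED_PYTHON
    print(f"python: found {sys.version.split()[0]} [{'ok' if python_ok else 'MISMATCH'}]")
    for name, (wanted, installed, ok) in check_versions(REQUIRED_PACKAGES).items():
        print(f"{name}: wanted {wanted}, found {installed} [{'ok' if ok else 'MISMATCH'}]")
```

A mismatch does not necessarily mean the code will fail; the listed versions are simply the ones the release states.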
Data preparation
- Existing datasets. References for the existing datasets used (e.g., ShapeNet, VisDA-2017, ImageNet, MetaShift, and ObjectNet) are given in the paper.
- Our new datasets. These can be downloaded from either the link above or the link below. Please refer to the paper for more details.
  - SynSL (here): our 12.8M synthesized images of 10 classes, for supervised learning.
  - SynSL-120K (here): our 120K synthesized images of 10 classes (train), train+SubImageNet, and three types of test data (test_IID, test_IID_wo_BG, and test_OOD).
  - S2RDA (here): two challenging transfer tasks, S2RDA-49 and S2RDA-MS-39.
- The validation and test splits of the real domains in S2RDA are also provided in this repository.
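After downloading, it can help to sanity-check a dataset before training. The sketch below counts image files per class, assuming a one-folder-per-class layout (`root/<class_name>/...`). That layout, the extension list, and the function name `count_images_per_class` are assumptions for illustration and are not confirmed by the release.

```python
from pathlib import Path

# Image extensions to count; an assumption, adjust to the downloaded files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Count image files under each immediate subdirectory of `root`."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip loose files such as split lists
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class` on a training folder and comparing the number of keys with the expected number of classes (10 for SynSL-120K, per the description above) gives a fast check that extraction completed.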
Model training
- Install the necessary Python packages.
- Replace the data paths in run.sh with those on your own system.
- Run the command `sh run.sh`. The results are saved in the folder `./checkpoints/`.
Pre-trained checkpoints
The pre-trained checkpoints for downstream synthetic-to-real classification adaptation are available here or there; they were taken at the last pre-training iteration.
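When loading a checkpoint into a fresh model, parameter names sometimes carry a `module.` prefix if the model was saved while wrapped in `torch.nn.DataParallel`. Whether these checkpoints do is not stated here, so the sketch below is only a precaution; the helper `strip_module_prefix` and the example key names are hypothetical.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of `state_dict` with a leading `prefix` removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for a loaded checkpoint; the key names are made up for illustration.
example = OrderedDict([("module.conv1.weight", 0), ("module.fc.bias", 1)])
print(list(strip_module_prefix(example)))  # ['conv1.weight', 'fc.bias']
```

With PyTorch installed, a checkpoint file can be read on CPU with `torch.load(path, map_location="cpu")`. The release does not say whether the saved object is a bare state dict or a dictionary wrapping one, so inspect its keys before passing it to `load_state_dict`.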
Paper citation
@InProceedings{tang2023new,
author = {Tang, Hui and Jia, Kui},
title = {A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
}