ExFractalDB and RCDB

Summary

This repository contains Python/PyTorch code for constructing ExFractalDB (Extended Fractal DataBase) and RCDB (Radial Contour DataBase), and for pre-training and fine-tuning with them.

The repository is based on the paper: Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue and Rio Yokota, "Replacing Labeled Real-Image Datasets With Auto-Generated Contours", IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022 [Project] [PDF (CVPR)] [Dataset] [Oral] [Poster] [Supp]

Updates

Update (June 13, 2022)

Citation

If you use this code, please cite the following paper:

@InProceedings{Kataoka_2022_CVPR,
    author    = {Kataoka, Hirokatsu and Hayamizu, Ryo and Yamada, Ryosuke and Nakashima, Kodai and Takashima, Sora and Zhang, Xinyu and Martinez-Noriega, Edgar Josafat and Inoue, Nakamasa and Yokota, Rio},
    title     = {Replacing Labeled Real-Image Datasets With Auto-Generated Contours},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {21232-21241}
}

Requirements

Please install the required packages into a conda environment with the following commands.

$ conda env create -f conda_requirements.yaml
$ conda activate cvpr2022_env

Fine-tuning datasets (e.g., ImageNet) should be arranged in the following directory structure:

/PATH/TO/IMAGENET/
  train/
    class1/
      img1.jpeg
      ...
    class2/
      img2.jpeg
      ...
    ...
  val/
    class1/
      img3.jpeg
      ...
    class2/
      img4.jpeg
      ...
    ...
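
Before launching a job, it can help to confirm that a dataset directory follows the layout above. The following is a minimal sketch using only the Python standard library; the function names `summarize_split` and `check_layout` are hypothetical helpers and are not part of this repository.

```python
from pathlib import Path
from typing import Dict, List


def summarize_split(root: str, split: str) -> Dict[str, int]:
    """Return {class_name: number_of_files} for one split directory."""
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split directory: {split_dir}")
    counts = {}
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


def check_layout(root: str) -> List[str]:
    """Return a list of problems; an empty list means the layout looks fine."""
    train = summarize_split(root, "train")
    val = summarize_split(root, "val")
    problems = []
    if not train:
        problems.append("train/ has no class directories")
    # Every class should appear in both splits.
    for name in sorted(set(train) ^ set(val)):
        problems.append(f"class '{name}' is not present in both train/ and val/")
    # Every class directory should contain at least one file.
    for split, counts in (("train", train), ("val", val)):
        for name, n in counts.items():
            if n == 0:
                problems.append(f"{split}/{name} is empty")
    return problems


if __name__ == "__main__":
    for problem in check_layout("/PATH/TO/IMAGENET"):
        print(problem)
```

The check only looks at directory names and file counts; it does not open or validate the images themselves.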

Execution files

We provide four execution files in the exe_scripts directory. Run the following commands in your environment to execute ExFractalDB (Extended Fractal DataBase) and RCDB (Radial Contour DataBase) construction, pre-training, and fine-tuning.

Note

ExFractalDB Construction (README)

$ cd exfractaldb_render
$ bash ExFractalDB_render.sh

RCDB Construction (README)

$ cd rcdb_render
$ bash RCDB_render.sh

Pre-training

Run the Python script pretrain.py to pre-train on your dataset.

Basically, you can run the python script pretrain.py with the following command.

Alternatively, you can run the job script scripts/pretrain.sh, which supports multi-node training with OpenMPI. Note that this setup assumes multiple nodes and a large number of GPUs (32 nodes and 128 GPUs for pre-training).

When running the script above, please structure your dataset as follows.

/PATH/TO/ExFractalDB21000/
  cat000000/
    img0_000000_000000_000.png
      ...
  cat000001/
    img0_000001_000000_000.png
      ...
  ...
  cat002099/
    img0_002099_000000_000.png
      ...

After pre-training, the trained models are saved to paths such as output/pretrain/pretrain_deit_base_ExFractalDB21000_1.0e-3/model_best.pth.tar and output/pretrain/pretrain_deit_base_ExFractalDB21000_1.0e-3/last.pth.tar. You can also resume training from a checkpoint by setting the --resume parameter.
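
As an illustration of the checkpoint layout described above, the sketch below looks for last.pth.tar under an experiment directory and, if it exists, produces the arguments needed to resume. It assumes that --resume takes a checkpoint path (as in timm's training script); please confirm this against pretrain.py. The helper names `find_resume_checkpoint` and `resume_args` are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def find_resume_checkpoint(output_dir: str, experiment: str) -> Optional[Path]:
    """Return <output_dir>/<experiment>/last.pth.tar if it exists, else None."""
    candidate = Path(output_dir) / experiment / "last.pth.tar"
    return candidate if candidate.is_file() else None


def resume_args(output_dir: str, experiment: str) -> List[str]:
    """Return ['--resume', <path>] when a checkpoint exists, else an empty list.

    Assumption: --resume expects the path of the checkpoint file.
    """
    ckpt = find_resume_checkpoint(output_dir, experiment)
    return ["--resume", str(ckpt)] if ckpt is not None else []


if __name__ == "__main__":
    extra = resume_args("output/pretrain", "pretrain_deit_base_ExFractalDB21000_1.0e-3")
    print(extra)  # empty list when no checkpoint has been written yet
```

The returned list can be appended to whatever command line is used to launch pretrain.py, so the same launch code works for both a fresh run and a restarted one.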

Please see the script and code files for details on each argument.

Pre-training with shard dataset

A sharded dataset can also be used to accelerate I/O. To create a sharded dataset, please refer to this repository: https://github.com/webdataset/webdataset. Here is an example of training with a sharded dataset.

When running the script above with a sharded dataset, please structure your shard dataset as follows.

/PATH/TO/ExFractalDB21000/
    SHARDS-000000.tar
    SHARDS-000001.tar
    ...
    SHARDS-002099.tar
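
WebDataset accepts shard locations written in brace-range notation (for example `SHARDS-{000000..002099}.tar`). The sketch below scans a shard directory laid out as above, checks that the shard indices are contiguous, and builds such a pattern. The helper name `shard_pattern` is hypothetical, and how the pattern is passed to pretrain.py is not shown here; please check the script for the actual argument.

```python
import re
from pathlib import Path

# Matches the file names shown in the layout above, e.g. SHARDS-000123.tar
SHARD_RE = re.compile(r"^SHARDS-(\d{6})\.tar$")


def shard_pattern(shard_dir: str) -> str:
    """Build a brace-range pattern covering every SHARDS-XXXXXX.tar in shard_dir."""
    indices = []
    for path in Path(shard_dir).iterdir():
        match = SHARD_RE.match(path.name)
        if match:
            indices.append(int(match.group(1)))
    indices.sort()
    if not indices:
        raise FileNotFoundError(f"no SHARDS-XXXXXX.tar files in {shard_dir}")

    # A brace range implies every index in between exists, so verify that.
    expected = list(range(indices[0], indices[-1] + 1))
    if indices != expected:
        missing = sorted(set(expected) - set(indices))
        raise ValueError(f"missing shard indices (first few): {missing[:10]}")

    first, last = indices[0], indices[-1]
    name = "SHARDS-{" + f"{first:06d}..{last:06d}" + "}.tar"
    return str(Path(shard_dir) / name)


if __name__ == "__main__":
    print(shard_pattern("/PATH/TO/ExFractalDB21000"))
```

Failing early on a missing shard is a deliberate choice: a gap in the range would otherwise surface only when the data loader reaches that shard during training.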

Pre-trained models

Our pre-trained models are available at this [Link].

We provide two main pre-trained models, one trained on ExFractalDB-21k and one trained on RCDB-21k.

exfractal_21k_base.pth.tar: --model deit_base_patch16_224 --experiment pretrain_deit_base_ExFractalDB21000_1.0e-3_shards
rcdb_21k_base.pth.tar: --model deit_base_patch16_224 --experiment pretrain_deit_base_RCDB21000_1.0e-3_shards

To continue training from one of the pre-trained models, run the fine-tuning code as follows.

# exfractal_21k_base.pth.tar
$ mpirun -npernode 4 -np 4 \
  python finetune.py /PATH/TO/YOUR_FT_DATASET \
    --model deit_base_patch16_224 --experiment finetune_deit_base_YOUR_FT_DATASET_from_ExFractalDB21000_1.0e-3_shards \
    --input-size 3 224 224 --num-classes YOUR_FT_DATASET_CATEGORY_SIZE \
    --output ./output/finetune \
    --log-wandb \
    --pretrained-path /PATH/TO/exfractal_21k_base.pth.tar

# rcdb_21k_base.pth.tar
$ mpirun -npernode 4 -np 4 \
  python finetune.py /PATH/TO/YOUR_FT_DATASET \
    --model deit_base_patch16_224 --experiment finetune_deit_base_YOUR_FT_DATASET_from_RCDB21000_1.0e-3_shards \
    --input-size 3 224 224 --num-classes YOUR_FT_DATASET_CATEGORY_SIZE \
    --output ./output/finetune \
    --log-wandb \
    --pretrained-path /PATH/TO/rcdb_21k_base.pth.tar
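
The two commands above differ only in the experiment name and the checkpoint path, so they can be generated from a few parameters. The following is a minimal sketch that assembles the same argument list in Python; `build_finetune_command` and its parameters are hypothetical, and the flags are copied verbatim from the commands above. It assumes a single node where -npernode and -np are equal.

```python
import shlex
from typing import List


def build_finetune_command(
    dataset_path: str,
    dataset_name: str,
    num_classes: int,
    pretrained_path: str,
    source_db: str,
    gpus: int = 4,
) -> List[str]:
    """Assemble the mpirun/finetune.py command shown above as an argument list."""
    experiment = f"finetune_deit_base_{dataset_name}_from_{source_db}_1.0e-3_shards"
    return [
        "mpirun", "-npernode", str(gpus), "-np", str(gpus),
        "python", "finetune.py", dataset_path,
        "--model", "deit_base_patch16_224",
        "--experiment", experiment,
        "--input-size", "3", "224", "224",
        "--num-classes", str(num_classes),
        "--output", "./output/finetune",
        "--log-wandb",
        "--pretrained-path", pretrained_path,
    ]


if __name__ == "__main__":
    cmd = build_finetune_command(
        dataset_path="/PATH/TO/YOUR_FT_DATASET",
        dataset_name="YOUR_FT_DATASET",
        num_classes=10,  # placeholder; use YOUR_FT_DATASET_CATEGORY_SIZE
        pretrained_path="/PATH/TO/rcdb_21k_base.pth.tar",
        source_db="RCDB21000",
    )
    # Print a shell-safe version; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```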

Fine-tuning

Run the Python script finetune.py to fine-tune your pre-trained model on other datasets.

To use the fine-tuning code, you must prepare a fine-tuning dataset (e.g., ImageNet-1k, CIFAR-10/100, Pascal VOC 2012). Please see Requirements for dataset preparation.

Basically, you can run the python script finetune.py with the following command.

Alternatively, you can run the job script scripts/finetune.sh, which supports multi-node training with OpenMPI.

Please see the script and code files for details on each argument.

Acknowledgements

The training code is inspired by timm and DeiT.

Terms of use

The authors affiliated with the National Institute of Advanced Industrial Science and Technology (AIST) and the Tokyo Institute of Technology (TITech) are not responsible for the reproduction, duplication, copying, sale, trade, resale, or exploitation for any commercial purposes of any portion of the images or any portion of the derived data. In no event will we be liable for any other damages resulting from this data or any derived data.