Home

Awesome

DisUnknown: Distilling Unknown Factors for Disentanglement Learning

See introduction on our project page

Requirements

Preparing Datasets

Dataset classes and default configurations are provided for the following datasets. See below for how to add new datasets, or you can open an issue and the author might consider adding it. Some datasets need to be prepared before using:

$ python disentangler.py prepare_data <dataset_name> --data_path </path/to/dataset>

If the dataset does not have a standard training/test split it will be split randomly. Use the --test_portion <portion> option to set the portion of test samples. Some dataset have additional options.

Training

To train, use

$ python disentangler.py train --config_file </path/to/config/file> --save_path </path/to/save/folder>

The configuration file is in YAML. See the commented example for explanations. If config_file is omitted, it is expected that save_path already exists and contains config.yaml. Otherwise save_path will be created if it does not exist, and config_file will be copied into it. If save_path already contains a previous training run that has been halted, it will by default resume from the latest checkpoint. --start_from <stage_name> [<iteration>] can be used to choose another restarting point. --start_from stage1 to restart from scratch. Specifying --data_path or --device will override those settings in the configuration file.

Although our goal is to deal with the cases where some factors are labeled and some factors are unknown, it feels wrong not to extrapolate to the cases where all factors are labeled or where all factors are unknown. Wo do allow these, but some parts of our method will become unnecessary and will be discarded accordingly. In particular if all factors are unknown then we just train a VAE in stage I and then a GAN having the same code space in stage II, so you can use this code for just training a GAN. We don't have the myriad of GAN tricks though.

Meaning of Visualization Images

During training, images generated for visualization will be saved in the subfolder samples. test_images.jpg contains images from the test set in even-numbered columns (starting from zero), with odd-numbered columns being empty. The generated images will contain corresponding reconstructed images in even-numbered columns, while each image in odd-numbered columns is generated by combining the unknown code from its left and the labeled code from its right (warp to the next row).

Example test images:

Test images

Example generated images:

Generated_images

Adding a New Dataset

__init__() should accept four positional arguments root, part, labeled_factors, transform in that order, plus any additional keyword arguments that one expects to receive from dataset_args in the configuration file. root is the path to the dataset folder. transform is as usual. part can be train, test or plot, specifying which subset of the dataset to load. The plotting set is generally the same as the test set, but part = 'plot' is passed in so that a smaller plotting set can be used if the test set is too large.

labeled_factors is a list of factor names. __getitem__() should return a tuple (image, labels) where image is the image and labels is a one-dimensional PyTorch tensor of type torch.int64, containing the labels for that image in the order listed in labeled_factors. labels should always be a one-dimensional tensor even if there is only one labeled factor, not a Python int or a zero-dimensional tensor. If labeled_factors is empty then __getitem__() should return image only.

In addition, metadata about the factors should be available in the following properties: nclass should be a list of ints containing the number of classes of each factor, and class_freq should be a list of PyTorch tensors, each being one-dimensional, containing the distribution of classes of each factor in (the current split of) the dataset.

If any preparation is required, implement a static method prepare_data(args) where args is a return value of argparse.ArgumentParser.parse_args(), containing properties data_path and test_portion by default. If additional command-line arguments are needed, implement a static method add_prepare_args(parser) where parser.add_argument() can be called.

Finally add it to the dictionary of recognized datasets in data/__init__.py.

Default configuration should also be created as default_config/datasets/<dataset_name>.yaml. It should at a minimum contain image_size, image_channels and factors. factors has the same syntax as labeled_factors as explained in the example training configuration. It should contain a complete list of all factors. In particular, if the dataset does not include a complete set of labels, there should be a factor called unknown which will become the default unknown factor if labeled_factors is not set in the training configuration.

Any additional settings in the default configuration will override global defaults in default_config/default_config.yaml.

Citing This Work (BibTeX)

@inproceedings{xiang2021disunknown,
  title={DisUnknown: Distilling Unknown Factors for Disentanglement Learning},
  author={Xiang, Sitao and Gu, Yuming and Xiang, Pengda and Chai, Menglei and Li, Hao and Zhao, Yajie and He, Mingming},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={14810--14819},
  year={2021}
}