About

This repo contains the official implementation of the paper Aligning Latent and Image Spaces to Connect the Unconnectable. It is a GAN model that can generate infinite (arbitrarily wide) images of diverse and complex scenes.

<div style="text-align:center"> <img src="https://user-images.githubusercontent.com/3128824/116132029-bc294c00-a6d5-11eb-9bd4-ab125b1508b7.gif" alt="ALIS generation example"/> </div>

[Project page] [Paper]

Python 3.7, PyTorch 1.7

Installation

To install, run the following commands:

conda env create --file environment.yml --prefix ./env
conda activate ./env

Note: the tensorboard requirement is crucial, because otherwise upfirdn2d will not compile for some magical reason. The repo should work on both Linux/macOS and Windows machines. However, on Windows you might run into difficulties installing some of the requirements: please see #3 to troubleshoot. Also, since the current repo is heavily based on StyleGAN2-ADA, it might be helpful to check the original installation requirements.

Training

To train the model, navigate to the project directory and run:

python infra/launch_local.py hydra.run.dir=. +experiment_name=my_experiment_name +dataset=dataset_name num_gpus=4

where dataset_name is the name of the dataset (without the .zip extension) inside the data/ directory (you can easily override the paths in configs/main.yml). Make sure that data/dataset_name.zip exists and is a plain archive of images. See the StyleGAN2-ADA repo for additional data format details. This training command will create an experiment inside the experiments/ directory and copy the project files into it; this is needed to isolate the code which produces the model.
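
If you need to build such an archive yourself, a minimal sketch could look like the following (the folder name my_landscapes/ and the archive name dataset_name.zip are only illustrative):

```python
# Minimal sketch: pack a flat folder of images into data/dataset_name.zip.
# "my_landscapes" and "dataset_name" are example names; any plain directory
# of .png/.jpg images should work (see the StyleGAN2-ADA repo for details).
import zipfile
from pathlib import Path

src_dir = Path("my_landscapes")          # your raw images
dst_zip = Path("data/dataset_name.zip")  # path expected by the training command
dst_zip.parent.mkdir(parents=True, exist_ok=True)

with zipfile.ZipFile(dst_zip, "w") as zf:
    for img_path in sorted(src_dir.iterdir()):
        if img_path.suffix.lower() in {".png", ".jpg", ".jpeg"}:
            # store each file at the archive root, i.e. a "plain directory of images"
            zf.write(img_path, arcname=img_path.name)
```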

Inference

The inference example can be found in notebooks/generate.ipynb.
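
For a quick single-frame sanity check (not the full infinite-image generation shown in the notebook), a rough sketch assuming the checkpoint follows the StyleGAN2-ADA pickle convention (a dict with a 'G_ema' generator) might look as follows. The filename and the call signature are assumptions here: the ALIS generator may require additional inputs such as anchor latents, so treat notebooks/generate.ipynb as the authoritative reference.

```python
# Rough sketch under the assumption that the checkpoint keeps the
# StyleGAN2-ADA pickle layout; the real ALIS generator may need extra
# arguments (e.g. anchor latents), see notebooks/generate.ipynb.
import pickle
import torch

with open("alis_checkpoint.pkl", "rb") as f:     # hypothetical filename
    G = pickle.load(f)["G_ema"].cuda().eval()

z = torch.randn([1, G.z_dim], device="cuda")     # latent code
c = torch.zeros([1, G.c_dim], device="cuda")     # class labels (unused here)
with torch.no_grad():
    img = G(z, c)                                # NCHW image, roughly in [-1, 1]
```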

Data format

We use the same data format as the original StyleGAN2-ADA repo: it is a zip of images. It is assumed that all data is located in a single directory, specified in configs/main.yml. Put your datasets as zip archives into data/ directory. It is recommended to preprocess the dataset with the procedure described in Algorithm 1 since it noticeably affects the results (see Table 3).

Pretrained checkpoints

We provide checkpoints for the following datasets:

LHQ dataset

Note: the images are sorted by their likelihood, which is why images with smaller indices are much noisier. We will release a filtered version soon.

<div style="text-align:center"> <img src="assets/lhq.png" alt="25 random images from LHQ" style="max-width: 500px"/> </div>

We collected 90k high-resolution nature landscape images and provide them for download in the following formats:

| Path | Size | Number of files | Format | Description |
| --- | --- | --- | --- | --- |
| Landscapes HQ | 283G | 90,000 | PNG | The root directory with all the files |
| ├ LHQ | 155G | 90,000 | PNG | The complete dataset. Split into 4 zip archives. |
| ├ LHQ1024 | 107G | 90,000 | PNG | LHQ images, resized to min-side=1024 and center-cropped to 1024x1024. Split into 3 zip archives. |
| ├ LHQ1024_jpg | 12G | 90,000 | JPG | LHQ1024 converted to JPG format with quality=95 (with Pillow)* |
| ├ LHQ256 | 8.7G | 90,000 | PNG | LHQ1024 resized to 256x256 with Lanczos interpolation |
| └ metadata.json | 27M | 1 | JSON | Dataset metadata (author names, licenses, descriptions, etc.) |

*quality=95 in Pillow for JPG images (the default is 75) produces images that are almost indistinguishable from the PNG originals, both visually and in terms of FID.
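
For reference, the LHQ1024, LHQ1024_jpg and LHQ256 variants described above can be reproduced with standard Pillow operations along these lines. This is only a sketch based on the table: the file paths are examples, and the interpolation used for the min-side=1024 resize is an assumption (only the 256x256 step is stated to use Lanczos).

```python
# Sketch of the preprocessing described in the table: resize so the shorter
# side is 1024, center-crop to 1024x1024, then save as JPG (quality=95) or
# downscale to 256x256 with Lanczos. Paths and the interpolation of the
# first resize are assumptions, not the exact release scripts.
from PIL import Image

def make_lhq1024(src_path: str) -> Image.Image:
    img = Image.open(src_path).convert("RGB")
    w, h = img.size
    scale = 1024 / min(w, h)                       # bring the min side to 1024
    img = img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
    w, h = img.size
    left, top = (w - 1024) // 2, (h - 1024) // 2   # center crop
    return img.crop((left, top, left + 1024, top + 1024))

img1024 = make_lhq1024("0000001.png")                               # LHQ1024
img1024.save("0000001_1024.jpg", quality=95)                        # LHQ1024_jpg
img1024.resize((256, 256), Image.LANCZOS).save("0000001_256.png")   # LHQ256
```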

The images come with Unsplash/Creative Commons/U.S. Government Works licenses which allow distribution and use for research purposes. For details, see lhq.md and Section 4 in the paper.

Downloading files:

python download_lhq.py [DATASET_NAME]

License

The project is based on the StyleGAN2-ADA repo developed by NVidia. I am not a lawyer, but I suppose that the NVidia license then applies to the code of this project as well. The LHQ dataset, however, is released under the Creative Commons Attribution 2.0 Generic (CC BY 2.0) license, which allows you to use it in any way you like. See lhq.md.

BibTeX

@article{ALIS,
  title={Aligning Latent and Image Spaces to Connect the Unconnectable},
  author={Skorokhodov, Ivan and Sotnikov, Grigorii and Elhoseiny, Mohamed},
  journal={arXiv preprint arXiv:2104.06954},
  year={2021}
}