iNNfer

iNNfer is a companion repository to traiNNer, intended to make it easier to produce results with models trained there.

The model architectures currently supported cover Super-Resolution, Restoration (denoising, deblurring), and image-to-image translation. Support for the remaining architectures (SRFlow, Video, etc.) is planned.

Features

Below is a (non-comprehensive) list of features currently available in the project. More are planned (see below).

Planned features

Example simple usage

You need to provide a directory containing the input images to be processed and an output directory where the results will be saved. By default, these directories are ./input/ and ./output/ respectively, but you can change them with the -input and -output flags.

If you obtain a trained model, either the original from a paper or one from the model database, you can place it in the ./models/ directory.

As an example, if you want to use the Fatality model from the database, you will download the model (4x_Fatality_01_265000_G.pth) and move it to ./models/.

Once the model is in place and the input images are ready, you can obtain the results simply by running:

python run.py -m fatal

And the results will be saved in ./output/.
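
If your images are somewhere else, the same command works with explicit directories (the paths below are just placeholders):

python run.py -m fatal -input ./my_photos/ -output ./results/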

More cases

Model chaining

To chain multiple models, you need to provide a sequence of model names to the -m flag. For example, to first remove JPEG artifacts and then upscale images, you can fetch one of the JPEG denoising models from the database (Example: 1x_JPEG_60-80.pth) and an upscaling model (Example: 4x_Fatality_01_265000_G.pth) and use a plus sign (+) between their names.

python run.py -m jpeg+fatal

Note that there is technically no limit to how many models can be chained, but if the models upscale, image sizes can quickly become impossible to manage in memory; this is mostly a hardware limitation. You can also chain the same model with itself multiple times (for example, -m fatal+fatal), which can produce interesting results in some cases.
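
Conceptually, chaining just feeds each model's output into the next. A minimal sketch of that data flow (using toy torch modules as stand-ins for real checkpoints, not iNNfer's actual loading code):

```python
import torch
import torch.nn as nn

def chain(models, img):
    """Feed each model's output into the next, as '-m jpeg+fatal' does."""
    with torch.no_grad():
        for model in models:
            img = model(img)
    return img

# Toy stand-ins for real loaded checkpoints, just to show the data flow:
models = [nn.Identity(), nn.Upsample(scale_factor=4)]
out = chain(models, torch.rand(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 256, 256])
```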

Image to image translation

For these cases, for now you will need to provide the network architecture that was used to train the model. For example, for the trained models available for pix2pix and CycleGAN, that corresponds to unet_256 (or p2p_256) for pix2pix and resnet_9blocks (or cg_9) for CycleGAN.

For example, to try out the label2facade model (facades_label2photo.pth), you need to run:

python run.py -m facade -a p2p_256

This will produce a single result:

<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/121805922-bbc91e00-cc4d-11eb-8961-accdd7eb4269.png" height="200"> </p>

For a side by side comparison between input and output, add the -comp flag:

python run.py -m facade -a p2p_256 -comp
<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/121805956-e915cc00-cc4d-11eb-9d34-cb7683ad6b5f.png" height="200"> </p>

Similarly, to test the ukiyoe CycleGAN model (either photo2ukiyoe.pth or style_ukiyoe.pth) with a comparison, run:

python run.py -m ukiyoe -a cg_9 -comp
<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/121806039-5c1f4280-cc4e-11eb-8377-62c907039e3a.png" height="200"> </p>

White-box cartoonization (WBC)

WBC is a special case: the original TensorFlow model, converted to PyTorch and available among the pretrained options, can be used directly and produces the same results shown in the original repo, converting photos to an anime cartoon style.

<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/126866710-d95348f0-9daa-4a07-a9f6-42620d005896.png" height="200"> </p> <p align="center"> <img src="https://user-images.githubusercontent.com/41912303/126866734-1122f1ae-071e-4abc-bf6a-0950c508862e.png" height="200"> </p>

Models trained with PyTorch can also be used (here using wbc.pth):

<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/126866846-b9657c59-5307-40ad-a83b-692232486a2d.png" height="200"> </p>

If different models are trained with different representation scales, the resulting models can be interpolated to obtain intermediate results between two of them. For now this can be done with a simple script, but later it may be done on the fly by iNNfer (TBD). More information about interpolating models can be found here.
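
For reference, network interpolation is just a per-parameter linear blend of two checkpoints that share the same architecture. A minimal sketch, assuming plain state-dict .pth files (the file names are placeholders, and this is not iNNfer's actual script):

```python
import torch

def interpolate_models(path_a, path_b, alpha=0.5):
    """Per-parameter linear blend of two same-architecture checkpoints:
    theta = (1 - alpha) * theta_a + alpha * theta_b."""
    sd_a = torch.load(path_a, map_location="cpu")
    sd_b = torch.load(path_b, map_location="cpu")
    return {
        k: torch.lerp(v, sd_b[k], alpha) if v.is_floating_point() else v
        for k, v in sd_a.items()
    }

# Example (file names are placeholders):
# torch.save(interpolate_models("wbc_a.pth", "wbc_b.pth", 0.5), "wbc_05.pth")
```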

You can also tweak the Guided Filter component in run.py (search for the note): increasing the radius r smooths away more of the final output detail, depending on the results you expect. More details about the guided filter are available in the original paper.
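
For intuition, the guided filter fits local linear models over windows of radius r, so a larger r averages statistics over larger windows and smooths the output more. A minimal grayscale sketch of the standard algorithm from the paper (not iNNfer's implementation):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, r, eps=1e-3):
    """He et al.'s guided filter on grayscale float arrays: q = a*I + b,
    with a and b estimated in (2r+1)x(2r+1) windows. Larger r -> smoother q."""
    size = 2 * r + 1
    mean_I, mean_p = uniform_filter(I, size), uniform_filter(p, size)
    cov_Ip = uniform_filter(I * p, size) - mean_I * mean_p
    var_I = uniform_filter(I * I, size) - mean_I ** 2
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return uniform_filter(a, size) * I + uniform_filter(b, size)
```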

If a model is named wbc*, the wbcunet architecture and configuration will be selected automatically; otherwise, add the -arch wbcunet flag when running.
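
That auto-selection presumably amounts to a simple name check; an illustrative (hypothetical) version:

```python
from pathlib import Path

def pick_arch(model_path, cli_arch=None):
    """Illustrative only: model names starting with 'wbc' imply wbcunet."""
    return "wbcunet" if Path(model_path).stem.lower().startswith("wbc") else cli_arch
```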

TorchScript models

TorchScript models are directly supported; just be aware that they need to be run in the same fashion in which they were traced. For example, if the GPU option was used when they were created, they will only run on NVIDIA GPUs with CUDA support. Here you will find a number of models from the model database that have already been converted to TorchScript (traced on GPU; CPU versions can be made available if needed) and are ready to use.

For example, to use the 4xRealSR_DF2K_JPEG.pt model, just execute:

python run.py -m realsr

One advantage of TorchScript models is that they no longer require explicit support for the network architecture: you can run any model of any architecture that has been converted to TorchScript, even if that architecture is not otherwise supported. This is useful if you need other features, like color correction, with an unsupported architecture.
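
For illustration, loading and running a TorchScript model needs no architecture definition at all. A minimal sketch (pre- and post-processing are omitted, and the device must match how the model was traced):

```python
import torch

# A TorchScript file bundles code and weights, so no architecture
# definition is needed at load time.
model = torch.jit.load("models/4xRealSR_DF2K_JPEG.pt", map_location="cuda")
model.eval()

with torch.no_grad():
    x = torch.rand(1, 3, 64, 64, device="cuda")  # stand-in for an input image
    y = model(x)  # a 4x model would return a 1x3x256x256 tensor
```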

Color correction option

Some models introduce color changes that may not be desired. For that reason, there are options that can be used to correct those changes.

Using an example image from the Manga109 set, with a model that intentionally introduces heavy color changes, run:

python run.py -m shin -comp

This produces the following result:

<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/121806509-62aeb980-cc50-11eb-8f99-1cfe64ff446a.png" height="200"> </p>

To try to fix the colors, just add the color-fix flag -cf:

python run.py -m shin -comp -cf

You will obtain a version of the upscale that more closely matches the colors of the original image:

<p align="center"> <img src="https://user-images.githubusercontent.com/41912303/121806581-b91bf800-cc50-11eb-9193-d688472339d3.png" height="200"> </p>

This flag works even when multiple models are chained.
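
iNNfer's own correction may differ, but a common way to fix such shifts is to transfer the per-channel mean and standard deviation of the source image onto the output. A rough sketch of that idea:

```python
import numpy as np

def match_color_stats(output, reference):
    """Match each channel's mean/std in `output` to `reference`
    (both float HxWx3 arrays in [0, 1]; sizes may differ)."""
    result = np.empty_like(output)
    for c in range(output.shape[2]):
        o, r = output[..., c], reference[..., c]
        result[..., c] = (o - o.mean()) / (o.std() + 1e-8) * r.std() + r.mean()
    return result.clip(0.0, 1.0)
```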

How to help

There are multiple ways to help this project. The first one is by using it and trying to produce results with your models. You can open an issue if you find any bugs or if you have ideas or questions.

If you would like to contribute by adding or fixing code, you can do so by cloning this repo and creating a PR.

You can also join the Discord servers to share results and questions with other users.

Lastly, since it has been suggested many times, there are now options to donate to show your support for the project and help steer it in directions that will make it even more useful. The suggested options are listed below.

<p align="left"> <a href="https://patreon.com/victorca25"> <img src="https://github.githubassets.com/images/modules/site/icons/funding_platforms/patreon.svg" height="30"> Patreon </a> </p> <p align="left"> <a href="https://user-images.githubusercontent.com/41912303/121814560-fba1fc80-cc71-11eb-9b98-17c3ce0f06d6.png"> <img src="https://user-images.githubusercontent.com/41912303/121814516-ca293100-cc71-11eb-9ddf-ffda840cd36d.png" height="30"> <img src="https://user-images.githubusercontent.com/41912303/121814560-fba1fc80-cc71-11eb-9b98-17c3ce0f06d6.png" height="30"> </a> Bitcoin Address: 1JyWsAu7aVz5ZeQHsWCBmRuScjNhCEJuVL </p> <p align="left"> <a href="https://user-images.githubusercontent.com/41912303/121814692-aa463d00-cc72-11eb-99b2-c1bae3f63fdc.png"> <img src="https://user-images.githubusercontent.com/41912303/121814599-36a43000-cc72-11eb-974a-146661e5e665.png" height="30"> <img src="https://user-images.githubusercontent.com/41912303/121814692-aa463d00-cc72-11eb-99b2-c1bae3f63fdc.png" height="30"> </a> Ethereum Address: 0xa26AAb3367D34457401Af3A5A0304d6CbE6529A2 </p>

Additional Help

If you have any questions, we have a couple of Discord servers (game upscale and animation upscale) where you can ask them, as well as a Wiki with more information.