# neosr
neosr is a framework for training real-world single-image super-resolution networks.
## installation
Requires Python 3.11 and CUDA >=11.8.
```
git clone https://github.com/muslll/neosr
cd neosr
```
Install the latest PyTorch (>=2.1) and TorchVision (required).
Then install the other dependencies via pip:

```
pip install -e .
```
Alternatively, use poetry (recommended on linux):

```
poetry install && poetry add torch@latest torchvision@latest
```
Note: you must use `poetry shell` to enter the environment after installation.
(optional) If you want to convert your models (`convert.py`), you need the following dependencies:

```
pip install onnx onnxruntime-gpu onnxconverter-common onnxsim
```
You can also install them using poetry (recommended on linux):

```
poetry add onnx onnxruntime-gpu onnxconverter-common onnxsim
```
Please read the wiki tutorial for converting your models.
## quick start
Start training by running:

```
python train.py -opt options.yml
```
where `options.yml` is a configuration file. Templates can be found in the options folder.
Please read the wiki Configuration Walkthrough for an explanation of each option.
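To give a rough sense of what such a file contains, here is an illustrative sketch only: the key names and layout below are assumptions on my part, while the option values (`default`, `paired`, `compact`, `adamw`, `l1`) come from the feature tables below. Treat the templates in the options folder and the wiki walkthrough as the authoritative reference.

```yaml
# Hypothetical sketch -- not a working template; see the options folder.
name: my_4x_model
model_type: default      # model: default or otf
scale: 4
datasets:
  train:
    type: paired         # dataset loader: paired, single, or otf
    dataroot_gt: /path/to/gt
    dataroot_lq: /path/to/lq
network_g:
  type: compact          # any arch option, e.g. compact, esrgan, span
train:
  optim_g:
    type: adamw          # optimizer option
  pixel_opt:
    type: l1             # loss option
```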
## features
Currently included archs:
| arch | option |
|---|---|
| ESRGAN | `old_esrgan` |
| Real-ESRGAN | `esrgan` |
| SRVGGNetCompact | `compact` |
| SwinIR | `swinir_small`, `swinir_medium` |
| HAT | `hat_s`, `hat_m`, `hat_l` |
| OmniSR | `omnisr` |
| SRFormer | `srformer_light`, `srformer_medium` |
| DAT | `dat_light`, `dat_small`, `dat_medium`, `dat_2` |
| DITN | `ditn` |
| DCTLSA | `dctlsa` |
| SPAN | `span` |
| NLSAN | `nlsan_medium`, `nlsan_light` |
| DWT | `dwt` |
| EDAT | `edat`, `edat_light` |
| CRAFT | `craft` |
| Bicubic++ | `bpp`, `bpp_l` |
| Real-CUGAN | `cugan` |
Arch inference times with the provided test script (RTX 3060):
| arch | fps |
|---|---|
| bicubic++ | 1.76 |
| compact | 1.37 |
| span | 0.92 |
| ditn | 0.76 |
| omnisr | 0.54 |
| swinir_small | 0.49 |
| craft | 0.49 |
| srformer_light | 0.43 |
| nlsan_light | 0.38 |
| dctlsa | 0.35 |
| dat_light | 0.35 |
| esrgan | 0.18 |
| swinir_medium | 0.13 |
| dwt_light | 0.12 |
| dat_small | 0.08 |
| dat_2 | 0.08 |
| dat_medium | 0.07 |
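Most of these figures are below 1 fps, so they may be easier to read as per-image latency. A small conversion sketch, with a few fps values copied from the table above:

```python
# Convert the fps figures above into seconds per image (latency = 1 / fps).
fps = {
    "compact": 1.37,
    "esrgan": 0.18,
    "dat_medium": 0.07,
}

latency = {arch: round(1.0 / value, 2) for arch, value in fps.items()}
print(latency)  # {'compact': 0.73, 'esrgan': 5.56, 'dat_medium': 14.29}
```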
Supported Discriminators:
| net | option |
|---|---|
| U-Net SN | `unet` |
| A2-FPN | `a2fpn` |
Supported Optimizers:
| optimizer | option |
|---|---|
| Adam | `Adam` or `adam` |
| AdamW | `AdamW` or `adamw` |
| Lion | `Lion` or `lion` |
| LAMB | `Lamb` or `lamb` |
| Adan | `Adan` or `adan` |
Supported models:
| model | description | option |
|---|---|---|
| Default | Base model, supports both Generator and Discriminator | `default` |
| OTF | Builds on top of `default`, adding Real-ESRGAN on-the-fly degradations | `otf` |
Supported dataset loaders:
| loader | option |
|---|---|
| Paired datasets | `paired` |
| Single datasets (for inference, no GT required) | `single` |
| Real-ESRGAN on-the-fly degradation | `otf` |
Supported losses:
| loss | option |
|---|---|
| L1 Loss | `L1Loss`, `l1` |
| L2 Loss | `MSELoss`, `l2` |
| Huber Loss | `HuberLoss`, `huber` |
| Perceptual Loss | `perceptual_opt`, `PerceptualLoss` |
| GAN | `gan_opt`, `GANLoss`, `MultiScaleGANLoss` |
| YUV Color Loss | `color_opt`, `colorloss` |
| LDL Loss | `ldl_opt` |
| Focal Frequency | `ff_opt`, `focalfrequencyloss` |
## datasets
If you don't have a dataset, you can either download research datasets like DIV2K or use one of the following.
- nomos_uni (recommended): universal dataset containing real photographs and anime images
- nomos8k: dataset with real photographs only
- hfa2k: anime dataset
These datasets have been tiled and manually curated across multiple sources, including DIV8K, Adobe-MIT 5k, RAISE, FFHQ, etc.
| dataset | num images | meta_info | download | sha256 |
|---|---|---|---|---|
| nomos_uni | 2989 (512x512px) | nomos_uni_metainfo.txt | GDrive (1.3GB) | 6403764c3062aa8aa6b842319502004aab931fcab228f85eb94f14f3a4c224b2 |
| nomos_uni (lmdb) | 2989 (512x512px) | - | GDrive (1.3GB) | 596e64ec7a4d5b5a6d44e098b12c2eaf6951c68239ade3e0a1fcb914c4412788 |
| nomos_uni (LQ 4x) | 2989 (512x512px) | nomos_uni_metainfo.txt | GDrive (92MB) | c467e078d711f818a0148cfb097b3f60763363de5981bf7ea650dad246946920 |
| nomos_uni (LQ 4x - lmdb) | 2989 (512x512px) | - | GDrive (91MB) | 1d770b2c6721c97bd2679db68f43a9f12d59a580e9cfeefd368db5a4fab0f0bb |
| nomos8k | 8492 (512x512px) | nomos8k_metainfo.txt | GDrive (3.4GB) | 89724f4adb651e1c17ebee9e4b2526f2513c9b060bc3fe16b317bbe9cd8dd138 |
| hfa2k | 2568 (512x512px) | hfa2k_metainfo.txt | GDrive (3.2GB) | 3a3d2293a92fb60507ecd6dfacd636a21fd84b96f8f19f8c8a55ad63ca69037a |
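The sha256 column can be verified locally after download. A minimal sketch (the archive filename below is a placeholder, not the actual download name):

```python
import hashlib

def sha256sum(path, chunk_size=1 << 20):
    """Stream a file through sha256 so large dataset archives aren't loaded into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# Compare against the table above, e.g. for nomos8k (filename is hypothetical):
# assert sha256sum("nomos8k.zip") == "89724f4adb651e1c17ebee9e4b2526f2513c9b060bc3fe16b317bbe9cd8dd138"
```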
Note: these are not intended for use in academic research.
### community datasets
These are datasets made by the upscaling community. More info can be found in the Enhance Everything discord.

- Kim's 8k Dataset V2: video game dataset
- FaceUp: curated version of FFHQ
- SSDIR: curated version of LSDIR
| dataset | num images | meta_info | download | sha256 |
|---|---|---|---|---|
| @Kim2091's 8k Dataset V2 | 672 (7680x4320px) | - | GDrive (33.5GB) | - |
| @Phhofm FaceUp | 10000 (512x512px) | - | GDrive (4GB) | - |
| @Phhofm SSDIR | 10000 (512x512px) | - | GDrive (4.5GB) | - |
## resources
- OpenModelDB
- chaiNNer
- Training Guide from @Sirosky
- Training Info from @Kim
## support me
☕ Consider supporting me on KoFi. ☕
## license and acknowledgements
Released under the Apache license. This code was originally based on BasicSR. See other licenses in license/readme.
Thanks to victorca25/traiNNer, styler00dollar/Colab-traiNNer and timm for providing helpful insights into some problems.
Thanks to contributors @Phhofm, @Sirosky, @Kim2091 and @terrainer for helping with tests and bug reporting.