Home

Awesome

This is a simple repository that documents current software and websites for using and training models.

Section Links:

<br/>

:speech_balloon: Discord Communities:

<br/>

:tv: Upscaling Images or Video

Using Models

:wrench: Inference Tools:

:mag: Finding Models:

:spiral_notepad: Written Guides:

<br/>

:running_shirt_with_sash: Training Models

Training Upscaling Models:

✅ Recommended software:

<details closed> <summary>Other training software (not recommended)</summary> <br> </details>

:scissors: Dataset Preparation:

:microscope: Outputs / Image Comparison:

:memo: Training Guides & Info:

Training Loss Information

<details> <summary>Descriptions of each loss available in most training software</summary> Here is a brief summary of some of the loss functions that are used for super resolution and image restoration tasks. Thanks to korvin for the info!
  1. L1Loss: This is the mean absolute error (MAE) between the predicted and target images. It measures the average pixel-wise difference, and is simple to implement and fast to compute. However, it may produce blurry results and does not account for perceptual quality or high-frequency details. It can be used for any type of image and suits low-level tasks such as denoising or inpainting as well as super resolution. It can be combined with other losses, such as perceptual loss or GAN loss, to improve the results.

  2. LRGBLoss: This is a variant of L1Loss that computes the MAE separately for each color channel (red, green, blue) and then averages them. It is similar to L1Loss in terms of advantages and disadvantages, but it may be more sensitive to color differences. It can be used for any type of image, but it may not be optimal for grayscale images or images in other color spaces.

  3. PerceptualLoss: This is a loss function that uses a pre-trained network, such as VGG, to extract high-level features from the predicted and target images and then computes the MAE (or another distance) between them. It aims to capture the perceptual similarity and semantic content of the images, rather than the pixel-wise difference. It can produce more natural and realistic results, especially for high-level tasks such as super resolution or style transfer. However, it is computationally expensive, requires regularization and hyper-parameter tuning, and involves a large network trained on an unrelated task. It can be used for any type of image, but it may not be optimal for low-level tasks or for images from domains that differ from the pre-trained network's training data.

  4. ContextualLoss: This is a loss function that measures the similarity between two images based on the distribution of local patches. It uses a cosine similarity metric to compare the patches and then aggregates them using a generalized mean function. It can capture both global and local structures, as well as texture and style information. It can produce more diverse and detailed results, especially for texture synthesis or style transfer. However, it is computationally expensive, requires patch size selection and normalization, and may not be robust to geometric transformations or occlusions. It can be used for any type of image, but it may not be optimal for images with large variations or complex semantics.

  5. ColorLoss: There are many types of color loss. One concrete example is a loss function that measures the color difference between two images using the CIEDE2000 formula, which is based on the human perception of color and accounts for factors such as luminance, hue, chroma, and contrast. It can produce more accurate and consistent color reproduction, especially for color enhancement or correction. However, it is computationally expensive, requires color space conversion and calibration, and may not capture other aspects of image quality such as sharpness or noise. It can be used for any type of image, but it may not be optimal for grayscale images or images in other color spaces.

  6. AverageLoss: This is a loss function that computes the average of multiple loss functions, such as L1Loss, PerceptualLoss, ColorLoss, etc. It can combine the advantages of different losses and balance their trade-offs. It can produce more comprehensive and satisfactory results, especially for multi-objective tasks such as super resolution with color enhancement. It is very lightweight, although some implementations can require fine-tuning. It can be used for any type of image, but it may not be optimal for single-objective tasks or tasks with conflicting objectives.

  7. GANLoss: This is a loss function that uses the discriminator network of a generative adversarial network (GAN) to distinguish between the predicted and target images. The model being trained aims to fool the discriminator into treating the predicted image as real and indistinguishable from the target image. It can produce sharper and more realistic results, especially for high-level tasks such as super resolution or style transfer. However, it is computationally expensive, requires careful design and training of the discriminator network, and may suffer from instability or mode collapse. It can be used for any type of image, but it may not be optimal for low-level tasks or tasks with limited data.
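
To make the pixel-based descriptions above more concrete, here is a minimal NumPy sketch of an L1 loss, a per-channel MAE, and a plain average of several loss values. It is only an illustration and is not taken from any particular training software: the function names (`l1_loss`, `per_channel_mae`, `average_losses`) and the tiny test images are assumptions made for this example, and real implementations operate on framework tensors and may weight channels or terms differently.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error over every pixel and channel."""
    return float(np.mean(np.abs(pred - target)))


def per_channel_mae(pred, target):
    """MAE computed separately for each color channel (last axis)."""
    return np.mean(np.abs(pred - target), axis=(0, 1))


def average_losses(values):
    """Plain average of already-computed loss values (hypothetical helper)."""
    return float(np.mean(values))


# Two tiny 2x2 RGB "images" with values in [0, 1] (made-up data).
target = np.zeros((2, 2, 3))
pred = np.zeros((2, 2, 3))
pred[..., 0] = 0.3  # only the red channel is wrong, by 0.3 everywhere

pixel = l1_loss(pred, target)             # error spread over all 12 values
channels = per_channel_mae(pred, target)  # one error value per R, G, B
worst_channel = float(channels.max())     # stand-in for a second loss term
total = average_losses([pixel, worst_channel])

print("L1:", pixel)
print("per-channel MAE:", channels)
print("averaged loss:", total)
```

When every channel has the same number of pixels, the plain average of the per-channel values equals the overall L1 value, so the per-channel breakdown is mainly useful for seeing which channel contributes the error.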

</details> <details> <summary>Loss information: where to aim your loss values</summary>

Metrics:

Example:

</details> <br/>

:computer_mouse: Related Websites: